Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1upTMs-005pWG-E4 for pgsql-general@arkaria.postgresql.org; Fri, 22 Aug 2025 15:07:52 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1upTMr-006jMv-Sk for pgsql-general@arkaria.postgresql.org; Fri, 22 Aug 2025 15:07:50 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1upTMr-006jMn-6e for pgsql-general@lists.postgresql.org; Fri, 22 Aug 2025 15:07:50 +0000 Received: from mail-yb1-xb2d.google.com ([2607:f8b0:4864:20::b2d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1upTMo-001D1d-2K for pgsql-general@lists.postgresql.org; Fri, 22 Aug 2025 15:07:48 +0000 Received: by mail-yb1-xb2d.google.com with SMTP id 3f1490d57ef6-e951bdb16f4so953168276.1 for ; Fri, 22 Aug 2025 08:07:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1755875267; x=1756480067; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=RDQY8PO8ybraHgdLDyW2chhwdMZZP107taNv4/vajMo=; b=KVqMg1tIVGdQoz+RKOclbVNlXQgCNLQwD4RNqdQaOHQlZSwIykOPQ4lk6HfCg5OeNA dGaAcn8lAcnHIRpiurz6+OhQasKxPYQWLZtj+kQHmZMfC1VaPOFMWShoo3M89uSC84KK Qdy5Ax0QtNwR0iTUd5R/ThWGienxr2QomiHkgT3lKpdPfeuwq6ZI+m8fFklalsEoaj+v HYifNDfhNPII8mN76nCNlG8u9q5oFvq6gIpLI66vYS4lWAChH6GAhjprJgio8yPdkNpA HNaAp+0xf97p0PVkCwmLUhVkxw/+my+oJ8yiUeEJimIVtVTFjRZgOVKU81pHMjfNKFRL bXrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1755875267; x=1756480067; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=RDQY8PO8ybraHgdLDyW2chhwdMZZP107taNv4/vajMo=; b=oLpfhJQCHQ7Dgi+4KIltnGv80iimkhWi546YaAE7s4Da+9fgHQwsfSsRdEP/Ctrp81 kbVoMD++uz/M7Otekd4myuqnkl7ZlthVIm7dAoS4h/y7Uq4M/PILfpZSFjN4aoKRtIvA tkcSJsuC7ZDEIvtjzKDlbG6OUtOqnlyX6+/Cf0dPl2btnM6P7d5d0LBhpVebT7A93fWI PqZyl47WBDv0C+ZdLcDbcs39NKoHNBJqotT+WPvNzD9nnepZTPtePopq5WrWPIBRi+gV VcU+G3Fe4RfjbVIE1qJH0sDBWepoZuJtn5Ly4sBixf5/W/HTJUrIOazi4dMHz6MjPFLm e2UQ== X-Forwarded-Encrypted: i=1; AJvYcCXNis7s9Y9AuB0sQ+Uz3INaIfGNMCqjWLB9RIF3HdqMZdwJJc6L7TrphfuwP+aYsPNYq4F9TtiWK2ruHS8D@lists.postgresql.org X-Gm-Message-State: AOJu0Yxqre//mbyNIcPOhA5ctK7pstVsHeA3eYA3lEHNCyD4zFgJIlVt bc9sFmLqyqNVGvomxm6D+cYHc3ne3Mp6tB4lRARZrsrdCZPSbBKNE1+VKo9EmyH/41FJb/dpYnO wtcSA1MNypA+t5mW7LP+AJwtC97abHRg= X-Gm-Gg: ASbGncvbWfNlcV3jjJWfAg+0F0ji8nRrLgEIYiQxii9gGpTVDx4ZxiJXLX2bKMwVdCm /Fbr4DLjQDo+/2r3S1Ek4gHI4Cjp8RhQWYFXIZTSY4muGv2dqC6RyrHJxmS+pVsKqYxKNa2EhL4 QQIJIrTMeSN3uBQ91L78MbquiB4BQ6atqz7+80P8lsQoQ3YgW+fIHPbw2nBKYUOLSAFSa+ch7LB WCPC7X8Bw/X6DGQnlRJuwRvqzrb5jYNTlecnNF35X13UKg6JyNopdawbEZK2eXk46OPXwqQCLrr IoIKMm1Ep8yKXqZ+HJqeGVSBSTPxazg3mOFFQk2jwx/6fKhhXciA X-Google-Smtp-Source: AGHT+IGmXbZRz7DjyZt7rRTXOfUNE4UFz3zCzBV9t+4HULmfQfC3j0xstjXGf6oDylBM+ZTPMueVGW8k7G1DXfAxiUg= X-Received: by 2002:a05:690c:60c5:b0:71c:3fde:31b6 with SMTP id 00721157ae682-71fdc3d153cmr31627107b3.34.1755875266375; Fri, 22 Aug 2025 08:07:46 -0700 (PDT) MIME-Version: 1.0 References: <25334887-f1c3-40a1-94b0-753c7d67ae2b@aklaver.com> <2a3e4a8d-e8c2-46d6-ad7d-9e631ce6725e@aklaver.com> In-Reply-To: From: Pavel Stehule Date: Fri, 22 Aug 2025 17:07:09 +0200 X-Gm-Features: Ac12FXxrQoD8R171kiKY5pZXWePDhsGQJRqIPxHS_X6YIX53qEWATQKDNPxrFfs Message-ID: Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug To: depesz@depesz.com Cc: Adrian Klaver , PostgreSQL General , Chris Wilson Content-Type: multipart/alternative; boundary="00000000000085b419063cf5919b" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000085b419063cf5919b Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable p=C3=A1 22. 8. 2025 v 17:03 odes=C3=ADlatel hubert depesz lubaczewski < depesz@depesz.com> napsal: > On Fri, Aug 22, 2025 at 07:33:03AM -0700, Adrian Klaver wrote: > > On 8/22/25 05:37, hubert depesz lubaczewski wrote: > > > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: > > > > Have you looked at?: > > > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x > > > > To see if anything stands out. > > > > Then there is: > > > > https://www.pgbouncer.org/config.html#max_prepared_statements > > > > The below may also be worth looking at: > > > > https://github.com/pgbouncer/pgbouncer/pull/1144 > > > > I can't help thinking that there is a caching issue at stake, thoug= h > that is > > > > just a guess. > > > > > > Will check, but we don't use prepared statements, so this doesn't see= m > > > to be part-of-the-problem. > > > > In your OP there is: > > > > "Every now and then (usually every 3-5 minutes, but not through the who= le > > day), we see situations where every query suddently takes ~ 1 second. > > > > > > I'm talkign about "prepare" for preparing statements. binds. also > > "discard all". > > " > > > > Is that referring to something else? > > No. Same problem. The thing is that it comes and goes. > > *BUT* > > I got repeatable case today. Is is breaking on its own everyy > ~ 5 minutes. > > So I made test.sql with: > \timing on > SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000) i > \gexec > > and ran it like this: > > psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /' > > This bypassed pgbouncer, connected via unix socket, the whole thing got > simpler. > > After 4 minutes of 50ms runtimes I got: > > 2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms > 2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms > 2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms > 2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms > 2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394) > 2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms > 2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms > 2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms > 2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms > > This is pg_sleep. There are no locks, nothing else. I think this > disproves idea that the problem is about pgbouncer, and it kinda looks > like a thing in pg? > > Interestingly: ran strace on the pg backend that was doing this loop, > like this: > strace -tt -T -ff -p 1266521 -s 256 -o a.log > > And it showed something really strange: > > =E2=80=A6 > 14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00000= 7> > 14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050063> > 14:57:00.914823 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0= \0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\= 346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:00.914849 recvfrom(9= , > "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0= ", > 8192, 0, NULL, NULL) =3D 68 <0.000007> > 14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050064> > 14:57:00.964978 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0= \0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\= 346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ > 14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00000= 7> > > > 14:57:00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050063> > 14:57:01.015133 futex(0xffe7a6636538, > FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = =3D > 0 <1.354927> > 14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 > UTC,\"postgres\",\"the.db\",1266521,\"[local]\",68a884fb.135359,5,\"SELEC= T\",2025-08-22 > 14:55:55 UTC,85/0,0,LOG,00000,\"duration: 1405.106 ms statement: SELECT > now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05); > 14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\0\0= \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\0\0= \0\0\0\0\0\0\0\0", > 64, 0, NULL, 0) =3D 64 <0.000041> > 14:57:02.370639 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0= \0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\= 346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.371063 recvfrom(9= , > "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0= ", > 8192, 0, NULL, NULL) =3D 68 <0.000019> > 14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050090> > 14:57:02.421381 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0= \0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\= 346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ > 14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) =3D -1 > EAGAIN (Resource temporarily unavailable) <0.000008> > > > 14:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u32=3D470567512, > u64=3D187651886697048}}], 1, -1, NULL, 8) =3D 1 <0.000037> > 14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00001= 2> > 14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050099> > 14:57:02.471866 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0= \0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\= 346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.471961 recvfrom(9= , > 0xaaaae7b3fc08, 8192, 0, NULL, NULL) =3D -1 EAGAIN (Resource temporarily > unavailable) <0.000008> > =E2=80=A6 > > As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and > sendto. > all clear, nothing fancy. > > *BUT* > in case of this one problematic thing there is futex() call that took > 1.354s ?! Checked, and in straced ~ 800 queries this is the only futex > call() ?! > > Any idea on what could it be? > Few years ago I had a similar experience - some sec mystic lags - the problem was in virtualization. Regards Pavel > > Best regards, > > depesz > > > > --00000000000085b419063cf5919b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


p=C3=A1 22. 8. = 2025 v=C2=A017:03 odes=C3=ADlatel hubert depesz lubaczewski <depesz@depesz.com> napsal:
On Fri, Aug 22, 2025 at 07:33:0= 3AM -0700, Adrian Klaver wrote:
> On 8/22/25 05:37, hubert depesz lubaczewski wrote:
> > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: > > > Have you looked at?:
> > > https://www.pgbouncer.org/chan= gelog.html#pgbouncer-124x
> > > To see if anything stands out.
> > > Then there is:
> > > https://www.pgbouncer.or= g/config.html#max_prepared_statements
> > > The below may also be worth looking at:
> > > https://github.com/pgbouncer/pgbounce= r/pull/1144
> > > I can't help thinking that there is a caching issue at s= take, though that is
> > > just a guess.
> >
> > Will check, but we don't use prepared statements, so this doe= sn't seem
> > to be part-of-the-problem.
>
> In your OP there is:
>
> "Every now and then (usually every 3-5 minutes, but not through t= he whole
> day), we see situations where every query suddently takes ~ 1 second.<= br> >
>
> I'm talkign about "prepare" for preparing statements. bi= nds. also
> "discard all".
> "
>
> Is that referring to something else?

No. Same problem. The thing is that it comes and goes.

*BUT*

I got repeatable case today. Is is breaking on its own everyy
~ 5 minutes.

So I made test.sql with:
\timing on
SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000= ) i \gexec

and ran it like this:

psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /'= ;

This bypassed pgbouncer, connected via unix socket, the whole thing got sim= pler.

After 4 minutes of 50ms runtimes I got:

2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms
2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms
2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms
2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms
2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394)
2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms
2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms
2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms
2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms

This is pg_sleep. There are no locks, nothing else. I think this
disproves idea that the problem is about pgbouncer, and it kinda looks
like a thing in pg?

Interestingly: ran strace on the pg backend that was doing this loop,
like this:
strace -tt -T -ff -p 1266521=C2=A0 -s 256 -o a.log

And it showed something really strange:

=E2=80=A6
14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_= backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00= 0007>
14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050063><= br> 14:57:00.914823 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\1= 0\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0p= g_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\= 0\0312025-08-22 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:00.914= 849 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(= ), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.000007>
14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050064><= br> 14:57:00.964978 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\1= 0\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0p= g_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\= 0\0312025-08-22 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_= backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00= 0007>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A014:57:= 00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050063>
14:57:01.015133 futex(0xffe7a6636538, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIM= E, 0, NULL, FUTEX_BITSET_MATCH_ANY) =3D 0 <1.354927>
14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 UTC= ,\"postgres\",\"the.db\",1266521,\"[local]\",= 68a884fb.135359,5,\"SELECT\",2025-08-22 14:55:55 UTC,85/0,0,LOG,0= 0000,\"duration: 1405.106 ms=C2=A0 statement: SELECT now()::timestampt= z(2), pg_backend_pid(), pg_sleep(.05);
14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\= 0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\= 0\0\0\0\0\0\0\0\0\0", 64, 0, NULL, 0) =3D 64 <0.000041>
14:57:02.370639 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\1= 0\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0p= g_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\= 0\0312025-08-22 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.371= 063 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(= ), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.000019>
14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050090><= br> 14:57:02.421381 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\1= 0\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0p= g_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\= 0\0312025-08-22 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\
14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) =3D -1 EAG= AIN (Resource temporarily unavailable) <0.000008>=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A014:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u= 32=3D470567512, u64=3D187651886697048}}], 1, -1, NULL, 8) =3D 1 <0.00003= 7>
14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_= backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) =3D 68 <0.00= 0012>
14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) =3D 0 <0.050099><= br> 14:57:02.471866 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\1= 0\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0p= g_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\= 0\0312025-08-22 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.471= 961 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) =3D -1 EAGAIN (Resourc= e temporarily unavailable) <0.000008>
=E2=80=A6

As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and se= ndto.
all clear, nothing fancy.

*BUT*
in case of this one problematic thing there is futex() call that took 1.354= s ?! Checked, and in straced ~ 800 queries this is the only futex call() ?!=

Any idea on what could it be?

Few years= ago I had a similar experience - some sec mystic lags - the problem was in= virtualization.=C2=A0

Regards

Pavel

=C2=A0

Best regards,

depesz



--00000000000085b419063cf5919b--