From: hubert depesz lubaczewski <[email protected]>
To: Thom Brown <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: PostgreSQL General <[email protected]>
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Date: Thu, 21 Aug 2025 14:03:02 +0200
Message-ID: <[email protected]>
In-Reply-To: <CAA-aLv441nwn05V4HUnEhR_QLregsnmahihpDbZ2JufNMzgY3w@mail.gmail.com>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<CAA-aLv6Vh04awXunt8mbHp_1vtT0fFLJpaXvs38ToW8mxRB2WA@mail.gmail.com>
<[email protected]>
<CAA-aLv441nwn05V4HUnEhR_QLregsnmahihpDbZ2JufNMzgY3w@mail.gmail.com>
On Thu, Aug 21, 2025 at 12:41:44PM +0100, Thom Brown wrote:
> Ah, yeah I meant transparent hugepage:
> cat /sys/kernel/mm/transparent_hugepage/enabled
> This should show it being set as "never".
Ah. Sorry, couldn't decipher. Yes, it's "never".
> > # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d'
> > 2 2025-08-19 22:09:29.084 UTC
> > 1 2025-08-19 22:09:29.094 UTC
> > 2 2025-08-19 22:09:29.097 UTC
> > 70 2025-08-19 22:09:29.109 UTC
> > 90 2025-08-19 22:09:29.110 UTC
> > 6 2025-08-19 22:09:29.111 UTC
> > 1 2025-08-19 22:09:29.153 UTC
> > 1 2025-08-19 22:09:29.555 UTC
…
> > 22:10:54 all 2.41 0.00 0.28 0.22 0.00 0.10 0.00 0.00 0.00 96.99
> > 22:10:59 all 2.83 0.00 0.29 0.19 0.00 0.12 0.00 0.00 0.00 96.57
>
> This output looks fine, so it doesn't show anything concerning, so
> suggests the issue is somehow on the Postgres side.
>
> Did you happen to poll pg_stat_activity at the time to see whether you
> had lots of IPC waits? I'm wondering whether the storage layer is
> freezing up for a moment.
So, we get select * from pg_stat_activity, for client backends that are
not idle, every 29 seconds.
So, a 1 second "freeze" is impossible to catch.
Plus - I suspect that if I ran select * from pg_stat_activity while "in
freeze", it would also get frozen.
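
To catch something this short, the sampling would have to run several times per second. A rough sketch of that idea is below. It assumes a caller-supplied sample() function (hypothetical, e.g. one that runs the pg_stat_activity query on an already open connection), and the interval and threshold values are arbitrary. Even if the query itself gets frozen, the gap between two completed samples would show how long the stall lasted.

```python
import time


def poll(sample, interval=0.1, iterations=50, stall_factor=3.0,
         clock=time.monotonic, sleep=time.sleep):
    """Call sample() about every `interval` seconds and report stalls.

    A stall is recorded when the time between two completed samples
    exceeds interval * stall_factor. This covers both a delayed wake-up
    and a sample() call that itself blocked.
    Returns a list of (completion_time, gap_seconds) tuples.
    """
    stalls = []
    previous = clock()
    for _ in range(iterations):
        sleep(interval)
        sample()  # hypothetical: e.g. run the pg_stat_activity query
        now = clock()
        gap = now - previous
        if gap > interval * stall_factor:
            stalls.append((now, gap))
        previous = now
    return stalls
```

With the defaults this takes 50 samples about 100 ms apart and flags any pair of samples more than 300 ms apart.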
Anyway, I have data from 22:09:22 and 22:09:51. In both cases only
4 non-idle backends.
6 of them had NULL in wait_event*
one was Client/ClientRead and one was IPC/BgWorkerShutdown.
State_change for the IPC/BgWorkerShutdown backend was 2025-08-19
22:09:51.79504+00 so it was well past the moment when the problem
struck.
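
Since the pg_stat_activity snapshots are too sparse, the log itself is probably the better source. The grep quoted above counts log lines per millisecond. A complementary sketch is to look for gaps between consecutive log timestamps. It assumes every log entry starts with a timestamp in the format shown in that grep output, skips lines that do not parse (for example continuation lines of multi-line messages), and the 200 ms threshold is an arbitrary choice.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TIMESTAMP_LENGTH = 23  # length of "2025-08-19 22:09:29.084"


def find_gaps(lines, threshold_ms=200):
    """Return (previous, current, gap_ms) for consecutive log timestamps
    that are at least threshold_ms apart."""
    gaps = []
    previous = None
    for line in lines:
        try:
            current = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # not the start of a log entry
        if previous is not None:
            gap_ms = (current - previous).total_seconds() * 1000
            if gap_ms >= threshold_ms:
                gaps.append((previous, current, gap_ms))
        previous = current
    return gaps
```

Fed with the lines of the csv log (for example the open file object), it would list every pair of adjacent entries with a long silence between them, such as the jump from 22:09:29.153 to 22:09:29.555 in the output above.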
Best regards,
depesz