public inbox for [email protected]help / color / mirror / Atom feed
Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug 27+ messages / 7 participants [nested] [flat]
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-20 15:14 Adrian Klaver <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-20 15:14 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/20/25 04:32, hubert depesz lubaczewski wrote: > On Tue, Aug 19, 2025 at 11:39:03AM -0700, Adrian Klaver wrote: >>> Every now and then (usually every 3-5 minutes, but not through the whole >>> day), we see situations where every query suddently takes ~ 1 second. >> Given the subject line, what you are reporting is happening on the replica, >> correct? > > Yes. > >> If so where is the replica relative to the primary, in terms of network >> distance? > > =$ ping -c 10 primary > reports: > 10 packets transmitted, 10 received, 0% packet loss, time 9181ms > rtt min/avg/max/mdev = 0.942/0.956/0.991/0.012 ms > >> Also what are the 'hardware' specifications on the replica instance? > > c8g.48xlarge ec2 instance. It is arm64, 192 cores, with 384 gb of ram. > > As for storage, this is relatitvely slow, because this db is rather > small: > gp3 500gb volume, with 6000 iops. At no point is IO in any way close to > limits, the whole db fits easily in ram. Hmm. From initial post: "For ~ 1 second there are no logs going to log (we usually have at 5-20 messages logged per second), no connection, nothing. And then we get bunch (30+) messages with the same milisecond time." Are the 30+ messages all coming in on one connection or multiple connections? Also to be clear these are statements that are being run on the replica locally, correct? Does the AWS monitoring indicate any issues? > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-20 16:08 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 3 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-20 16:08 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: > Hmm. > > From initial post: > > "For ~ 1 second there are no logs going to log (we usually have at 5-20 > messages logged per second), no connection, nothing. And then we get > bunch (30+) messages with the same milisecond time." > > Are the 30+ messages all coming in on one connection or multiple > connections? Multiple connections. > Also to be clear these are statements that are being run on the replica > locally, correct? What do you mean locally? Application servers are *not* on the same server as the db. So no, they are not local. > Does the AWS monitoring indicate any issues? Nope. All looks "fine". Same for our monitoring - CPU usage, iops, load. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-20 17:24 Rob Sargent <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 2 siblings, 0 replies; 27+ messages in thread From: Rob Sargent @ 2025-08-20 17:24 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> > On Aug 20, 2025, at 10:08 AM, hubert depesz lubaczewski <[email protected]> wrote: > > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: >> Hmm. >> >> From initial post: >> >> "For ~ 1 second there are no logs going to log (we usually have at 5-20 >> messages logged per second), no connection, nothing. And then we get >> bunch (30+) messages with the same milisecond time." Is that logger ms or db server ms? The latter seems unlikely to me. ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-20 17:30 Thom Brown <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 2 siblings, 1 reply; 27+ messages in thread From: Thom Brown @ 2025-08-20 17:30 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> On Wed, 20 Aug 2025 at 17:08, hubert depesz lubaczewski <[email protected]> wrote: > > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: > > Hmm. > > > > From initial post: > > > > "For ~ 1 second there are no logs going to log (we usually have at 5-20 > > messages logged per second), no connection, nothing. And then we get > > bunch (30+) messages with the same milisecond time." > > > > Are the 30+ messages all coming in on one connection or multiple > > connections? > > Multiple connections. > > > Also to be clear these are statements that are being run on the replica > > locally, correct? > > What do you mean locally? > > Application servers are *not* on the same server as the db. So no, they > are not local. > > > Does the AWS monitoring indicate any issues? > > Nope. All looks "fine". > > Same for our monitoring - CPU usage, iops, load. > > Best regards, > > depesz Do you have THP enabled? Can you use mpstat and see what %steal shows as? Thom ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-20 17:45 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 2 siblings, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-20 17:45 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/20/25 09:08, hubert depesz lubaczewski wrote: > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: >> Hmm. >> >> From initial post: >> >> "For ~ 1 second there are no logs going to log (we usually have at 5-20 >> messages logged per second), no connection, nothing. And then we get >> bunch (30+) messages with the same milisecond time." >> >> Are the 30+ messages all coming in on one connection or multiple >> connections? > > Multiple connections. > >> Also to be clear these are statements that are being run on the replica >> locally, correct? > > What do you mean locally? I should have been clearer. Are the queries being run against the replica or the primary? > > Application servers are *not* on the same server as the db. So no, they > are not local. Where are the application servers relative to the replica server? How many applications servers are hitting the database? > >> Does the AWS monitoring indicate any issues? > > Nope. All looks "fine". > > Same for our monitoring - CPU usage, iops, load. > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 10:03 hubert depesz lubaczewski <[email protected]> parent: Thom Brown <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-21 10:03 UTC (permalink / raw) To: Thom Brown <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote: > Do you have THP enabled? Can you use mpstat and see what %steal shows as? Sorry, what is "THP"? I tried searching for "what is thp", and most common search results are related to some chemical compound. mpstat, yes, we have it. Let me quickly show what I get: Two days ago, at ~ 10:10pm UTC I saw this in Pg logs: # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d' 2 2025-08-19 22:09:29.084 UTC 1 2025-08-19 22:09:29.094 UTC 2 2025-08-19 22:09:29.097 UTC 70 2025-08-19 22:09:29.109 UTC 90 2025-08-19 22:09:29.110 UTC 6 2025-08-19 22:09:29.111 UTC 1 2025-08-19 22:09:29.153 UTC 1 2025-08-19 22:09:29.555 UTC As you can see we have 70, and then 90 messages all logged with the same timestamp. All of them (160) were "duration:x" lines. At the time, mpstat (called using `mpstat 5`) looked like this: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 22:09:04 all 2.25 0.00 0.27 0.24 0.00 0.10 0.00 0.00 0.00 97.14 22:09:09 all 2.70 0.00 0.28 0.27 0.00 0.12 0.00 0.00 0.00 96.63 22:09:14 all 2.83 0.00 0.63 0.23 0.00 0.09 0.00 0.00 0.00 96.22 22:09:19 all 2.66 0.00 0.28 0.23 0.00 0.10 0.00 0.00 0.00 96.74 22:09:24 all 2.36 0.00 0.27 0.21 0.00 0.10 0.00 0.00 0.00 97.06 22:09:29 all 2.49 0.00 0.40 0.40 0.00 0.11 0.00 0.00 0.00 96.60 22:09:34 all 2.65 0.00 0.28 0.23 0.00 0.11 0.00 0.00 0.00 96.73 22:09:39 all 2.26 0.00 0.26 0.21 0.00 0.09 0.00 0.00 0.00 97.19 22:09:44 all 2.31 0.00 0.25 0.22 0.00 0.09 0.00 0.00 0.00 97.13 22:09:49 all 2.44 0.00 0.22 0.24 0.00 0.08 0.00 0.00 0.00 97.02 22:09:54 all 2.23 0.00 0.23 0.28 0.00 0.10 0.00 0.00 0.00 97.16 22:09:59 all 2.48 0.00 0.26 0.30 0.00 0.11 0.00 0.00 0.00 96.85 22:10:04 all 2.29 0.00 0.28 0.26 0.00 0.12 0.00 0.00 0.00 97.06 22:10:09 all 2.31 0.00 0.22 0.21 0.00 0.11 0.00 0.00 0.00 97.14 22:10:14 all 2.38 0.00 0.25 0.21 0.00 0.10 0.00 0.00 0.00 97.05 22:10:19 all 2.54 0.00 0.24 0.20 0.00 0.11 0.00 0.00 0.00 96.91 22:10:24 all 2.26 0.00 0.25 0.22 0.00 0.10 0.00 0.00 0.00 97.17 22:10:29 all 2.12 0.00 0.27 0.24 0.00 0.09 0.00 0.00 0.00 97.27 22:10:34 all 2.56 0.00 0.26 0.26 0.00 0.12 0.00 0.00 0.00 96.80 22:10:39 all 2.16 0.00 0.23 0.24 0.00 0.10 0.00 0.00 0.00 97.27 22:10:44 all 2.37 0.00 0.26 0.22 0.00 0.10 0.00 0.00 0.00 97.05 22:10:49 all 2.25 0.00 0.25 0.61 0.00 0.09 0.00 0.00 0.00 96.80 22:10:54 all 2.41 0.00 0.28 0.22 0.00 0.10 0.00 0.00 0.00 96.99 22:10:59 all 2.83 0.00 0.29 0.19 0.00 0.12 0.00 0.00 0.00 96.57 Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 10:07 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-21 10:07 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote: > On 8/20/25 09:08, hubert depesz lubaczewski wrote: > > On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: > > > Hmm. > > > > > > From initial post: > > > > > > "For ~ 1 second there are no logs going to log (we usually have at 5-20 > > > messages logged per second), no connection, nothing. And then we get > > > bunch (30+) messages with the same milisecond time." > > > Are the 30+ messages all coming in on one connection or multiple > > > connections? > > Multiple connections. > > > Also to be clear these are statements that are being run on the replica > > > locally, correct? > > What do you mean locally? > I should have been clearer. Are the queries being run against the replica or > the primary? All to replica. Primary has its own work, of course, but the problem we're experiencing is on replicas. > > Application servers are *not* on the same server as the db. So no, they > > are not local. > Where are the application servers relative to the replica server? Well, there is a lot of them. All are in AWS EC2, same region, various availability zones. Generally we use 3 AZs, with more or less equal split, so I'd say ~ 33% of app servers is within the same AZ. > How many applications servers are hitting the database? To be honest, I'm not sure. I have visibility into dbs, and bouncers, not really into Apps. I know that these are automatically dynamically scaled, so number of app server is very varying. I'd say anything from 40 to 200 app servers hit first layer of bouncers, which we usually have 6-9 (2-3 per az). These go to 2nd layer of bouncers, on the db server itself. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 11:41 Thom Brown <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: Thom Brown @ 2025-08-21 11:41 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> On Thu, 21 Aug 2025 at 11:03, hubert depesz lubaczewski <[email protected]> wrote: > > On Wed, Aug 20, 2025 at 06:30:00PM +0100, Thom Brown wrote: > > Do you have THP enabled? Can you use mpstat and see what %steal shows as? > > Sorry, what is "THP"? I tried searching for "what is thp", and most > common search results are related to some chemical compound. Ah, yeah I meant transparent hugepage: cat /sys/kernel/mm/transparent_hugepage/enabled This should show it being set as "never". > > mpstat, yes, we have it. Let me quickly show what I get: > > Two days ago, at ~ 10:10pm UTC I saw this in Pg logs: > > # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d' > 2 2025-08-19 22:09:29.084 UTC > 1 2025-08-19 22:09:29.094 UTC > 2 2025-08-19 22:09:29.097 UTC > 70 2025-08-19 22:09:29.109 UTC > 90 2025-08-19 22:09:29.110 UTC > 6 2025-08-19 22:09:29.111 UTC > 1 2025-08-19 22:09:29.153 UTC > 1 2025-08-19 22:09:29.555 UTC > > As you can see we have 70, and then 90 messages all logged with the same > timestamp. All of them (160) were "duration:x" lines. > > At the time, mpstat (called using `mpstat 5`) looked like this: > > CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle > 22:09:04 all 2.25 0.00 0.27 0.24 0.00 0.10 0.00 0.00 0.00 97.14 > 22:09:09 all 2.70 0.00 0.28 0.27 0.00 0.12 0.00 0.00 0.00 96.63 > 22:09:14 all 2.83 0.00 0.63 0.23 0.00 0.09 0.00 0.00 0.00 96.22 > 22:09:19 all 2.66 0.00 0.28 0.23 0.00 0.10 0.00 0.00 0.00 96.74 > 22:09:24 all 2.36 0.00 0.27 0.21 0.00 0.10 0.00 0.00 0.00 97.06 > 22:09:29 all 2.49 0.00 0.40 0.40 0.00 0.11 0.00 0.00 0.00 96.60 > 22:09:34 all 2.65 0.00 0.28 0.23 0.00 0.11 0.00 0.00 0.00 96.73 > 22:09:39 all 2.26 0.00 0.26 0.21 0.00 0.09 0.00 0.00 0.00 97.19 > 22:09:44 all 2.31 0.00 0.25 0.22 0.00 0.09 0.00 0.00 0.00 97.13 > 22:09:49 all 2.44 0.00 0.22 0.24 0.00 0.08 0.00 0.00 0.00 97.02 > 22:09:54 all 2.23 0.00 0.23 0.28 0.00 0.10 0.00 0.00 0.00 97.16 > 22:09:59 all 2.48 0.00 0.26 0.30 0.00 0.11 0.00 0.00 0.00 96.85 > 22:10:04 all 2.29 0.00 0.28 0.26 0.00 0.12 0.00 0.00 0.00 97.06 > 22:10:09 all 2.31 0.00 0.22 0.21 0.00 0.11 0.00 0.00 0.00 97.14 > 22:10:14 all 2.38 0.00 0.25 0.21 0.00 0.10 0.00 0.00 0.00 97.05 > 22:10:19 all 2.54 0.00 0.24 0.20 0.00 0.11 0.00 0.00 0.00 96.91 > 22:10:24 all 2.26 0.00 0.25 0.22 0.00 0.10 0.00 0.00 0.00 97.17 > 22:10:29 all 2.12 0.00 0.27 0.24 0.00 0.09 0.00 0.00 0.00 97.27 > 22:10:34 all 2.56 0.00 0.26 0.26 0.00 0.12 0.00 0.00 0.00 96.80 > 22:10:39 all 2.16 0.00 0.23 0.24 0.00 0.10 0.00 0.00 0.00 97.27 > 22:10:44 all 2.37 0.00 0.26 0.22 0.00 0.10 0.00 0.00 0.00 97.05 > 22:10:49 all 2.25 0.00 0.25 0.61 0.00 0.09 0.00 0.00 0.00 96.80 > 22:10:54 all 2.41 0.00 0.28 0.22 0.00 0.10 0.00 0.00 0.00 96.99 > 22:10:59 all 2.83 0.00 0.29 0.19 0.00 0.12 0.00 0.00 0.00 96.57 This output looks fine, so it doesn't show anything concerning, so suggests the issue is somehow on the Postgres side. Did you happen to poll pg_stat_activity at the time to see whether you had lots of IPC waits? I'm wondering whether the storage layer is freezing up for a moment. Thom ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 12:03 hubert depesz lubaczewski <[email protected]> parent: Thom Brown <[email protected]> 0 siblings, 0 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-21 12:03 UTC (permalink / raw) To: Thom Brown <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> On Thu, Aug 21, 2025 at 12:41:44PM +0100, Thom Brown wrote: > Ah, yeah I meant transparent hugepage: > cat /sys/kernel/mm/transparent_hugepage/enabled > This should show it being set as "never". Ah. Sorry, couldn't decipher. Yes, it's "never". > > # grep -oP '^2025-08-19 22:09:2\d\.\d+ UTC' postgresql-2025-08-19_220000.csv | uniq -c | grep -C3 -P '^\s*\d\d' > > 2 2025-08-19 22:09:29.084 UTC > > 1 2025-08-19 22:09:29.094 UTC > > 2 2025-08-19 22:09:29.097 UTC > > 70 2025-08-19 22:09:29.109 UTC > > 90 2025-08-19 22:09:29.110 UTC > > 6 2025-08-19 22:09:29.111 UTC > > 1 2025-08-19 22:09:29.153 UTC > > 1 2025-08-19 22:09:29.555 UTC … > > 22:10:54 all 2.41 0.00 0.28 0.22 0.00 0.10 0.00 0.00 0.00 96.99 > > 22:10:59 all 2.83 0.00 0.29 0.19 0.00 0.12 0.00 0.00 0.00 96.57 > > This output looks fine, so it doesn't show anything concerning, so > suggests the issue is somehow on the Postgres side. > > Did you happen to poll pg_stat_activity at the time to see whether you > had lots of IPC waits? I'm wondering whether the storage layer is > freezing up for a moment. So, we get select * from pg_stat_activity, for client backends that are not idle, every 29 seconds. So, 1 second "freeze" is impossible to cathc. Plus - I suspect that if I ran select * from pg_stat_activity while "in freeze", it would also get frozen. Anyway, I have data from 22:09:22 and 22:09:51. In both cases only 4 non-idle backend. 6 of them had NULL in wait_event* one was Client/ClientRead and one was IPC/BgWorkerShutdown. State_change for the IPC/BgWorkerShutdown backend was 2025-08-19 22:09:51.79504+00 so it was well past the moment when the problem struck. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 15:04 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-21 15:04 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/21/25 03:07, hubert depesz lubaczewski wrote: > On Wed, Aug 20, 2025 at 10:45:13AM -0700, Adrian Klaver wrote: >> On 8/20/25 09:08, hubert depesz lubaczewski wrote: >>> On Wed, Aug 20, 2025 at 08:14:47AM -0700, Adrian Klaver wrote: >>>> Hmm. >>>> >>>> From initial post: >>>> >>>> "For ~ 1 second there are no logs going to log (we usually have at 5-20 >>>> messages logged per second), no connection, nothing. And then we get >>>> bunch (30+) messages with the same milisecond time." >>>> Are the 30+ messages all coming in on one connection or multiple >>>> connections? >>> Multiple connections. >>>> Also to be clear these are statements that are being run on the replica >>>> locally, correct? >>> What do you mean locally? >> I should have been clearer. Are the queries being run against the replica or >> the primary? > > All to replica. Primary has its own work, of course, but the problem > we're experiencing is on replicas. If I am following there is more then one primary --> replica pair and the problem exists across all the pairs. >> How many applications servers are hitting the database? > > To be honest, I'm not sure. I have visibility into dbs, and bouncers, > not really into Apps. I know that these are automatically dynamically > scaled, so number of app server is very varying. > > I'd say anything from 40 to 200 app servers hit first layer of bouncers, > which we usually have 6-9 (2-3 per az). > > These go to 2nd layer of bouncers, on the db server itself. By bouncer I assume you mean something like pgBouncer, a connection pooler. Is it possible to determine what bouncer the queries in question are coming from? > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 15:13 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-21 15:13 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote: > > > > > "For ~ 1 second there are no logs going to log (we usually have at 5-20 > > > > > messages logged per second), no connection, nothing. And then we get > > > > > bunch (30+) messages with the same milisecond time." > > > > > Are the 30+ messages all coming in on one connection or multiple > > > > > connections? > > > > Multiple connections. > > > > > Also to be clear these are statements that are being run on the replica > > > > > locally, correct? > > > > What do you mean locally? > > > I should have been clearer. Are the queries being run against the replica or > > > the primary? > > All to replica. Primary has its own work, of course, but the problem > > we're experiencing is on replicas. > > If I am following there is more then one primary --> replica pair and the > problem exists across all the pairs. Not all. We have ~ 300 such clusters. The thing doesn't cause any customer-visible issues (after all it's just 1 second delay every so often), so it's generally overlooked when it happens. But we were paying closer attention to one such cluster, and then couple of other, and we've seen this behavior. > > > How many applications servers are hitting the database? > > > > To be honest, I'm not sure. I have visibility into dbs, and bouncers, > > not really into Apps. I know that these are automatically dynamically > > scaled, so number of app server is very varying. > > > > I'd say anything from 40 to 200 app servers hit first layer of bouncers, > > which we usually have 6-9 (2-3 per az). > > > > These go to 2nd layer of bouncers, on the db server itself. > > By bouncer I assume you mean something like pgBouncer, a connection pooler. > Is it possible to determine what bouncer the queries in question are coming > from? From the POV of db, all queries are coming from one of N localhost bouncers. N is usually 2…6. From the POV of the local bouncers, the queries come from range of remote bouncers. Generally we haven't seen any correlation between queries coming from specific ranges of ips. Logged queries, the ones that we see with runtime of 1s, have comments that indicate source, and they some from "all-around". Specifically, "DISCARD ALL" queries are generated by bouncers themselves (both layers). Just so that it will be clear, I don't expect anyone to be able to diagnose the problem based on description. I'm looking more into idea what to look for. The issue is that with the situation being pretty short, and happening on servers with non-trivial query load, I can't do stuff, like, for example, strace, stuff. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 15:59 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-21 15:59 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/21/25 08:13, hubert depesz lubaczewski wrote: > On Thu, Aug 21, 2025 at 08:04:25AM -0700, Adrian Klaver wrote: >> By bouncer I assume you mean something like pgBouncer, a connection pooler. >> Is it possible to determine what bouncer the queries in question are coming >> from? > > From the POV of db, all queries are coming from one of N localhost > bouncers. N is usually 2…6. > From the POV of the local bouncers, the queries come from range of > remote bouncers. > > Generally we haven't seen any correlation between queries coming from > specific ranges of ips. Logged queries, the ones that we see with > runtime of 1s, have comments that indicate source, and they some from > "all-around". Specifically, "DISCARD ALL" queries are generated by > bouncers themselves (both layers). Well so much for that theory:) > > Just so that it will be clear, I don't expect anyone to be able to > diagnose the problem based on description. I'm looking more into idea > what to look for. The issue is that with the situation being pretty > short, and happening on servers with non-trivial query load, I can't do > stuff, like, for example, strace, stuff. Getting to the bottom of the bag of ideas: Have you looked at the OS system log for the time period involved? You mentioned this seemed to involve PREPARE and DISCARD ALL. Is this the same set of statements or is it all over the place? Also it would be helpful to know what bouncer you are actually using and what mode you are running in? > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 16:51 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-21 16:51 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote: > Getting to the bottom of the bag of ideas: > Have you looked at the OS system log for the time period involved? Yes. Mostly dmesg. Nothing interesting logged around the time. > You mentioned this seemed to involve PREPARE and DISCARD ALL. > Is this the same set of statements or is it all over the place? No. From what I can tell it's random sample. > Also it would be helpful to know what bouncer you are actually using and > what mode you are running in? pgBouncer, version 1.23.1. As for more... mostly transaction pooling. Applications go using transaction pooling, but people (dbas, ops) have session pooling. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 18:17 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 0 siblings, 2 replies; 27+ messages in thread From: Adrian Klaver @ 2025-08-21 18:17 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/21/25 09:51, hubert depesz lubaczewski wrote: > On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote: >> Getting to the bottom of the bag of ideas: >> Have you looked at the OS system log for the time period involved? > > Yes. Mostly dmesg. Nothing interesting logged around the time. > >> You mentioned this seemed to involve PREPARE and DISCARD ALL. >> Is this the same set of statements or is it all over the place? > > No. From what I can tell it's random sample. > >> Also it would be helpful to know what bouncer you are actually using and >> what mode you are running in? > > pgBouncer, version 1.23.1. As for more... mostly transaction pooling. > Applications go using transaction pooling, but people (dbas, ops) have > session pooling. Have you looked at?: https://www.pgbouncer.org/changelog.html#pgbouncer-124x To see if anything stands out. Then there is: https://www.pgbouncer.org/config.html#max_prepared_statements The below may also be worth looking at: https://github.com/pgbouncer/pgbouncer/pull/1144 I can't help thinking that there is a caching issue at stake, though that is just a guess. > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-21 18:38 Chris Wilson <[email protected]> parent: Adrian Klaver <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: Chris Wilson @ 2025-08-21 18:38 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: [email protected]; PostgreSQL General <[email protected]> If all your queries are coming through pgBouncer, and only those hang (the server itself responds if you connect directly to it), then it might be this pgBouncer issue: https://github.com/pgbouncer/pgbouncer/issues/1054 Although that issue is now "closed", because the invisible "debug" log message was upgraded to a warning (and I don't think that change is in any released version), the underlying problem still exists: pgbouncer hangs completely (stops forwarding packets) for a while if the PAM authentication queue becomes full. If you have a relatively slow PAM service (such as pam_ldap) then you can trigger it by opening ~100 connections to pgBouncer simultaneously (without waiting for previous ones to authenticate), something like this: for i in `seq 1 100`; do psql -h pgbouncer -p 6432 -U user db_name -c "SELECT 1" & done Thanks, Chris. On Thu, 21 Aug 2025 at 19:17, Adrian Klaver <[email protected]> wrote: > On 8/21/25 09:51, hubert depesz lubaczewski wrote: > > On Thu, Aug 21, 2025 at 08:59:03AM -0700, Adrian Klaver wrote: > >> Getting to the bottom of the bag of ideas: > >> Have you looked at the OS system log for the time period involved? > > > > Yes. Mostly dmesg. Nothing interesting logged around the time. > > > >> You mentioned this seemed to involve PREPARE and DISCARD ALL. > >> Is this the same set of statements or is it all over the place? > > > > No. From what I can tell it's random sample. > > > >> Also it would be helpful to know what bouncer you are actually using and > >> what mode you are running in? > > > > pgBouncer, version 1.23.1. As for more... mostly transaction pooling. > > Applications go using transaction pooling, but people (dbas, ops) have > > session pooling. > > Have you looked at?: > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x > > To see if anything stands out. > > Then there is: > > https://www.pgbouncer.org/config.html#max_prepared_statements > > The below may also be worth looking at: > > https://github.com/pgbouncer/pgbouncer/pull/1144 > > I can't help thinking that there is a caching issue at stake, though > that is just a guess. > > > > > > Best regards, > > > > depesz > > > > > -- > Adrian Klaver > [email protected] > > > ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 12:37 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 12:37 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]> On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: > Have you looked at?: > https://www.pgbouncer.org/changelog.html#pgbouncer-124x > To see if anything stands out. > Then there is: > https://www.pgbouncer.org/config.html#max_prepared_statements > The below may also be worth looking at: > https://github.com/pgbouncer/pgbouncer/pull/1144 > I can't help thinking that there is a caching issue at stake, though that is > just a guess. Will check, but we don't use prepared statements, so this doesn't seem to be part-of-the-problem. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 12:39 hubert depesz lubaczewski <[email protected]> parent: Chris Wilson <[email protected]> 0 siblings, 0 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 12:39 UTC (permalink / raw) To: Chris Wilson <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]> On Thu, Aug 21, 2025 at 07:38:34PM +0100, Chris Wilson wrote: > If all your queries are coming through pgBouncer, and only those hang (the > server itself responds if you connect directly to it), then it might be > this pgBouncer issue: > > https://github.com/pgbouncer/pgbouncer/issues/1054 > > Although that issue is now "closed", because the invisible "debug" log > message was upgraded to a warning (and I don't think that change is in any > released version), the underlying problem still exists: pgbouncer hangs > completely (stops forwarding packets) for a while if the PAM > authentication queue becomes full. > > If you have a relatively slow PAM service (such as pam_ldap) then you can > trigger it by opening ~100 connections to pgBouncer simultaneously (without > waiting for previous ones to authenticate), something like this: > > for i in `seq 1 100`; do psql -h pgbouncer -p 6432 -U user db_name -c > "SELECT 1" & done Please note that during the time of problem *NOTHING* seems to be happening in Pg. There are no messages in logs about anything. We don't have direct access to db, so there is no sane way to check if directly sent queries will work. I can, of course, start shell, and run queries there, but ephemeral nature of the problem makes it more difficult. As for authentication - there is no ldap/pam in place. All auth is based on certs/passowrds-in-file, and we never noticed authentication slowdowns. Plus - queries like 'DISCARD ALL' are not ran because someone connected… Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 14:33 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 0 siblings, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-22 14:33 UTC (permalink / raw) To: [email protected]; +Cc: PostgreSQL General <[email protected]> On 8/22/25 05:37, hubert depesz lubaczewski wrote: > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: >> Have you looked at?: >> https://www.pgbouncer.org/changelog.html#pgbouncer-124x >> To see if anything stands out. >> Then there is: >> https://www.pgbouncer.org/config.html#max_prepared_statements >> The below may also be worth looking at: >> https://github.com/pgbouncer/pgbouncer/pull/1144 >> I can't help thinking that there is a caching issue at stake, though that is >> just a guess. > > Will check, but we don't use prepared statements, so this doesn't seem > to be part-of-the-problem. In your OP there is: "Every now and then (usually every 3-5 minutes, but not through the whole day), we see situations where every query suddently takes ~ 1 second. I'm talkign about "prepare" for preparing statements. binds. also "discard all". " Is that referring to something else? > > Best regards, > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:03 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 2 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 15:03 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On Fri, Aug 22, 2025 at 07:33:03AM -0700, Adrian Klaver wrote: > On 8/22/25 05:37, hubert depesz lubaczewski wrote: > > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: > > > Have you looked at?: > > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x > > > To see if anything stands out. > > > Then there is: > > > https://www.pgbouncer.org/config.html#max_prepared_statements > > > The below may also be worth looking at: > > > https://github.com/pgbouncer/pgbouncer/pull/1144 > > > I can't help thinking that there is a caching issue at stake, though that is > > > just a guess. > > > > Will check, but we don't use prepared statements, so this doesn't seem > > to be part-of-the-problem. > > In your OP there is: > > "Every now and then (usually every 3-5 minutes, but not through the whole > day), we see situations where every query suddently takes ~ 1 second. > > > I'm talkign about "prepare" for preparing statements. binds. also > "discard all". > " > > Is that referring to something else? No. Same problem. The thing is that it comes and goes. *BUT* I got repeatable case today. Is is breaking on its own everyy ~ 5 minutes. So I made test.sql with: \timing on SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000) i \gexec and ran it like this: psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /' This bypassed pgbouncer, connected via unix socket, the whole thing got simpler. After 4 minutes of 50ms runtimes I got: 2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms 2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms 2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms 2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms 2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394) 2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms 2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms 2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms 2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms This is pg_sleep. There are no locks, nothing else. I think this disproves idea that the problem is about pgbouncer, and it kinda looks like a thing in pg? Interestingly: ran strace on the pg backend that was doing this loop, like this: strace -tt -T -ff -p 1266521 -s 256 -o a.log And it showed something really strange: … 14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007> 14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063> 14:57:00.914823 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:00.914849 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007> 14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050064> 14:57:00.964978 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ 14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007> 14:57:00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063> 14:57:01.015133 futex(0xffe7a6636538, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = 0 <1.354927> 14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 UTC,\"postgres\",\"the.db\",1266521,\"[local]\",68a884fb.135359,5,\"SELECT\",2025-08-22 14:55:55 UTC,85/0,0,LOG,00000,\"duration: 1405.106 ms statement: SELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05); 14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\0\0\0\0\0\0\0\0\0\0", 64, 0, NULL, 0) = 64 <0.000041> 14:57:02.370639 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.371063 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000019> 14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050090> 14:57:02.421381 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ 14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008> 14:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u32=470567512, u64=187651886697048}}], 1, -1, NULL, 8) = 1 <0.000037> 14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000012> 14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050099> 14:57:02.471866 sendto(9, "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.471961 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008> … As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and sendto. all clear, nothing fancy. *BUT* in case of this one problematic thing there is futex() call that took 1.354s ?! Checked, and in straced ~ 800 queries this is the only futex call() ?! Any idea on what could it be? Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:07 Pavel Stehule <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: Pavel Stehule @ 2025-08-22 15:07 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> pá 22. 8. 2025 v 17:03 odesílatel hubert depesz lubaczewski < [email protected]> napsal: > On Fri, Aug 22, 2025 at 07:33:03AM -0700, Adrian Klaver wrote: > > On 8/22/25 05:37, hubert depesz lubaczewski wrote: > > > On Thu, Aug 21, 2025 at 11:17:27AM -0700, Adrian Klaver wrote: > > > > Have you looked at?: > > > > https://www.pgbouncer.org/changelog.html#pgbouncer-124x > > > > To see if anything stands out. > > > > Then there is: > > > > https://www.pgbouncer.org/config.html#max_prepared_statements > > > > The below may also be worth looking at: > > > > https://github.com/pgbouncer/pgbouncer/pull/1144 > > > > I can't help thinking that there is a caching issue at stake, though > that is > > > > just a guess. > > > > > > Will check, but we don't use prepared statements, so this doesn't seem > > > to be part-of-the-problem. > > > > In your OP there is: > > > > "Every now and then (usually every 3-5 minutes, but not through the whole > > day), we see situations where every query suddently takes ~ 1 second. > > > > > > I'm talkign about "prepare" for preparing statements. binds. also > > "discard all". > > " > > > > Is that referring to something else? > > No. Same problem. The thing is that it comes and goes. > > *BUT* > > I got repeatable case today. Is is breaking on its own everyy > ~ 5 minutes. > > So I made test.sql with: > \timing on > SELECT 'SELECT now(), pg_sleep(.05);' FROM generate_series(1,100000) i > \gexec > > and ran it like this: > > psql -p 5432 -d the_db -f test.sql -qAtX | perl -pe 's/\|\s*$/ :: /' > > This bypassed pgbouncer, connected via unix socket, the whole thing got > simpler. > > After 4 minutes of 50ms runtimes I got: > > 2025-08-22 14:44:14.013541+00 :: Time: 50.090 ms > 2025-08-22 14:44:14.063636+00 :: Time: 50.088 ms > 2025-08-22 14:44:14.11373+00 :: Time: 50.085 ms > 2025-08-22 14:44:14.163822+00 :: Time: 50.099 ms > 2025-08-22 14:44:14.213927+00 :: Time: 1393.932 ms (00:01.394) > 2025-08-22 14:44:15.607872+00 :: Time: 50.171 ms > 2025-08-22 14:44:15.658097+00 :: Time: 50.209 ms > 2025-08-22 14:44:15.708353+00 :: Time: 50.212 ms > 2025-08-22 14:44:15.758589+00 :: Time: 50.170 ms > > This is pg_sleep. There are no locks, nothing else. I think this > disproves idea that the problem is about pgbouncer, and it kinda looks > like a thing in pg? > > Interestingly: ran strace on the pg backend that was doing this loop, > like this: > strace -tt -T -ff -p 1266521 -s 256 -o a.log > > And it showed something really strange: > > … > 14:57:00.864684 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007> > 14:57:00.864739 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063> > 14:57:00.914823 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.86+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:00.914849 recvfrom(9, > "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", > 8192, 0, NULL, NULL) = 68 <0.000007> > 14:57:00.914891 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050064> > 14:57:00.964978 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.91+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ > 14:57:00.965006 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000007> > > > 14:57:00.965052 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050063> > 14:57:01.015133 futex(0xffe7a6636538, > FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY) = > 0 <1.354927> > 14:57:02.370180 write(2, "\0\0\36\1YS\23\0T2025-08-22 14:57:02.370 > UTC,\"postgres\",\"the.db\",1266521,\"[local]\",68a884fb.135359,5,\"SELECT\",2025-08-22 > 14:55:55 UTC,85/0,0,LOG,00000,\"duration: 1405.106 ms statement: SELECT > now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05); > 14:57:02.370521 sendto(8, "\2\0\0\0@\0\0\0\216A\0\0\0\0\0\0\t\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\255\221\33\0\0\0\0\0\262\341\6\0\0\0\0\0\0\0\0\0\0\0\0\0", > 64, 0, NULL, 0) = 64 <0.000041> > 14:57:02.370639 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:00.97+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.371063 recvfrom(9, > "Q\0\0\0CSELECT now()::timestamptz(2), pg_backend_pid(), pg_sleep(.05);\0", > 8192, 0, NULL, NULL) = 68 <0.000019> > 14:57:02.371215 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050090> > 14:57:02.421381 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:02.37+00\0\0\0\0071266521\0\0\0\0C\0\0\0\ > 14:57:02.421480 recvfrom(9, 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 > EAGAIN (Resource temporarily unavailable) <0.000008> > > > 14:57:02.421507 epoll_pwait(4, [{EPOLLIN, {u32=470567512, > u64=187651886697048}}], 1, -1, NULL, 8) = 1 <0.000037> > 14:57:02.421570 recvfrom(9, "Q\0\0\0CSELECT now()::timestamptz(2), > pg_backend_pid(), pg_sleep(.05);\0", 8192, 0, NULL, NULL) = 68 <0.000012> > 14:57:02.421673 epoll_pwait(11, [], 1, 50, NULL, 8) = 0 <0.050099> > 14:57:02.471866 sendto(9, > "T\0\0\0X\0\3now\0\0\0\0\0\0\0\0\0\4\240\0\10\0\0\0\2\0\0pg_backend_pid\0\0\0\0\0\0\0\0\0\0\27\0\4\377\377\377\377\0\0pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0002\0\3\0\0\0\0312025-08-22 > 14:57:02.42+00\0\0\0\0071266521\0\0\0\0C\0\0\0\14:57:02.471961 recvfrom(9, > 0xaaaae7b3fc08, 8192, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily > unavailable) <0.000008> > … > > As in, normal execution is: recv from, epoll_pwait that takes 0.05s, and > sendto. > all clear, nothing fancy. > > *BUT* > in case of this one problematic thing there is futex() call that took > 1.354s ?! Checked, and in straced ~ 800 queries this is the only futex > call() ?! > > Any idea on what could it be? > Few years ago I had a similar experience - some sec mystic lags - the problem was in virtualization. Regards Pavel > > Best regards, > > depesz > > > > ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:19 hubert depesz lubaczewski <[email protected]> parent: Pavel Stehule <[email protected]> 0 siblings, 0 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 15:19 UTC (permalink / raw) To: Pavel Stehule <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On Fri, Aug 22, 2025 at 05:07:09PM +0200, Pavel Stehule wrote: > > Any idea on what could it be? > Few years ago I had a similar experience - some sec mystic lags - the > problem was in virtualization. While I can't discount this idea, after all it is aws ec2, so virtual boxes, what makes me wonder is: in case of this slow situation there is futex call, which takes a second. But there are no futex calls in any other query processing logs. So, while it is obvious that futex "wasted" this time, the question is: why would pg run futex() for this one query at all? Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:21 Tom Lane <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: Tom Lane @ 2025-08-22 15:21 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> hubert depesz lubaczewski <[email protected]> writes: > I got repeatable case today. Is is breaking on its own everyy > ~ 5 minutes. Interesting. That futex call is presumably caused by interaction with some other process within the standby server, and the only plausible candidate really is the startup process (which is replaying WAL received from the primary). There are cases where WAL replay will take locks that can block queries on the standby. Can you correlate the delays on the standby server with any DDL events occurring on the primary? regards, tom lane ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:30 hubert depesz lubaczewski <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 2 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 15:30 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote: > hubert depesz lubaczewski <[email protected]> writes: > > I got repeatable case today. Is is breaking on its own everyy > > ~ 5 minutes. > > Interesting. That futex call is presumably caused by interaction > with some other process within the standby server, and the only > plausible candidate really is the startup process (which is replaying > WAL received from the primary). There are cases where WAL replay > will take locks that can block queries on the standby. Can you > correlate the delays on the standby server with any DDL events > occurring on the primary? Nope. Plus there is certain repetition of these cases, so even if I'd miss *some* create table/alter, it just isn't going to be happening every 4-5 minutes. For example, looking at logs for the last ~2h, and just checking situation when there are more than 20 messages in the same milisecond, I can see: 108 14:02:03.149 25 14:04:01.619 110 14:05:36.924 77 14:05:36.925 108 14:09:28.155 38 14:13:52.481 63 14:13:52.482 73 14:13:52.484 146 14:18:19.338 39 14:18:19.339 24 14:20:01.694 82 14:23:07.352 55 14:23:07.353 37 14:23:07.353 45 14:27:44.125 132 14:27:44.126 109 14:31:41.593 70 14:31:41.594 24 14:32:01.205 21 14:34:01.477 79 14:35:36.761 104 14:35:36.762 22 14:39:49.541 151 14:39:49.542 22 14:39:49.543 112 14:44:15.607 73 14:44:15.608 28 14:48:01.256 50 14:48:25.588 131 14:48:25.589 139 14:52:44.391 74 14:57:02.369 117 14:57:02.370 20 15:00:02.008 137 15:00:43.982 34 15:00:43.983 20 15:01:01.110 22 15:04:21.037 153 15:04:21.038 20 15:08:01.136 31 15:08:55.798 126 15:08:55.799 76 15:13:46.654 83 15:13:46.655 20 15:17:01.700 107 15:18:42.112 72 15:18:42.113 124 15:23:48.689 32 15:23:48.690 25 15:23:48.690 28 15:24:01.397 So, while there are outliers, I'd say that most of the problems happens every 3-5 minutes. depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:39 Adrian Klaver <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: Adrian Klaver @ 2025-08-22 15:39 UTC (permalink / raw) To: [email protected]; Tom Lane <[email protected]>; +Cc: PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On 8/22/25 08:30, hubert depesz lubaczewski wrote: > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote: >> hubert depesz lubaczewski <[email protected]> writes: >>> I got repeatable case today. Is is breaking on its own everyy >>> ~ 5 minutes. >> >> Interesting. That futex call is presumably caused by interaction >> with some other process within the standby server, and the only >> plausible candidate really is the startup process (which is replaying >> WAL received from the primary). There are cases where WAL replay >> will take locks that can block queries on the standby. Can you >> correlate the delays on the standby server with any DDL events >> occurring on the primary? > > Nope. Plus there is certain repetition of these cases, so even if I'd > miss *some* create table/alter, it just isn't going to be happening > every 4-5 minutes. > > So, while there are outliers, I'd say that most of the problems happens every > 3-5 minutes. Are you using the Postgres community version or the AWS variant? > > depesz > -- Adrian Klaver [email protected] ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:42 hubert depesz lubaczewski <[email protected]> parent: Adrian Klaver <[email protected]> 0 siblings, 0 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 15:42 UTC (permalink / raw) To: Adrian Klaver <[email protected]>; +Cc: Tom Lane <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On Fri, Aug 22, 2025 at 08:39:21AM -0700, Adrian Klaver wrote: > On 8/22/25 08:30, hubert depesz lubaczewski wrote: > > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote: > > > hubert depesz lubaczewski <[email protected]> writes: > > > > I got repeatable case today. Is is breaking on its own everyy > > > > ~ 5 minutes. > > > > > > Interesting. That futex call is presumably caused by interaction > > > with some other process within the standby server, and the only > > > plausible candidate really is the startup process (which is replaying > > > WAL received from the primary). There are cases where WAL replay > > > will take locks that can block queries on the standby. Can you > > > correlate the delays on the standby server with any DDL events > > > occurring on the primary? > > > > Nope. Plus there is certain repetition of these cases, so even if I'd > > miss *some* create table/alter, it just isn't going to be happening > > every 4-5 minutes. > > > > > So, while there are outliers, I'd say that most of the problems happens every > > 3-5 minutes. > > Are you using the Postgres community version or the AWS variant? Community. From pgdg repo. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 15:55 Tom Lane <[email protected]> parent: hubert depesz lubaczewski <[email protected]> 1 sibling, 1 reply; 27+ messages in thread From: Tom Lane @ 2025-08-22 15:55 UTC (permalink / raw) To: [email protected]; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> hubert depesz lubaczewski <[email protected]> writes: > On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote: >> Interesting. That futex call is presumably caused by interaction >> with some other process within the standby server, and the only >> plausible candidate really is the startup process (which is replaying >> WAL received from the primary). There are cases where WAL replay >> will take locks that can block queries on the standby. Can you >> correlate the delays on the standby server with any DDL events >> occurring on the primary? > Nope. Plus there is certain repetition of these cases, so even if I'd > miss *some* create table/alter, it just isn't going to be happening > every 4-5 minutes. Nonetheless, I'm suspecting an interaction with the startup process, because there just isn't that much else that this process could be needing to deal with. Can you try strace'ing both the process doing the test query and the startup process, to see what the startup process is doing at the times the futex calls happen? regards, tom lane ^ permalink raw reply [nested|flat] 27+ messages in thread
* Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug @ 2025-08-22 16:01 hubert depesz lubaczewski <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 0 replies; 27+ messages in thread From: hubert depesz lubaczewski @ 2025-08-22 16:01 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Adrian Klaver <[email protected]>; PostgreSQL General <[email protected]>; Chris Wilson <[email protected]> On Fri, Aug 22, 2025 at 11:55:18AM -0400, Tom Lane wrote: > Nonetheless, I'm suspecting an interaction with the startup process, > because there just isn't that much else that this process could be > needing to deal with. Can you try strace'ing both the process doing > the test query and the startup process, to see what the startup > process is doing at the times the futex calls happen? I can, but not today. Will get back to you about it, next week, though. Best regards, depesz ^ permalink raw reply [nested|flat] 27+ messages in thread
end of thread, other threads:[~2025-08-22 16:01 UTC | newest] Thread overview: 27+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2025-08-20 15:14 Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug Adrian Klaver <[email protected]> 2025-08-20 16:08 ` hubert depesz lubaczewski <[email protected]> 2025-08-20 17:24 ` Rob Sargent <[email protected]> 2025-08-20 17:30 ` Thom Brown <[email protected]> 2025-08-21 10:03 ` hubert depesz lubaczewski <[email protected]> 2025-08-21 11:41 ` Thom Brown <[email protected]> 2025-08-21 12:03 ` hubert depesz lubaczewski <[email protected]> 2025-08-20 17:45 ` Adrian Klaver <[email protected]> 2025-08-21 10:07 ` hubert depesz lubaczewski <[email protected]> 2025-08-21 15:04 ` Adrian Klaver <[email protected]> 2025-08-21 15:13 ` hubert depesz lubaczewski <[email protected]> 2025-08-21 15:59 ` Adrian Klaver <[email protected]> 2025-08-21 16:51 ` hubert depesz lubaczewski <[email protected]> 2025-08-21 18:17 ` Adrian Klaver <[email protected]> 2025-08-21 18:38 ` Chris Wilson <[email protected]> 2025-08-22 12:39 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 12:37 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 14:33 ` Adrian Klaver <[email protected]> 2025-08-22 15:03 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 15:07 ` Pavel Stehule <[email protected]> 2025-08-22 15:19 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 15:21 ` Tom Lane <[email protected]> 2025-08-22 15:30 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 15:39 ` Adrian Klaver <[email protected]> 2025-08-22 15:42 ` hubert depesz lubaczewski <[email protected]> 2025-08-22 15:55 ` Tom Lane <[email protected]> 2025-08-22 16:01 ` hubert depesz lubaczewski <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox