From: hubert depesz lubaczewski <[email protected]>
To: Tom Lane <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: PostgreSQL General <[email protected]>
Cc: Chris Wilson <[email protected]>
Subject: Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Date: Fri, 22 Aug 2025 17:30:22 +0200

On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> hubert depesz lubaczewski <[email protected]> writes:
> > I got a repeatable case today. It is breaking on its own every
> > ~ 5 minutes.
> 
> Interesting.  That futex call is presumably caused by interaction
> with some other process within the standby server, and the only
> plausible candidate really is the startup process (which is replaying
> WAL received from the primary).  There are cases where WAL replay
> will take locks that can block queries on the standby.  Can you
> correlate the delays on the standby server with any DDL events
> occurring on the primary?

Nope. Plus, these cases repeat with some regularity, so even if I
missed *some* create table/alter, DDL just isn't going to be happening
every 4-5 minutes.

For example, looking at the logs for the last ~2h, and checking only
the cases where there are at least 20 messages in the same millisecond,
I can see:

    108 14:02:03.149
     25 14:04:01.619
    110 14:05:36.924
     77 14:05:36.925
    108 14:09:28.155
     38 14:13:52.481
     63 14:13:52.482
     73 14:13:52.484
    146 14:18:19.338
     39 14:18:19.339
     24 14:20:01.694
     82 14:23:07.352
     55 14:23:07.353
     37 14:23:07.353
     45 14:27:44.125
    132 14:27:44.126
    109 14:31:41.593
     70 14:31:41.594
     24 14:32:01.205
     21 14:34:01.477
     79 14:35:36.761
    104 14:35:36.762
     22 14:39:49.541
    151 14:39:49.542
     22 14:39:49.543
    112 14:44:15.607
     73 14:44:15.608
     28 14:48:01.256
     50 14:48:25.588
    131 14:48:25.589
    139 14:52:44.391
     74 14:57:02.369
    117 14:57:02.370
     20 15:00:02.008
    137 15:00:43.982
     34 15:00:43.983
     20 15:01:01.110
     22 15:04:21.037
    153 15:04:21.038
     20 15:08:01.136
     31 15:08:55.798
    126 15:08:55.799
     76 15:13:46.654
     83 15:13:46.655
     20 15:17:01.700
    107 15:18:42.112
     72 15:18:42.113
    124 15:23:48.689
     32 15:23:48.690
     25 15:23:48.690
     28 15:24:01.397
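
A tally like the one above could be produced along these lines. This is only a minimal sketch, not the command actually used: it assumes every log line carries a time of day in HH:MM:SS.mmm form, the function name `bursts` is made up for illustration, and the threshold is treated as inclusive because the listing contains counts of exactly 20.

```python
import re
from collections import Counter

# Assumption: each log line contains a time of day such as 14:02:03.149
TIMESTAMP = re.compile(r"\b(\d{2}:\d{2}:\d{2}\.\d{3})\b")

def bursts(lines, threshold=20):
    """Return (count, timestamp) pairs for milliseconds that have
    at least `threshold` log messages, ordered by timestamp."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            # Key on the millisecond-resolution time of day
            counts[match.group(1)] += 1
    return [(n, ts) for ts, n in sorted(counts.items()) if n >= threshold]
```

Feeding it an open log file, e.g. `bursts(open(path))` with `path` pointing at the standby's log, and printing each pair gives output in the same count/timestamp shape as the listing.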

So, while there are outliers, I'd say that most of the problems happen every
3-5 minutes.
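
To check the spacing rather than eyeball it, the gaps between bursts can be computed from the timestamps in the listing above. Again a rough sketch under assumptions: `burst_gaps` is a hypothetical helper, all timestamps are taken to fall on the same day, and entries within one second of the start of a burst are treated as part of that burst.

```python
from datetime import datetime, timedelta

def burst_gaps(timestamps, merge_window=timedelta(seconds=1)):
    """Collapse timestamps that fall within `merge_window` of the start
    of the current burst, then return the gaps between burst starts."""
    times = sorted(datetime.strptime(ts, "%H:%M:%S.%f") for ts in timestamps)
    events = []
    for t in times:
        # Start a new burst only if this entry is far enough from the last one
        if not events or t - events[-1] > merge_window:
            events.append(t)
    return [later - earlier for earlier, later in zip(events, events[1:])]

# First few timestamps from the listing above
sample = ["14:02:03.149", "14:04:01.619", "14:05:36.924",
          "14:05:36.925", "14:09:28.155"]
for gap in burst_gaps(sample):
    print(gap)
```

Filtering the input by message count first (for example keeping only the entries with 100+ messages) would separate the large bursts from the smaller outliers before measuring the intervals.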

depesz