Re: Implement waiting for wal lsn replay: reloaded

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Xuneng Zhou <[email protected]>
To: Alexander Lakhin <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: Chao Li <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: jian he <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Yura Sokolov <[email protected]>
Subject: Re: Implement waiting for wal lsn replay: reloaded
Date: Tue, 19 May 2026 22:18:07 -0700
Message-ID: <CABPTF7X61iuyZT3K3rZssi9qyBsfo+cGs4vWBZ6KexHi1L6hMA@mail.gmail.com> (raw)
In-Reply-To: <CABPTF7V=q+OCSHLMV0v78UU=pz6b7rC81vJ5+XKnVkfAebyEwA@mail.gmail.com>
References: <CABPTF7Xs-64GQNjmbimZNhj2YSKbBny+evz6=cp3X2fkJS+vMQ@mail.gmail.com>
	<CABPTF7Ub=w7CRxi3sNv8oMGMh4hCqUTohuiTuP9Y1DpxRuFtRQ@mail.gmail.com>
	<CAPpHfduJKv9-R2HcpyX9RNgteLL0M1MPS1No1WLnTsegsbG4MQ@mail.gmail.com>
	<CABPTF7WWxgAAr5fT9TFciU+PzeRpC3Dp7SO60AV9XWx561TNKA@mail.gmail.com>
	<CABPTF7X0n=R50z2fBpj3EbYYz04Ab0-DHJa+JfoAEny62QmUdg@mail.gmail.com>
	<CABPTF7U+SUnJX_woQYGe==R9Oz+-V6X0VO2stBLPGfJmH_LEhw@mail.gmail.com>
	<CABPTF7UcuVD0L-X=jZFfeygjPaZWWkVRwtWOaJw2tcXbEN2xsA@mail.gmail.com>
	<CABPTF7Wdq6KbvC3EhLX3Pz=ODCCPEX7qVQ+E=cokkB91an2E-A@mail.gmail.com>
	<CAPpHfdv_BS7csGyg_=pPanRQM9Sf6_wBWNGdVzJRAv0U4eH9cg@mail.gmail.com>
	<CAPpHfds7oSCbZqob7ytT_Lso8fv-NW8LnedUTE4Krde+3rkJeA@mail.gmail.com>
	<CABPTF7WiDtWCR82geyaaaCRV9UiPR5YUHRNDysD_7Ltr1ymfug@mail.gmail.com>
	<CABPTF7Um7oRKBYmek_3gfbaMg5DLzHZmAX-GEacHASfD34xo5g@mail.gmail.com>
	<CABPTF7V-E_e3kQ2vtwUz6Jy7u-8_YeUT0SDoAbu7EKPgNp=ndA@mail.gmail.com>
	<CAPpHfdtNiSqQCu+YtTYcc+K4q9FwtZuAtQ5Qs+KoaZZM4QyYTA@mail.gmail.com>
	<[email protected]>
	<CABPTF7V=q+OCSHLMV0v78UU=pz6b7rC81vJ5+XKnVkfAebyEwA@mail.gmail.com>

On Tue, May 19, 2026 at 8:30 PM Xuneng Zhou <[email protected]> wrote:
>
> Hi,
>
> On Tue, May 19, 2026 at 1:00 PM Alexander Lakhin <[email protected]> wrote:
> >
> > Hello Alexander and Xuneng,
> >
> > 06.04.2026 22:49, Alexander Korotkov wrote:
> >
> > Thank you, I've pushed your version of patchset.  I made two minor
> > corrections for patch #2: mention default mode value in the header
> > comment, and fallback to polling on has_wal_read_bug sparc64+ext4 bug.
> >
> >
> > I discovered a new test failure, that is apparently caused by new
> > wait_for_catchup() implementation [1]:
> > [06:20:23.110](1.069s) not ok 8 - check that the slot state changes to "extended"
> > [06:20:23.110](0.001s) #   Failed test 'check that the slot state changes to "extended"'
> > #   at /Users/ec2-user/bf/goldfish/HEAD/pgsql/src/test/recovery/t/019_replslot_limit.pl line 140.
> > [06:20:23.111](0.000s) #          got: 'unreserved'
> > #     expected: 'extended'
> > [06:20:23.231](0.120s) not ok 9 - check that the slot state changes to "unreserved"
> > [06:20:23.231](0.000s) #   Failed test 'check that the slot state changes to "unreserved"'
> > #   at /Users/ec2-user/bf/goldfish/HEAD/pgsql/src/test/recovery/t/019_replslot_limit.pl line 152.
> > [06:20:23.231](0.000s) #          got: 'lost|'
> > #     expected: 'unreserved|t'
> >
> > I've managed to reproduce such failures with:
> > diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
> > index 07eac07b9ce..493ce92674e 100644
> > --- a/src/backend/replication/walreceiver.c
> > +++ b/src/backend/replication/walreceiver.c
> > @@ -1143,2 +1143,3 @@ XLogWalRcvSendReply(bool force, bool requestReply, bool checkApply)
> >
> > +pg_usleep(10000);
> >      /* Get current timestamp. */
> > diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
> > index 04aa770d981..19cda3a6b51 100644
> > --- a/src/backend/replication/walsender.c
> > +++ b/src/backend/replication/walsender.c
> > @@ -2521,2 +2521,3 @@ ProcessStandbyReplyMessage(void)
> >
> > +pg_usleep(100000);
> >      /* the caller already consumed the msgtype byte */
> >
> > Concretely, a loop:
> > for i in {1..100}; do echo "ITERATION $i"; PROVE_TESTS="t/019*" make -s check -C src/test/recovery/ || break; done
> > failed for me on iterations 2, 1, 7:
> > ITERATION 7
> > # +++ tap check in src/test/recovery +++
> > t/019_replslot_limit.pl .. 8/?
> > #   Failed test 'check that the slot state changes to "extended"'
> > #   at t/019_replslot_limit.pl line 140.
> > #          got: 'unreserved'
> > #     expected: 'extended'
> > t/019_replslot_limit.pl .. 26/? # Looks like you failed 1 test of 26.
> > t/019_replslot_limit.pl .. Dubious, test returned 1 (wstat 256, 0x100)
> > Failed 1/26 subtests
> >
> > With "WAIT FOR LSN" in wait_for_catchup() disabled, 100 iterations
> > passed.
> >
> > Having extra logging added, I could see the key difference.
> > Failed run:
> > 2026-05-19 22:01:37.968 EEST client backend[3632148] 019_replslot_limit.pl LOG:  !!!GetWALAvailability| targetLSN: 0/016C0000, targetSeg: 22, oldestSlotSeg: 23, oldestSegMaxWalSize: 24, oldestSeg: 22
> > 2026-05-19 22:01:37.968 EEST client backend[3632148] 019_replslot_limit.pl STATEMENT:  SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'
> > vs
> > Successful run:
> > 2026-05-19 22:04:18.102 EEST client backend[3633761] 019_replslot_limit.pl LOG:  !!!GetWALAvailability| targetLSN: 0/01700000, targetSeg: 23, oldestSlotSeg: 23, oldestSegMaxWalSize: 24, oldestSeg: 23
> > 2026-05-19 22:04:18.102 EEST client backend[3633761] 019_replslot_limit.pl STATEMENT:  SELECT wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'
> >
> > That is, with WAIT FOR LSN, primary in this test may advance
> > slot->data.restart_lsn to the expected position after wait_for_catchup()
> > returns.
> >
> > [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=goldfish&dt=2026-05-13%2006%3A15%3A03
>
> Thanks for reporting this issue.
>
> I think this is related to the semantic change made earlier:
> wait_for_catchup() now returns once the standby itself has reached the
> target LSN, rather than waiting until the primary observes that
> position via pg_stat_replication.
>
> As a result, the primary may not yet have processed the standby
> feedback needed to advance the slot's restart_lsn when
> wait_for_catchup() returns.
>
> Actually, I was aware of this semantic change and previously thought
> it might be harmless. But this failure appears to disprove that. I'll
> prepare a patch to fix this shortly.

After some consideration, 019_replslot_limit.pl appears to the better
place to place the fix rather than by restoring the old primary-side
polling barrier in wait_for_catchup().

The new wait_for_catchup() behavior is closer to its natural
semantics: for replay/write/flush modes, it waits until the standby
itself has reached the requested LSN. The old implementation used
pg_stat_replication on the primary, which also implied that the
primary had received and processed standby feedback. That was a useful
side effect for this test, but it is not required by most callers.

019_replslot_limit.pl is different because it checks primary-side slot
state. For a physical slot, restart_lsn advances only after the
primary's walsender processes standby feedback. So the test needs an
extra condition beyond ordinary standby catchup.

The patch makes that dependency explicit: wait for the standby to
replay the target LSN, then wait for the slot's restart_lsn on the
primary to pass the same LSN. This keeps wait_for_catchup() focused on
standby catchup while making the slot-specific synchronization visible
in the test that needs it.


-- 
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.


Attachments:

  [application/octet-stream] 0001-Stabilize-replslot-limit-test-after-standby-catchup.patch (3.2K, 2-0001-Stabilize-replslot-limit-test-after-standby-catchup.patch)
  download | inline diff:
From 41eb480a5c52c3ebcee4488e38bef5b8aed59ff5 Mon Sep 17 00:00:00 2001
From: Xuneng Zhou <[email protected]>
Date: Tue, 19 May 2026 22:11:04 -0700
Subject: [PATCH] Stabilize replslot limit test after standby catchup

019_replslot_limit.pl checks primary-side WAL availability for a
physical replication slot after stopping and restarting a standby.
wait_for_catchup() now waits for the standby to reach the target LSN
locally, so it can return before the primary has processed standby
feedback and advanced the slot's restart_lsn.

Make the test wait explicitly for that slot advancement before stopping
the standby.
---
 src/test/recovery/t/019_replslot_limit.pl | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 7b253e64d9c..0c43acdf8a7 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -12,6 +12,16 @@ use PostgreSQL::Test::Cluster;
 use Test::More;
 use Time::HiRes qw(usleep);
 
+sub wait_for_standby_and_slot_catchup
+{
+	my ($primary, $standby, $slot_name) = @_;
+
+	my $target_lsn = $primary->lsn('write');
+
+	$primary->wait_for_catchup($standby, 'replay', $target_lsn);
+	$primary->wait_for_slot_catchup($slot_name, 'restart', $target_lsn);
+}
+
 # Initialize primary node, setting wal-segsize to 1MB
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(allows_streaming => 1, extra => ['--wal-segsize=1']);
@@ -44,8 +54,9 @@ $node_standby->append_conf('postgresql.conf', "primary_slot_name = 'rep1'");
 
 $node_standby->start;
 
-# Wait until standby has replayed enough data
-$node_primary->wait_for_catchup($node_standby);
+# Wait until the standby has replayed enough data, and the primary has
+# processed feedback advancing the slot's restart_lsn.
+wait_for_standby_and_slot_catchup($node_primary, $node_standby, 'rep1');
 
 # Stop standby
 $node_standby->stop;
@@ -79,7 +90,7 @@ is($result, "reserved|t", 'check that slot is working');
 # The standby can reconnect to primary
 $node_standby->start;
 
-$node_primary->wait_for_catchup($node_standby);
+wait_for_standby_and_slot_catchup($node_primary, $node_standby, 'rep1');
 
 $node_standby->stop;
 
@@ -109,7 +120,7 @@ is($result, "reserved",
 
 # The standby can reconnect to primary
 $node_standby->start;
-$node_primary->wait_for_catchup($node_standby);
+wait_for_standby_and_slot_catchup($node_primary, $node_standby, 'rep1');
 $node_standby->stop;
 
 # wal_keep_size overrides max_slot_wal_keep_size
@@ -128,7 +139,7 @@ $result = $node_primary->safe_psql('postgres',
 
 # The standby can reconnect to primary
 $node_standby->start;
-$node_primary->wait_for_catchup($node_standby);
+wait_for_standby_and_slot_catchup($node_primary, $node_standby, 'rep1');
 $node_standby->stop;
 
 # Advance WAL again without checkpoint, reducing remain by 6 MB.
@@ -155,7 +166,7 @@ is($result, "unreserved|t",
 # The standby still can connect to primary before a checkpoint
 $node_standby->start;
 
-$node_primary->wait_for_catchup($node_standby);
+wait_for_standby_and_slot_catchup($node_primary, $node_standby, 'rep1');
 
 $node_standby->stop;
 
-- 
2.50.1 (Apple Git-155)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Implement waiting for wal lsn replay: reloaded
  In-Reply-To: <CABPTF7X61iuyZT3K3rZssi9qyBsfo+cGs4vWBZ6KexHi1L6hMA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox