Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQprW-001uXv-06 for pgsql-hackers@arkaria.postgresql.org; Sat, 23 May 2026 17:10:10 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wQprT-00Fzga-18 for pgsql-hackers@arkaria.postgresql.org; Sat, 23 May 2026 17:10:08 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wQprT-00FzgS-07 for pgsql-hackers@lists.postgresql.org; Sat, 23 May 2026 17:10:08 +0000 Received: from mail-ej1-x629.google.com ([2a00:1450:4864:20::629]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wQprR-000000014fc-3WIG for pgsql-hackers@lists.postgresql.org; Sat, 23 May 2026 17:10:07 +0000 Received: by mail-ej1-x629.google.com with SMTP id a640c23a62f3a-bcda7765d64so1702938766b.1 for ; Sat, 23 May 2026 10:10:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1779556204; cv=none; d=google.com; s=arc-20240605; b=EtXpPd5VKJkA5ynKmqWrG5ANW/52cqQb8f/cpIyHUOKs6IBoDyg8BpL+fC7UOPhi3u z1Z8+STaQMh72zAFLD1geRiow0FffvnPst3C2Ws4xTCjQQycbLw4cjK/XPTKUO565VYM sG341tY8jbY9KZr3B+1LfHIEgiLDQWPASDD0nrh8pPg6IyAajOT/mkhdCt7QGGk9EfrV 75sRHaCWNAL8DzXZSVE50SYCzvFwKls8Dnj2dvwqZit62KpjJI21061rheUNC4ihHU7B 2/Zqp8H2d+5/zI1xtQMb2rYxkaz6Qx81VkX9BDHBj9Zp6ES/zttVI04ZIs01dJyr/tga SogA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=zWjNK+21IirfQIi+JuR6ul3oYyiEkJk1GojyNamgZ8M=; fh=fQjaORg1rOMj5cqAwyxWqCzFjp8N1uulCYiwVU63U0I=; b=WPW3s59POyfiTeInBIt66j9xEdS2ROKGw5XfVVZl3LvelI+BRBjnMGCuY0489tZNyL mUNJfCb6GsV49/cKSU445Jec5gOrNhDJU+CrfHuzlyH1vAMNICWHGiLUPcPl1nytCyYM U+VcN9gKzr9fd36Jf6cD1NkqrW3R73z+eAUR6xQWCVOc4OINTHZ9/kbqt6itJH5/NOJi o5WvC3RU5rDSrTftsFP6faSyhZQ3G7xf7W/DVUFBPEEtRyW950L2EPPLePZA0cm6mhix jlSXfbfzhpObhSuB12ev6SaW7nBS+cIzestKSsTVZmHgqMadYCRCbBDTwCOmxQ+moCg/ 8now==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779556204; x=1780161004; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=zWjNK+21IirfQIi+JuR6ul3oYyiEkJk1GojyNamgZ8M=; b=IlxAjhoGnowDalAV4J4YCWG7xh7iNKpP4VMboZGQtEWcMqC7/pqXQ9xCuxeEnro19T 3PzunJ2RN796RMweb6t2R2BaoKTC450EGXcfLCshFRg+PqsP2GyA6ONqxxAKFwg+/q9+ UFp1qOgKchPn8EeQS0FwwCCQwRvcukaBZi6YtulW/Ar4VjD9vK4wdvC/7qGVa44HbWXI QNqhHHVgGATL9T61eKcCRLed5UAwsdYHjZyM7Tb0I27a1vJ6rPRY17sEKHlLbjgpoAg6 Ql2xKtBkQYO/mNzZD82v4T0sjhsCR5S1YMP8zZUSLTutMBzOEjMCZSyCl8nMwab7mgPl /uow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779556204; x=1780161004; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=zWjNK+21IirfQIi+JuR6ul3oYyiEkJk1GojyNamgZ8M=; b=cvIQkApygBHfTb1v13UHIxpZMSXQ9PHKZ6R9s38rE9DvtwkYot2msGoYE5f0zMhOhj XjJw299WhnVkXQ5wmadaXWm0AltLzSi8I/OqpJ2InYZoE8cPG5EZWRH7e1c/S6GHZQpl eTXZoG9iMqA6CKW/HFY62i0bqo2AIsJc6CCvlAdIXnm2vNTOfBdbF8junjhdsrNiuHRF cmQq7ny/VCuiqbgc/rxpjVhbiHnnSGqYFOyS95OBpRuQGp6rtcE1ypXQ7dfuSwY7ojo6 y3pRgLO49fUFnFi8DD+o+G6uvzzJsvmOW87UWOpbh2VKp/8s9Y0aFo6tqN2rFdaK0HCx 5/qA== X-Forwarded-Encrypted: i=1; AFNElJ8ybd1LMQmGgSDkxncb+Nuv0E5dKZ0XCWUO78nLTD4egtRFkYOF952hDTMalF00teMqnYvqaRejktp6tzo/@lists.postgresql.org X-Gm-Message-State: AOJu0YxdHiSk+VlqyMSrOWs8FPa84kAy1C0B0U+odDY+QxbYFs/8lSpm ltluyDiCxVWui7UzIHr6d4hT/I0EdbK1evAHbkFitI8F71Vy+8kry1dAxnBpCR+2kEf1MeyQkfS O1RyCsjafNYT+dk4QdL3A/iCLuSjaXkI= X-Gm-Gg: Acq92OGy+GYgvYyRa3Hpru2itAoANSCLAgrQtOPfWEDExzvFLCAn1Q6LlzCoeLVmG6E RPYDdjzly3FexEeaAOW1u1nLj+zPcNiJxyfJFVS02TqCoOe8UIbUZTVatQtS3bVFDMc6YfdTwJV 7/TXz8K47ExxHfuswHk9WWFr0X4lX2AMjXD6Ew0EwwMJgtWBsuT26yiRMZImLLdgdPKIKRCSqSA V//mX6WEZacT3R+Dfr7mHMCTayxDkxqTf29GdnyivTVbqHZYH/Tf2jNZFdVFStFEv/6hgwXm3D7 toj/FHlJlEN0JYPQqu0lpSd9kFX/0LzZB1ib12vJfLil513h01DVyttjCEmHQrKD9774UBY+P5S eHYWqQsQopRUUDcGalYQ= X-Received: by 2002:a17:907:26c9:b0:bd4:f3bc:c4b3 with SMTP id a640c23a62f3a-bdc0132e8ffmr625219266b.18.1779556204180; Sat, 23 May 2026 10:10:04 -0700 (PDT) MIME-Version: 1.0 References: <63f6abc9-c0ae-465d-a4e6-667eca6ea008@gmail.com> In-Reply-To: From: Xuneng Zhou Date: Sat, 23 May 2026 10:09:52 -0700 X-Gm-Features: AVHnY4JG3C8p5UR9kE0duUvk085Cr95sI2JpiO2ABZ4J1My9HzMc4pnQV0phgf0 Message-ID: Subject: Re: Implement waiting for wal lsn replay: reloaded To: Alexander Korotkov Cc: Alexander Lakhin , Heikki Linnakangas , Peter Eisentraut , Andres Freund , Thomas Munro , =?UTF-8?Q?=C3=81lvaro_Herrera?= , Chao Li , pgsql-hackers , Michael Paquier , jian he , Tomas Vondra , Yura Sokolov Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Alexander, On Wed, May 20, 2026 at 5:17=E2=80=AFAM Alexander Korotkov wrote: > > Hi, Xuneng! > > On Wed, May 20, 2026 at 8:18=E2=80=AFAM Xuneng Zhou wrote: > > On Tue, May 19, 2026 at 8:30=E2=80=AFPM Xuneng Zhou wrote: > > > > > > Hi, > > > > > > On Tue, May 19, 2026 at 1:00=E2=80=AFPM Alexander Lakhin wrote: > > > > > > > > Hello Alexander and Xuneng, > > > > > > > > 06.04.2026 22:49, Alexander Korotkov wrote: > > > > > > > > Thank you, I've pushed your version of patchset. I made two minor > > > > corrections for patch #2: mention default mode value in the header > > > > comment, and fallback to polling on has_wal_read_bug sparc64+ext4 b= ug. > > > > > > > > > > > > I discovered a new test failure, that is apparently caused by new > > > > wait_for_catchup() implementation [1]: > > > > [06:20:23.110](1.069s) not ok 8 - check that the slot state changes= to "extended" > > > > [06:20:23.110](0.001s) # Failed test 'check that the slot state c= hanges to "extended"' > > > > # at /Users/ec2-user/bf/goldfish/HEAD/pgsql/src/test/recovery/t/0= 19_replslot_limit.pl line 140. > > > > [06:20:23.111](0.000s) # got: 'unreserved' > > > > # expected: 'extended' > > > > [06:20:23.231](0.120s) not ok 9 - check that the slot state changes= to "unreserved" > > > > [06:20:23.231](0.000s) # Failed test 'check that the slot state c= hanges to "unreserved"' > > > > # at /Users/ec2-user/bf/goldfish/HEAD/pgsql/src/test/recovery/t/0= 19_replslot_limit.pl line 152. > > > > [06:20:23.231](0.000s) # got: 'lost|' > > > > # expected: 'unreserved|t' > > > > > > > > I've managed to reproduce such failures with: > > > > diff --git a/src/backend/replication/walreceiver.c b/src/backend/re= plication/walreceiver.c > > > > index 07eac07b9ce..493ce92674e 100644 > > > > --- a/src/backend/replication/walreceiver.c > > > > +++ b/src/backend/replication/walreceiver.c > > > > @@ -1143,2 +1143,3 @@ XLogWalRcvSendReply(bool force, bool requestR= eply, bool checkApply) > > > > > > > > +pg_usleep(10000); > > > > /* Get current timestamp. */ > > > > diff --git a/src/backend/replication/walsender.c b/src/backend/repl= ication/walsender.c > > > > index 04aa770d981..19cda3a6b51 100644 > > > > --- a/src/backend/replication/walsender.c > > > > +++ b/src/backend/replication/walsender.c > > > > @@ -2521,2 +2521,3 @@ ProcessStandbyReplyMessage(void) > > > > > > > > +pg_usleep(100000); > > > > /* the caller already consumed the msgtype byte */ > > > > > > > > Concretely, a loop: > > > > for i in {1..100}; do echo "ITERATION $i"; PROVE_TESTS=3D"t/019*" m= ake -s check -C src/test/recovery/ || break; done > > > > failed for me on iterations 2, 1, 7: > > > > ITERATION 7 > > > > # +++ tap check in src/test/recovery +++ > > > > t/019_replslot_limit.pl .. 8/? > > > > # Failed test 'check that the slot state changes to "extended"' > > > > # at t/019_replslot_limit.pl line 140. > > > > # got: 'unreserved' > > > > # expected: 'extended' > > > > t/019_replslot_limit.pl .. 26/? # Looks like you failed 1 test of 2= 6. > > > > t/019_replslot_limit.pl .. Dubious, test returned 1 (wstat 256, 0x1= 00) > > > > Failed 1/26 subtests > > > > > > > > With "WAIT FOR LSN" in wait_for_catchup() disabled, 100 iterations > > > > passed. > > > > > > > > Having extra logging added, I could see the key difference. > > > > Failed run: > > > > 2026-05-19 22:01:37.968 EEST client backend[3632148] 019_replslot_l= imit.pl LOG: !!!GetWALAvailability| targetLSN: 0/016C0000, targetSeg: 22, = oldestSlotSeg: 23, oldestSegMaxWalSize: 24, oldestSeg: 22 > > > > 2026-05-19 22:01:37.968 EEST client backend[3632148] 019_replslot_l= imit.pl STATEMENT: SELECT wal_status FROM pg_replication_slots WHERE slot_= name =3D 'rep1' > > > > vs > > > > Successful run: > > > > 2026-05-19 22:04:18.102 EEST client backend[3633761] 019_replslot_l= imit.pl LOG: !!!GetWALAvailability| targetLSN: 0/01700000, targetSeg: 23, = oldestSlotSeg: 23, oldestSegMaxWalSize: 24, oldestSeg: 23 > > > > 2026-05-19 22:04:18.102 EEST client backend[3633761] 019_replslot_l= imit.pl STATEMENT: SELECT wal_status FROM pg_replication_slots WHERE slot_= name =3D 'rep1' > > > > > > > > That is, with WAIT FOR LSN, primary in this test may advance > > > > slot->data.restart_lsn to the expected position after wait_for_catc= hup() > > > > returns. > > > > > > > > [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=3Dgoldf= ish&dt=3D2026-05-13%2006%3A15%3A03 > > > > > > Thanks for reporting this issue. > > > > > > I think this is related to the semantic change made earlier: > > > wait_for_catchup() now returns once the standby itself has reached th= e > > > target LSN, rather than waiting until the primary observes that > > > position via pg_stat_replication. > > > > > > As a result, the primary may not yet have processed the standby > > > feedback needed to advance the slot's restart_lsn when > > > wait_for_catchup() returns. > > > > > > Actually, I was aware of this semantic change and previously thought > > > it might be harmless. But this failure appears to disprove that. I'll > > > prepare a patch to fix this shortly. > > > > After some consideration, 019_replslot_limit.pl appears to the better > > place to place the fix rather than by restoring the old primary-side > > polling barrier in wait_for_catchup(). > > > > The new wait_for_catchup() behavior is closer to its natural > > semantics: for replay/write/flush modes, it waits until the standby > > itself has reached the requested LSN. The old implementation used > > pg_stat_replication on the primary, which also implied that the > > primary had received and processed standby feedback. That was a useful > > side effect for this test, but it is not required by most callers. > > > > 019_replslot_limit.pl is different because it checks primary-side slot > > state. For a physical slot, restart_lsn advances only after the > > primary's walsender processes standby feedback. So the test needs an > > extra condition beyond ordinary standby catchup. > > > > The patch makes that dependency explicit: wait for the standby to > > replay the target LSN, then wait for the slot's restart_lsn on the > > primary to pass the same LSN. This keeps wait_for_catchup() focused on > > standby catchup while making the slot-specific synchronization visible > > in the test that needs it. > > I agree with you. But do we actually need a > wait_for_standby_and_slot_catchup() wrapper. I think we can call > $node->wait_for_slot_catchup() directly and simplify the fix. Check > the attached patch. > The patch looks good to me. I agree that the wait_for_slot_catchup is not needed and could be misleading. This change would make the exact synchronization point and its intention clearer. The only price we need to pay here is bringing back the polling. But it seems acceptable since the cost was there in the pre-wait-for-lsn era. And thanks for writing the great commit message! --=20 Regards, Xuneng Zhou HighGo Software Co., Ltd.