From: Fujii Masao <[email protected]>
To: Anthonin Bonnefoy <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Alexander Lakhin <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record
Date: Mon, 16 Mar 2026 14:39:24 +0900
Message-ID: <CAHGQGwFsuPcC7Aov73FTqMC5LEOdZ6uoWno5BR1HX2LnZRNKBw@mail.gmail.com>
In-Reply-To: <CAHGQGwGpfEsBJBLQDou7xccnb0tj1T_NQv2gZfuQfgzn_b=RsQ@mail.gmail.com>
References: <CAHGQGwHFKF+x4E+SqedMCnmLCitxjTUUtSyL_+mMeuq-GbEt6w@mail.gmail.com>
	<CAO6_Xqp+ADb6KZVWLMALu3xmwVUEO8S1EiCnp38mG6BrHrEnuA@mail.gmail.com>
	<CAO6_XqqKDV+AuP=Gf4kRKPqzyYTsOyGd3LE8Jqkwi7EMPJpbhA@mail.gmail.com>
	<CAHGQGwHc5yH4Nxp59KXJP0kAr61j3W7QeSKT2HxVjZa3OrLzmg@mail.gmail.com>
	<CAO6_Xqq1h6kggb1o206rgouPS0H5jnjahzZ0We-9ggnBjB2JsA@mail.gmail.com>
	<CAHGQGwFJnNUOMiW9wR-2WjSKzzj0wV8p55J8bnJ6mik=z0oFPQ@mail.gmail.com>
	<[email protected]>
	<CAO6_Xqq73TPa3M6nQ7RqRhKkcphy1JX7aNGTYy-x_Sn+6a8Z_Q@mail.gmail.com>
	<CAHGQGwGvnpN=2bo+F7H90YLFcx9=SazwLkcx+0gEcrbQy5NVZg@mail.gmail.com>
	<CAHGQGwECpyJtMqkCEvyqgZDiwAeMj3RKobui7jONrDd35W0x3Q@mail.gmail.com>
	<vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux>
	<CAO6_Xqq=-KnxrFQvmyF++XH4ngtXVhcDB979rEn6SPtUhSmNYg@mail.gmail.com>
	<CAHGQGwGpfEsBJBLQDou7xccnb0tj1T_NQv2gZfuQfgzn_b=RsQ@mail.gmail.com>

On Fri, Mar 13, 2026 at 2:24 AM Fujii Masao <[email protected]> wrote:
> Thanks for investigating the issue and making the patch!
> It looks good to me.

Since Tomas added GetXLogInsertEndRecPtr() in commit b1f14c96720,
I updated the patch to use it. Patch attached.
Barring any objections, I will commit it.

-       XLogFlush(GetXLogWriteRecPtr());
+       XLogFlush(GetXLogInsertEndRecPtr());

I excluded the above change from the patch because it seems like a separate
issue. I also wonder whether this code could cause an error in XLogFlush()
even when GetXLogWriteRecPtr() is used.

Regards,

-- 
Fujii Masao


Attachments:

  [application/octet-stream] v2-0001-Fix-WAL-flush-LSN-used-by-logical-walsender-durin.patch (2.4K, 2-v2-0001-Fix-WAL-flush-LSN-used-by-logical-walsender-durin.patch)
From abafee8d06358f9a362373899c330d010d20804a Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Mon, 16 Mar 2026 13:15:13 +0900
Subject: [PATCH v2] Fix WAL flush LSN used by logical walsender during
 shutdown

Commit 6eedb2a5fd8 made the logical walsender call
XLogFlush(GetXLogInsertRecPtr()) to ensure that all pending WAL is flushed,
fixing a publisher shutdown hang. However, if the last WAL record ends at
a page boundary, GetXLogInsertRecPtr() can return an LSN pointing past
the page header, which can cause XLogFlush() to report an error.

A similar issue previously existed in the GiST code. Commit b1f14c96720
introduced GetXLogInsertEndRecPtr(), which returns a safe WAL insertion end
location (returning the start of the page when the last record ends at a page
boundary), and updated the GiST code to use it with XLogFlush().

This commit fixes the issue by making the logical walsender use
XLogFlush(GetXLogInsertEndRecPtr()) when flushing pending WAL during shutdown.

Backpatch to all supported versions.

Reported-by: Andres Freund <[email protected]>
Author: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux
---
 src/backend/replication/walsender.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 376ff46340d..08253103cb3 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1890,9 +1890,15 @@ WalSndWaitForWal(XLogRecPtr loc)
 		 * If we're shutting down, trigger pending WAL to be written out,
 		 * otherwise we'd possibly end up waiting for WAL that never gets
 		 * written, because walwriter has shut down already.
+		 *
+		 * Note that GetXLogInsertEndRecPtr() is used to obtain the WAL flush
+		 * request location instead of GetXLogInsertRecPtr(). Because if the
+		 * last WAL record ends at a page boundary, GetXLogInsertRecPtr() can
+		 * return an LSN pointing past the page header, which may cause
+		 * XLogFlush() to report an error.
 		 */
 		if (got_STOPPING && !RecoveryInProgress())
-			XLogFlush(GetXLogInsertRecPtr());
+			XLogFlush(GetXLogInsertEndRecPtr());
 
 		/*
 		 * To avoid the scenario where standbys need to catch up to a newer
-- 
2.51.2
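
The page-boundary behavior described in the commit message can be
illustrated with a small standalone model. This is only a sketch, not the
PostgreSQL implementation: the names toy_insert_ptr() and
toy_insert_end_ptr() are hypothetical, the 8192-byte page and 24-byte page
header are assumed values, and segment boundaries and long page headers
are ignored. It maps a count of usable WAL bytes (page headers excluded)
to an LSN in two ways: where the next record would start, and where the
last record ended.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for this toy model only; long headers are ignored. */
#define PAGE_SIZE   8192u
#define PAGE_HEADER 24u
#define USABLE      (PAGE_SIZE - PAGE_HEADER)

static_assert(PAGE_SIZE > PAGE_HEADER, "page must be larger than its header");

/*
 * Hypothetical analogue of an "insert position": the LSN at which the
 * next record would begin. At a page boundary this skips the header of
 * the next page, so it points past that header.
 */
static uint64_t
toy_insert_ptr(uint64_t bytepos)
{
    uint64_t fullpages = bytepos / USABLE;
    uint64_t offset = bytepos % USABLE;

    return fullpages * PAGE_SIZE + PAGE_HEADER + offset;
}

/*
 * Hypothetical analogue of an "insert end position": the LSN at which
 * the last record ended. At a page boundary this is the start of the
 * page, with no header skipped.
 */
static uint64_t
toy_insert_end_ptr(uint64_t bytepos)
{
    uint64_t fullpages = bytepos / USABLE;
    uint64_t offset = bytepos % USABLE;

    if (offset == 0)
        return fullpages * PAGE_SIZE;

    return fullpages * PAGE_SIZE + PAGE_HEADER + offset;
}
```

With these assumed sizes, a usable-byte position of 8168 (exactly one full
page) maps to 8216 under toy_insert_ptr() but to 8192 under
toy_insert_end_ptr(). For any position that is not at a page boundary the
two functions agree. The first value lies beyond any WAL actually written,
which is the kind of flush request location the patch avoids.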


