From: Anthonin Bonnefoy <[email protected]>
To: Fujii Masao <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Shutdown indefinitely stuck due to unflushed FPI_FOR_HINT record
Date: Tue, 3 Mar 2026 18:11:02 +0100
Message-ID: <CAO6_XqqKDV+AuP=Gf4kRKPqzyYTsOyGd3LE8Jqkwi7EMPJpbhA@mail.gmail.com> (raw)
In-Reply-To: <CAO6_Xqp+ADb6KZVWLMALu3xmwVUEO8S1EiCnp38mG6BrHrEnuA@mail.gmail.com>
References: <CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ@mail.gmail.com>
<CAO6_XqrZEREa5d+dyjahX6bteBhoN=8Jid-3a4f6Q35sWrv9eg@mail.gmail.com>
<CAHGQGwHFKF+x4E+SqedMCnmLCitxjTUUtSyL_+mMeuq-GbEt6w@mail.gmail.com>
<CAO6_Xqp+ADb6KZVWLMALu3xmwVUEO8S1EiCnp38mG6BrHrEnuA@mail.gmail.com>
Here's a small updated version of the patch, with the two different approaches.
- 0001: This is the approach that calls XLogSetAsyncXactLSN in
RecordTransactionAbort. The only difference from v3 is that the
'XactLastRecEnd != 0' condition is now merged with !isSubXact:
+if (!isSubXact && XactLastRecEnd != 0)
+{
+ XLogSetAsyncXactLSN(XactLastRecEnd);
XactLastRecEnd = 0;
+}
- 0002: This is the ShutdownXLOG approach. I've used
XLogFlush(WriteRqstPtr) instead of updating the async LSN. It feels
like if we're going to stop the walsenders, we may as well flush
everything and get the WAL in a good state.
The spinlock to access XLogCtl->LogwrtRqst.Write is probably
unnecessary since we're at a point where no additional WAL records
should be written, but it doesn't hurt to keep it for consistency.
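To make the difference between the two approaches concrete, here is a minimal sketch of how a background flush picks its target, following the behaviour described in the commit messages (completed pages first, then the latest known async LSN). This is a toy model, not the code in xlog.c: ToyLsn, TOY_BLCKSZ and toy_background_flush_target are hypothetical names, and the 8192-byte page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for XLogRecPtr (hypothetical name). */
typedef uint64_t ToyLsn;

/* Assumed WAL page size for this model. */
#define TOY_BLCKSZ 8192

/*
 * Simplified model of the target chosen by a background flush:
 *  1. back off the write request to the last completed page boundary;
 *  2. if that point is already flushed, fall back to the async LSN;
 *  3. if that is already flushed too, there is nothing to do.
 * Returns the LSN up to which WAL would be flushed.
 */
static ToyLsn
toy_background_flush_target(ToyLsn write_rqst, ToyLsn flushed, ToyLsn async_lsn)
{
    ToyLsn target;

    /* Precondition of the model: we never flush past what was requested. */
    assert(flushed <= write_rqst);

    target = write_rqst - (write_rqst % TOY_BLCKSZ);
    if (target <= flushed)
        target = async_lsn;
    if (target <= flushed)
        return flushed;     /* the partial page stays unflushed */
    return target;
}
```

In this model, with a record ending at 16500, completed pages flushed up to 16384 and a stale async LSN of 7000, the target stays at 16384 and the tail of the record is never flushed. Advancing the async LSN to 16500, which is what 0001 does on abort, moves the target to 16500. 0002 sidesteps the question by flushing up to the write request pointer directly.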
Regards,
Anthonin Bonnefoy
Attachments:
[application/octet-stream] v4-0002-Fix-stuck-shutdown-due-to-unflushed-records.patch (2.7K, 2-v4-0002-Fix-stuck-shutdown-due-to-unflushed-records.patch)
From 0db213c05c1fb5df84950159ad991059a40c3d71 Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <[email protected]>
Date: Tue, 3 Mar 2026 17:42:40 +0100
Subject: Fix stuck shutdown due to unflushed records
The shutdown sequence may get stuck indefinitely under the following
circumstances:
- Data checksums are enabled
- A logical replication walsender is running
- A select in an explicit transaction tries to prune a full heap page
and writes an FPI_FOR_HINT record that crosses a page boundary
- The select is rolled back (or killed)
- 'pg_ctl stop' is sent
The FPI_FOR_HINT record is likely to span pages, continuing as a
contrecord on a new page. However, as the select is rolled back,
XLogSetAsyncXactLSN isn't called to advance the LSN to include this record.
When the checkpointer starts ShutdownXLOG(), all walsenders will be
notified to stop. However, the logical replication walsender will be
stuck in the following infinite loop:
- Tries to read the last FPI_FOR_HINT record
- The page with the record header is read
- tot_len > len, the record needs to be reassembled
- Tries to read the next page containing the rest of the record. It fails since this page was never written.
- xlog reader state is reset with XLogReaderInvalReadState
- It goes back to the start of WalSndLoop's loop
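The condition behind this loop can be sketched with a small model. This is a simplification under stated assumptions, not the xlogreader code: ToyLsn, TOY_BLCKSZ and the toy_* helpers are hypothetical names, the page size is assumed to be 8192 bytes, and page headers on continuation pages are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for XLogRecPtr (hypothetical name). */
typedef uint64_t ToyLsn;

/* Assumed WAL page size for this model. */
#define TOY_BLCKSZ 8192

/* Number of bytes from `lsn` to the end of the page that contains it. */
static uint32_t
toy_bytes_left_on_page(ToyLsn lsn)
{
    return TOY_BLCKSZ - (uint32_t) (lsn % TOY_BLCKSZ);
}

/* True when the record does not fit on its first page (tot_len > len). */
static bool
toy_needs_reassembly(ToyLsn rec_start, uint32_t tot_len)
{
    assert(tot_len > 0);
    return tot_len > toy_bytes_left_on_page(rec_start);
}

/*
 * True when every byte of the record lies below `readable_upto`, the
 * point up to which WAL has been written out. Continuation page
 * headers are ignored to keep the model small.
 */
static bool
toy_record_readable(ToyLsn rec_start, uint32_t tot_len, ToyLsn readable_upto)
{
    assert(tot_len > 0);
    return rec_start + tot_len <= readable_upto;
}
```

For a record starting at 8000 with tot_len 8300, only 192 bytes fit on the first page, so the record needs reassembly. If WAL is only available up to 8192, the record is not readable and the reader retries. Once WAL is available up to 16300, the same read succeeds.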
There are some attempts done by the walsender to flush the WAL using
XLogBackgroundFlush. However, XLogBackgroundFlush only writes completed
blocks, or up to the latest known async lsn.
Since the select was rolled back, XLogBackgroundFlush doesn't flush the
next partial page.
This patch fixes the issue by flushing all records before
signaling the walsenders to stop, avoiding the case where the walsenders
read a partially written record.
---
src/backend/access/transam/xlog.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 354ac645bdc..31afb249d5b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6710,6 +6710,8 @@ GetLastSegSwitchData(XLogRecPtr *lastSwitchLSN)
void
ShutdownXLOG(int code, Datum arg)
{
+ XLogRecPtr WriteRqstPtr;
+
/*
* We should have an aux process resource owner to use, and we should not
* be in a transaction that's installed some other resowner.
@@ -6723,6 +6725,15 @@ ShutdownXLOG(int code, Datum arg)
ereport(IsPostmasterEnvironment ? LOG : NOTICE,
(errmsg("shutting down")));
+ /*
+ * We may have unflushed records, make sure everything is flushed before
+ * stopping the walsenders.
+ */
+ SpinLockAcquire(&XLogCtl->info_lck);
+ WriteRqstPtr = XLogCtl->LogwrtRqst.Write;
+ SpinLockRelease(&XLogCtl->info_lck);
+ XLogFlush(WriteRqstPtr);
+
/*
* Signal walsenders to move to stopping state.
*/
--
2.52.0
[application/octet-stream] v4-0001-Fix-stuck-shutdown-due-to-unflushed-records.patch (2.8K, 3-v4-0001-Fix-stuck-shutdown-due-to-unflushed-records.patch)
From a2310f65695d9481354100a503c92b7cd8255a2d Mon Sep 17 00:00:00 2001
From: Anthonin Bonnefoy <[email protected]>
Date: Tue, 24 Feb 2026 09:24:48 +0100
Subject: Fix stuck shutdown due to unflushed records
The shutdown sequence may get stuck indefinitely under the following
circumstances:
- Data checksums are enabled
- A logical replication walsender is running
- A select in an explicit transaction tries to prune a full heap page
and writes an FPI_FOR_HINT record that crosses a page boundary
- The select is rolled back (or killed)
- 'pg_ctl stop' is sent
The FPI_FOR_HINT record is likely to span pages, continuing as a
contrecord on a new page. However, as the select is rolled back,
XLogSetAsyncXactLSN isn't called to advance the LSN to include this record.
When the checkpointer starts ShutdownXLOG(), all walsenders will be
notified to stop. However, the logical replication walsender will be
stuck in the following infinite loop:
- Tries to read the last FPI_FOR_HINT record
- The page with the record header is read
- tot_len > len, the record needs to be reassembled
- Tries to read the next page containing the rest of the record. It fails since this page was never written.
- xlog reader state is reset with XLogReaderInvalReadState
- It goes back to the start of WalSndLoop's loop
There are some attempts done by the walsender to flush the WAL using
XLogBackgroundFlush. However, XLogBackgroundFlush only writes completed
blocks, or up to the latest known async lsn.
Since the select was rolled back, XLogBackgroundFlush doesn't flush the
next partial page.
This patch fixes the issue by advancing the async LSN, even when the
transaction doesn't have an assigned xid. This allows
XLogBackgroundFlush to write the necessary partial page when called by
the walsender.
---
src/backend/access/transam/xact.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index eba4f063168..1786b397769 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1786,8 +1786,19 @@ RecordTransactionAbort(bool isSubXact)
if (!TransactionIdIsValid(xid))
{
/* Reset XactLastRecEnd until the next transaction writes something */
- if (!isSubXact)
+ if (!isSubXact && XactLastRecEnd != 0)
+ {
+ /*
+ * Even if no xid was assigned, some records may have been written
+ * in the WAL. Report the latest async LSN, so that the WAL writer
+ * knows to flush those records. This is important when shutting
+ * down, walsender may use XLogBackgroundFlush to trigger pending
+ * WAL to be written out. If they're not tracked by async xact
+ * lsn, they won't be written by XLogBackgroundFlush.
+ */
+ XLogSetAsyncXactLSN(XactLastRecEnd);
XactLastRecEnd = 0;
+ }
return InvalidTransactionId;
}
--
2.52.0