From: Heikki Linnakangas <[email protected]>
To: Sebastian Webber <[email protected]>
To: [email protected]
Cc: Andrey Borodin <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: Dmitry Yurichev <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Ivan Bykov <[email protected]>
Cc: Kirill Reshke <[email protected]>
Subject: Re: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"
Date: Sat, 14 Feb 2026 13:42:02 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CACV2tSw3VYS7d27ftO_cs+aF3M54+JwWBbqSGLcKoG9cvyb6EA@mail.gmail.com>
References: <CACV2tSw3VYS7d27ftO_cs+aF3M54+JwWBbqSGLcKoG9cvyb6EA@mail.gmail.com>
On 13/02/2026 22:31, Sebastian Webber wrote:
> PostgreSQL version: 17.8 (standby), 17.5 (primary)
>
> Primary: PostgreSQL 17.5 (Debian 17.5-1.pgdg130+1) on
> aarch64-unknown-linux-gnu
> Standby: PostgreSQL 17.8 (Debian 17.8-1.pgdg13+1) on
> aarch64-unknown-linux-gnu
>
> Platform: Docker containers on macOS (Apple Silicon / aarch64), Docker
> Desktop
>
>
> Description
> -----------
>
> A PostgreSQL 17.8 standby crashes during WAL replay when streaming
> from a 17.5 primary. The crash occurs after replaying a
> MultiXact/TRUNCATE_ID record followed by a MultiXact/CREATE_ID
> record.
Thanks for the report, I can repro it with your script. It is indeed a
regression introduced in the latest minor release, in the logic to
replay multixact WAL generated on older minor versions. (Commit
8ba61bc063). Adding the folks from the thread that led to that commit.
The commit added this in RecordNewMultiXact():
> /*
> * Older minor versions didn't set the next multixid's offset in this
> * function, and therefore didn't initialize the next page until the next
> * multixid was assigned. If we're replaying WAL that was generated by
> * such a version, the next page might not be initialized yet. Initialize
> * it now.
> */
> 	if (InRecovery &&
> 		next_pageno != pageno &&
> 		pg_atomic_read_u64(&MultiXactOffsetCtl->shared->latest_page_number) == pageno)
> 	{
> 		elog(DEBUG1, "next offsets page is not initialized, initializing it now");
The idea is that if the next offset falls on a different page
(next_pageno != pageno), and we have not yet initialized the next page
(pg_atomic_read_u64(&MultiXactOffsetCtl->shared->latest_page_number) ==
pageno), we initialize it now. However, that last check goes wrong after
a truncation record is replayed. Replaying a truncation record does this:
>
> /*
> * During XLOG replay, latest_page_number isn't necessarily set up
> * yet; insert a suitable value to bypass the sanity test in
> * SimpleLruTruncate.
> */
> 	pageno = MultiXactIdToOffsetPage(xlrec.endTruncOff);
> 	pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
> 						pageno);
Thanks to that, latest_page_number moves backwards to a much older page
number. That breaks the "was the next offset page already initialized?"
test in RecordNewMultiXact().
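
To make the interaction concrete, here is a minimal standalone model of the two code paths, not PostgreSQL code. The struct, the model_* function names and MODEL_OFFSETS_PER_PAGE are assumptions made up for this sketch; only the shape of the condition and of the truncation-time assignment is taken from the snippets quoted above. Multixid wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the offsets SLRU shared state. */
typedef struct
{
	int64_t		latest_page_number;
} OffsetsSlruModel;

/* Assumed number of multixid offsets per page; used by this model only. */
#define MODEL_OFFSETS_PER_PAGE 2048

static int64_t
model_multi_to_page(uint32_t multi)
{
	return multi / MODEL_OFFSETS_PER_PAGE;
}

/*
 * Same shape as the check quoted from RecordNewMultiXact(): true means
 * "the next offsets page is not initialized yet, initialize it now".
 */
static bool
model_next_page_needs_init(const OffsetsSlruModel *slru, uint32_t multi)
{
	int64_t		pageno = model_multi_to_page(multi);
	int64_t		next_pageno = model_multi_to_page(multi + 1);

	return next_pageno != pageno && slru->latest_page_number == pageno;
}

/* Same shape as the truncation replay code that resets latest_page_number. */
static void
model_replay_truncate(OffsetsSlruModel *slru, uint32_t end_trunc_off)
{
	slru->latest_page_number = model_multi_to_page(end_trunc_off);
}

/* Returns true if the model shows the check flipping after a truncation. */
static bool
model_demonstrates_regression(void)
{
	OffsetsSlruModel slru = {.latest_page_number = 5};
	uint32_t	last_multi_on_page_5 = 6 * MODEL_OFFSETS_PER_PAGE - 1;
	bool		before;
	bool		after;

	assert(model_multi_to_page(last_multi_on_page_5) == 5);

	/* Without a truncation, the check sees that page 6 must be initialized. */
	before = model_next_page_needs_init(&slru, last_multi_on_page_5);

	/* A TRUNCATE_ID replay moves latest_page_number back to page 2. */
	model_replay_truncate(&slru, 2 * MODEL_OFFSETS_PER_PAGE);

	/* Now the same check is false, so page 6 would be left uninitialized. */
	after = model_next_page_needs_init(&slru, last_multi_on_page_5);

	return before && !after;
}
```

In this model, model_demonstrates_regression() returns true: the same CREATE_ID record is handled differently depending on whether a truncation record was replayed just before it.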
I don't understand why that "bypass the sanity check" is needed. As far
as I can see, latest_page_number is tracked accurately during WAL
replay, and should already be set up. It's initialized in
StartupMultiXact(), and updated whenever the next page is initialized.
That bypass was introduced a long time ago, in commit 4f627f8973, which
in turn was backpatched and had to deal with WAL that was generated
before that commit. I suspect it was necessary back then, for backwards
compatibility, but isn't necessary any more. Hence, I propose to remove
that "bypass the sanity check" code (attached). Does anyone see a
scenario where latest_page_number might not be set correctly?
If we want to play it even more safe -- and I guess that's the right
thing to do for backpatching -- we could set latest_page_number
*temporarily* while we do the truncation, and restore the old value
afterwards.
The attached patch fixes the bug. With this fix, you can replay WAL
that's already been generated.
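
For reference, the more conservative "set it temporarily and restore it" variant mentioned above could look roughly like the following standalone model. This is a sketch under assumptions, not the attached patch: TruncSlruModel and the model_* names are hypothetical, and real code would go through pg_atomic_read_u64() and pg_atomic_write_u64() on MultiXactOffsetCtl->shared->latest_page_number and call PerformOffsetsTruncation() where the model only records the value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the offsets SLRU shared state. */
typedef struct
{
	int64_t		latest_page_number;
} TruncSlruModel;

/*
 * Override latest_page_number only for the duration of the truncation,
 * then put the old value back, so later "is the next page initialized?"
 * checks still see the real latest page.
 *
 * *seen_during_truncation receives the value the truncation step would
 * have observed; it exists only so the model can be inspected.
 */
static void
model_truncate_with_temporary_override(TruncSlruModel *slru,
									   int64_t cutoff_page,
									   int64_t *seen_during_truncation)
{
	int64_t		saved = slru->latest_page_number;

	/* Temporarily satisfy the sanity test, as the old code did. */
	slru->latest_page_number = cutoff_page;

	/* Stand-in for the actual truncation call. */
	*seen_during_truncation = slru->latest_page_number;

	/* Restore the accurately tracked value. */
	slru->latest_page_number = saved;
}
```

The point of the design is that the override is invisible to anything that runs after the truncation record has been replayed.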
- Heikki
Attachments:
[text/x-patch] 0001-Don-t-reset-latest_page_number-when-replaying-multix.patch (1.9K, 2-0001-Don-t-reset-latest_page_number-when-replaying-multix.patch)
From 59556e5b24f7973b857e54e6fcd136d401c9ff0f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Sat, 14 Feb 2026 13:30:03 +0200
Subject: [PATCH 1/1] Don't reset 'latest_page_number' when replaying multixid
truncation
'latest_page_number' is set to the correct value, according to
nextOffset, early at system startup. Contrary to the comment, it hence
should be set up correctly by the time we get to WAL replay.
This fixes a failure to replay WAL generated on older minor versions,
before commit 789d65364c (18.2, 17.8, 16.12, 15.16, 14.21).
Discussion: https://www.postgresql.org/message-id/[email protected];lightning.p46.dedyn.io
---
src/backend/access/transam/multixact.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index c863e4e0556..e45ec0d7247 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -3571,7 +3571,6 @@ multixact_redo(XLogReaderState *record)
else if (info == XLOG_MULTIXACT_TRUNCATE_ID)
{
xl_multixact_truncate xlrec;
- int64 pageno;
memcpy(&xlrec, XLogRecGetData(record),
SizeOfMultiXactTruncate);
@@ -3596,15 +3595,6 @@ multixact_redo(XLogReaderState *record)
SetMultiXactIdLimit(xlrec.endTruncOff, xlrec.oldestMultiDB, false);
PerformMembersTruncation(xlrec.startTruncMemb, xlrec.endTruncMemb);
-
- /*
- * During XLOG replay, latest_page_number isn't necessarily set up
- * yet; insert a suitable value to bypass the sanity test in
- * SimpleLruTruncate.
- */
- pageno = MultiXactIdToOffsetPage(xlrec.endTruncOff);
- pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
- pageno);
PerformOffsetsTruncation(xlrec.startTruncOff, xlrec.endTruncOff);
LWLockRelease(MultiXactTruncationLock);
--
2.47.3