From: Andrey Borodin <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Sebastian Webber <[email protected]>
Cc: [email protected]
Cc: Andrey Borodin <[email protected]>
Cc: Álvaro Herrera <[email protected]>
Cc: Dmitry Yurichev <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Ivan Bykov <[email protected]>
Subject: Re: 17.8 standby crashes during WAL replay from 17.5 primary: "could not access status of transaction"
Date: Wed, 18 Feb 2026 13:58:03 +0500
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <CACV2tSw3VYS7d27ftO_cs+aF3M54+JwWBbqSGLcKoG9cvyb6EA@mail.gmail.com>
<[email protected]>
<CALdSSPhMhNzRRd-SeU0PTwKiGDpFOb5Yss7PWBPN3cHv6kW8eQ@mail.gmail.com>
<[email protected]>
> On 16 Feb 2026, at 21:01, Heikki Linnakangas <[email protected]> wrote:
>
> Andrey if you can verify with your TAP test, too, that'd be great.
Here's a hand-wavy test on top of REL_17_STABLE. It modifies the server code (multixact.c) to simulate the old WAL write behavior.
I tried to make the hack conditional with -DDEMO_SIMULATE_OLD_MULTIXACT_BEHAVIOR, but gave up and just hardcoded it.
We are not going to commit it, are we?
If we comment out this line (the patch does it)
pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
pageno);
the test will pass.
Otherwise it will hang indefinitely, because the standby's startup process fails during replay:
2026-02-18 13:44:12.238 +05 [52360] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
2026-02-18 13:44:12.250 +05 [52359] FATAL: could not access status of transaction 4096
2026-02-18 13:44:12.250 +05 [52359] DETAIL: Could not read from file "pg_multixact/offsets/0000" at offset 16384: read too few bytes.
2026-02-18 13:44:12.250 +05 [52359] CONTEXT: WAL redo at 0/30245E0 for MultiXact/CREATE_ID: 4095 offset 8189 nmembers 2: 4835 (sh) 4835 (upd)
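To make the numbers in that log concrete, here is a minimal sketch (not part of the patch) of the page arithmetic. It assumes the default 8 kB block size, 4-byte offsets (2048 per page, as the test comments state), and 32 pages per SLRU segment file. The helper names are invented for illustration and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumptions for illustration only: default 8 kB block size, 4-byte
 * multixact offsets, 32 pages per SLRU segment. All names are hypothetical.
 */
#define SKETCH_BLCKSZ            8192
#define SKETCH_OFFSETS_PER_PAGE  (SKETCH_BLCKSZ / (int) sizeof(uint32_t))
#define SKETCH_PAGES_PER_SEGMENT 32

/* Offsets page that holds the entry for a given multixact id. */
static int64_t
offset_page(uint32_t multi)
{
    return multi / SKETCH_OFFSETS_PER_PAGE;
}

/* Slot of that entry within its page. */
static int
offset_entry(uint32_t multi)
{
    return (int) (multi % SKETCH_OFFSETS_PER_PAGE);
}

/* Segment file number that contains a given page. */
static int64_t
segment_of_page(int64_t pageno)
{
    return pageno / SKETCH_PAGES_PER_SEGMENT;
}

/* Byte position of a page inside its segment file. */
static int64_t
byte_offset_in_segment(int64_t pageno)
{
    return (pageno % SKETCH_PAGES_PER_SEGMENT) * SKETCH_BLCKSZ;
}

/* Cross-check the sketch against the values quoted in the log. */
static void
check_log_numbers(void)
{
    assert(offset_page(4095) == 1);             /* CREATE_ID 4095 lives on page 1 */
    assert(offset_entry(4095) == 2047);         /* ... in the last slot of that page */
    assert(offset_page(4096) == 2);             /* next multi starts page 2 */
    assert(segment_of_page(2) == 0);            /* file "0000" */
    assert(byte_offset_in_segment(2) == 16384); /* "at offset 16384" */
}
```

Under these assumptions, multi 4095 is the last entry on page 1, so storing the next multi's offset requires page 2. That page starts at byte 16384 of segment 0000, which is the read that comes up short.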
The most hand-wavy part is test_multixact_write_truncate_wal(): the truncation is synthetic.
FWIW, a lot of the calculations and comments were done by an LLM. Let me know if that verbosity hurts readability.
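The replay sequence that the test drives can be modelled with a small sketch, based only on the scenario described in the test's comments. It is a toy model, not PostgreSQL code: the struct, the function names, and the simplification that a TRUNCATE_ID record touches nothing but latest_page_number are all assumptions, and page-zeroing WAL records are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_OFFSETS_PER_PAGE 2048   /* assumption: 8 kB pages, 4-byte offsets */
#define MODEL_MAX_PAGES        8      /* plenty for this scenario */

/* Hypothetical model of the standby's offsets SLRU state. */
typedef struct ReplayModel
{
    int64_t latest_page_number;
    bool    page_exists[MODEL_MAX_PAGES];
    bool    reset_on_truncate;        /* true models the pre-fix behavior */
} ReplayModel;

static int64_t
model_page_of(uint32_t multi)
{
    return multi / MODEL_OFFSETS_PER_PAGE;
}

/*
 * Replay one CREATE_ID. Returns false if replay would have to read a page
 * that was never initialized (the FATAL case in the log).
 */
static bool
replay_create_id(ReplayModel *m, uint32_t multi)
{
    int64_t pageno = model_page_of(multi);
    int64_t next_pageno = model_page_of(multi + 1);

    /* Init-next-page check: fires only if latest_page_number == pageno. */
    if (next_pageno != pageno && m->latest_page_number == pageno)
    {
        m->page_exists[next_pageno] = true;
        m->latest_page_number = next_pageno;
    }
    return m->page_exists[pageno] && m->page_exists[next_pageno];
}

/* Replay one TRUNCATE_ID; the model only tracks latest_page_number. */
static void
replay_truncate_id(ReplayModel *m, uint32_t end_trunc_off)
{
    if (m->reset_on_truncate)
        m->latest_page_number = model_page_of(end_trunc_off);
}

/* Steps 3-5 of the test scenario, as seen by a standby restored at step 2. */
static bool
run_scenario(bool reset_on_truncate)
{
    ReplayModel m = {0};

    m.reset_on_truncate = reset_on_truncate;
    m.page_exists[0] = true;                      /* page 0 came with the backup */
    m.latest_page_number = model_page_of(2047);   /* nextMXact at backup time */

    for (uint32_t multi = 2047; multi <= 4094; multi++)
        if (!replay_create_id(&m, multi))
            return false;

    replay_truncate_id(&m, 10);                   /* endTruncOff on page 0 */
    return replay_create_id(&m, 4095);            /* crosses page 1 -> 2 */
}
```

In this model, run_scenario(true) returns false because the reset leaves latest_page_number at 0 while multi 4095 sits on page 1, so page 2 is never initialized. run_scenario(false) returns true, which matches the expected outcome with the fix applied.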
Best regards, Andrey Borodin.
Attachments:
[application/octet-stream] 0001-Test-Multixact-truncation-near-page-boundary-replay-.patch (13.6K, 2-0001-Test-Multixact-truncation-near-page-boundary-replay-.patch)
From 465eb45cffab0f8503a66288246a0416a0702071 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Wed, 18 Feb 2026 10:11:26 +0500
Subject: [PATCH] Test Multixact truncation near page-boundary replay on
standby
Add a TAP test that reproduces the bug fixed by commit 4a36c89f165:
TRUNCATE_ID WAL replay resets latest_page_number, breaking the
init-next-page check in RecordNewMultiXact. When a page-crossing
CREATE_ID is replayed after a TRUNCATE_ID whose endTruncOff lands on
a different page, the standby startup process crashes with:
FATAL: could not access status of transaction ...
DETAIL: Could not read from file "pg_multixact/offsets/..." read too few bytes.
To trigger the bug reliably in a single-binary test, two additional
changes to multixact.c simulate WAL from older minor versions
(pre-8ba61bc063):
- ExtendMultiXactOffset(result) instead of (result + 1), so the
primary does not pre-zero the next page before the CREATE_ID.
- The "set next multixid's offset" block in RecordNewMultiXact is
skipped on the primary (!InRecovery) but kept during recovery,
so the standby still tries to read the next page.
A helper function test_multixact_write_truncate_wal() injects a
TRUNCATE_ID WAL record with a controlled endTruncOff, simulating
the concurrent truncation + multixact creation that occurs in
production.
Apply the fix
(0002-Don-t-reset-latest_page_number-when-replaying-multix.patch)
on top of this patch to verify the test passes.
---
src/backend/access/transam/multixact.c | 15 ++-
src/test/modules/test_slru/Makefile | 4 +-
src/test/modules/test_slru/meson.build | 6 +
.../t/002_multixact_truncation_replay.pl | 95 ++++++++++++++++
src/test/modules/test_slru/test_multixact.c | 105 ++++++++++++++++++
src/test/modules/test_slru/test_slru--1.0.sql | 7 ++
6 files changed, 228 insertions(+), 4 deletions(-)
create mode 100644 src/test/modules/test_slru/t/002_multixact_truncation_replay.pl
create mode 100644 src/test/modules/test_slru/test_multixact.c
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index c863e4e0556..e1fc55d0745 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -996,7 +996,15 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
/*
* Set the next multixid's offset to the end of this multixid's members.
+ *
+ * On the primary (!InRecovery), skip this to produce WAL without the next
+ * offset already set — simulating pre-8ba61bc063 behavior. During
+ * recovery, keep this code so the standby tries to read the next page,
+ * triggering the bug when the init-next-page check fails due to
+ * truncation resetting latest_page_number.
*/
+ if (InRecovery)
+ {
if (next_pageno == pageno)
{
next_offptr = offptr + 1;
@@ -1027,6 +1035,7 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
*next_offptr = next_offset;
MultiXactOffsetCtl->shared->page_dirty[slotno] = true;
}
+ }
/* Release MultiXactOffset SLRU lock. */
LWLockRelease(lock);
@@ -1227,7 +1236,7 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
* Make sure there is room for the next MXID in the file. Assigning this
* MXID sets the next MXID's offset already.
*/
- ExtendMultiXactOffset(result + 1);
+ ExtendMultiXactOffset(result);
/*
* Reserve the members space, similarly to above. Also, be careful not to
@@ -3603,8 +3612,8 @@ multixact_redo(XLogReaderState *record)
* SimpleLruTruncate.
*/
pageno = MultiXactIdToOffsetPage(xlrec.endTruncOff);
- pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
- pageno);
+ // pg_atomic_write_u64(&MultiXactOffsetCtl->shared->latest_page_number,
+ // pageno);
PerformOffsetsTruncation(xlrec.startTruncOff, xlrec.endTruncOff);
LWLockRelease(MultiXactTruncationLock);
diff --git a/src/test/modules/test_slru/Makefile b/src/test/modules/test_slru/Makefile
index 936886753b7..8870e49da85 100644
--- a/src/test/modules/test_slru/Makefile
+++ b/src/test/modules/test_slru/Makefile
@@ -3,7 +3,8 @@
MODULE_big = test_slru
OBJS = \
$(WIN32RES) \
- test_slru.o
+ test_slru.o \
+ test_multixact.o
PGFILEDESC = "test_slru - test module for SLRUs"
EXTENSION = test_slru
@@ -11,6 +12,7 @@ DATA = test_slru--1.0.sql
REGRESS_OPTS = --temp-config $(top_srcdir)/src/test/modules/test_slru/test_slru.conf
REGRESS = test_slru
+TAP_TESTS = 1
# Disabled because these tests require "shared_preload_libraries=test_slru",
# which typical installcheck users do not have (e.g. buildfarm clients).
NO_INSTALLCHECK = 1
diff --git a/src/test/modules/test_slru/meson.build b/src/test/modules/test_slru/meson.build
index ce91e606313..f589b3ec358 100644
--- a/src/test/modules/test_slru/meson.build
+++ b/src/test/modules/test_slru/meson.build
@@ -2,6 +2,7 @@
test_slru_sources = files(
'test_slru.c',
+ 'test_multixact.c',
)
if host_system == 'windows'
@@ -32,4 +33,9 @@ tests += {
'regress_args': ['--temp-config', files('test_slru.conf')],
'runningcheck': false,
},
+ 'tap': {
+ 'tests': [
+ 't/002_multixact_truncation_replay.pl',
+ ],
+ },
}
diff --git a/src/test/modules/test_slru/t/002_multixact_truncation_replay.pl b/src/test/modules/test_slru/t/002_multixact_truncation_replay.pl
new file mode 100644
index 00000000000..4a4140e8bd2
--- /dev/null
+++ b/src/test/modules/test_slru/t/002_multixact_truncation_replay.pl
@@ -0,0 +1,95 @@
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+# Test multixact SLRU truncation replay on standby.
+#
+# Reproduces the bug fixed by commit 4a36c89f165: during TRUNCATE_ID replay,
+# latest_page_number was reset to MultiXactIdToOffsetPage(endTruncOff). This
+# broke the init-next-page check in RecordNewMultiXact, which compares
+# latest_page_number == pageno. If a CREATE_ID that crosses a page boundary
+# is replayed AFTER a TRUNCATE_ID whose endTruncOff is on a different page,
+# the init check doesn't fire, the next page isn't initialized, and
+# SimpleLruReadPage fails with FATAL.
+#
+# The test uses test_multixact_write_truncate_wal() to inject a TRUNCATE_ID
+# WAL record with endTruncOff on page 0, placed between two batches of
+# CREATE_IDs. This simulates the real-world scenario where truncation runs
+# concurrently with multixact creation.
+#
+# To produce WAL without a pre-zeroed next page (as older minor versions did
+# before 8ba61bc063), two changes in multixact.c are required:
+# - ExtendMultiXactOffset(result) instead of (result + 1)
+# - next-offset write in RecordNewMultiXact skipped on primary
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+# MULTIXACT_OFFSETS_PER_PAGE = BLCKSZ/4 = 2048 (for 8kB blocks).
+#
+# Scenario:
+# 1. Create 2046 multixacts (multis 1..2046). nextMXact = 2047, page 0.
+# 2. Take backup.
+# 3. Create 2048 MORE multixacts (2047..4094). Multi 2047 crosses page 0->1.
+# 4. Inject TRUNCATE_ID with endTruncOff = 10 (page 0).
+# 5. Create 1 more multixact (4095), last entry on page 1, crossing to page 2.
+#
+# On the standby:
+# - StartupMultiXact sets latest_page_number = page(2047) = 0
+# - CREATE_ID(2047) crosses page 0->1: init check fires (0==0), zeros page 1,
+# latest_page_number updated to 1 by SimpleLruZeroPage
+# - CREATE_IDs for 2048..4094 on page 1 (no crossings)
+# - TRUNCATE_ID(endTruncOff=10): latest_page_number reset to page(10) = 0
+# - CREATE_ID(4095): pageno=1, next_pageno=2
+# Init check: latest_page_number(0) != pageno(1) -> SKIP
+# RecordNewMultiXact tries SimpleLruReadPage(page 2) -> FATAL
+#
+# With the fix (not resetting latest_page_number in TRUNCATE_ID replay):
+# - latest_page_number stays 1 after the page 0->1 crossing
+# - Init check at CREATE_ID(4095): latest_page_number(1) == pageno(1) -> fires
+# - Page 2 is initialized -> replay succeeds
+
+my $node_primary = PostgreSQL::Test::Cluster->new('main');
+$node_primary->init(allows_streaming => 'physical');
+$node_primary->append_conf('postgresql.conf',
+ "shared_preload_libraries = 'test_slru'");
+$node_primary->start;
+$node_primary->safe_psql('postgres', q(CREATE EXTENSION test_slru));
+
+# Fill page 0: multis 1..2046, nextMXact = 2047
+$node_primary->safe_psql('postgres', q{SELECT test_create_multixacts(2046)});
+
+$node_primary->backup('mx_backup');
+
+# Fill page 1: multis 2047..4094, nextMXact = 4095.
+# Multi 2047 crosses page 0->1; on the standby the init check zeros page 1.
+$node_primary->safe_psql('postgres', q{SELECT test_create_multixacts(2048)});
+
+# Inject TRUNCATE_ID with endTruncOff on page 0.
+# On the standby this resets latest_page_number from 1 back to 0.
+$node_primary->safe_psql('postgres',
+ q{SELECT test_multixact_write_truncate_wal('10'::xid)});
+
+# Create multi 4095 (page 1, entry 2047) which crosses to page 2.
+# Without the fix the standby crashes here: latest_page_number(0) != pageno(1).
+$node_primary->safe_psql('postgres', q{SELECT test_create_multixact()});
+$node_primary->safe_psql('postgres', q{SELECT pg_switch_wal()});
+
+my $node_standby = PostgreSQL::Test::Cluster->new('standby');
+$node_standby->init_from_backup($node_primary, 'mx_backup',
+ has_streaming => 1);
+$node_standby->start;
+
+my $primary_lsn = $node_primary->lsn('flush');
+my $replayed = $node_standby->poll_query_until('postgres',
+ qq{SELECT '$primary_lsn'::pg_lsn <= pg_last_wal_replay_lsn()});
+
+ok($replayed, "standby replayed TRUNCATE_ID + page-crossing CREATE_ID");
+
+$node_standby->stop if $replayed;
+$node_primary->stop;
+
+done_testing();
diff --git a/src/test/modules/test_slru/test_multixact.c b/src/test/modules/test_slru/test_multixact.c
new file mode 100644
index 00000000000..e2f6f6a738d
--- /dev/null
+++ b/src/test/modules/test_slru/test_multixact.c
@@ -0,0 +1,105 @@
+/*-------------------------------------------------------------------------
+ *
+ * test_multixact.c
+ * Support code for multixact testing
+ *
+ * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/test/modules/test_slru/test_multixact.c
+ *
+ * -------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/multixact.h"
+#include "access/xact.h"
+#include "access/xlog.h"
+#include "access/xloginsert.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "utils/pg_lsn.h"
+
+PG_FUNCTION_INFO_V1(test_create_multixact);
+PG_FUNCTION_INFO_V1(test_create_multixacts);
+PG_FUNCTION_INFO_V1(test_multixact_write_truncate_wal);
+
+/*
+ * Produces multixact with 2 current xids
+ */
+Datum
+test_create_multixact(PG_FUNCTION_ARGS)
+{
+ MultiXactId id;
+
+ MultiXactIdSetOldestMember();
+ id = MultiXactIdCreate(GetCurrentTransactionId(), MultiXactStatusUpdate,
+ GetCurrentTransactionId(), MultiXactStatusForShare);
+ PG_RETURN_TRANSACTIONID(id);
+}
+
+/*
+ * Create n multixacts. Used to quickly fill offset pages for truncation tests.
+ *
+ * Each iteration uses a subtransaction so that GetCurrentTransactionId()
+ * returns a different xid, preventing mXactCacheGetBySet from returning a
+ * cached result and ensuring a new MultiXactId is allocated every time.
+ */
+Datum
+test_create_multixacts(PG_FUNCTION_ARGS)
+{
+ int32 n = PG_GETARG_INT32(0);
+ MultiXactId first_id = InvalidMultiXactId;
+
+ if (n <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("n must be positive")));
+
+ for (int i = 0; i < n; i++)
+ {
+ MultiXactId id;
+
+ BeginInternalSubTransaction(NULL);
+ MultiXactIdSetOldestMember();
+ id = MultiXactIdCreate(GetCurrentTransactionId(), MultiXactStatusUpdate,
+ GetCurrentTransactionId(), MultiXactStatusForShare);
+ ReleaseCurrentSubTransaction();
+
+ if (i == 0)
+ first_id = id;
+ }
+
+ PG_RETURN_TRANSACTIONID(first_id);
+}
+
+/*
+ * Write a TRUNCATE_ID WAL record with the given endTruncOff.
+ *
+ * This is used to simulate a truncation that sets latest_page_number to a
+ * specific page during standby replay, without actually truncating anything
+ * on the primary. The standby's multixact_redo handler will reset
+ * latest_page_number = MultiXactIdToOffsetPage(endTruncOff).
+ */
+Datum
+test_multixact_write_truncate_wal(PG_FUNCTION_ARGS)
+{
+ MultiXactId endTruncOff = PG_GETARG_TRANSACTIONID(0);
+ xl_multixact_truncate xlrec;
+ XLogRecPtr recptr;
+
+ xlrec.oldestMultiDB = MyDatabaseId;
+ xlrec.startTruncOff = 1;
+ xlrec.endTruncOff = endTruncOff;
+ xlrec.startTruncMemb = 0;
+ xlrec.endTruncMemb = 0;
+
+ XLogBeginInsert();
+ XLogRegisterData((char *) &xlrec, SizeOfMultiXactTruncate);
+ recptr = XLogInsert(RM_MULTIXACT_ID, XLOG_MULTIXACT_TRUNCATE_ID);
+ XLogFlush(recptr);
+
+ PG_RETURN_LSN(recptr);
+}
diff --git a/src/test/modules/test_slru/test_slru--1.0.sql b/src/test/modules/test_slru/test_slru--1.0.sql
index 202e8da3fde..0d6271473bf 100644
--- a/src/test/modules/test_slru/test_slru--1.0.sql
+++ b/src/test/modules/test_slru/test_slru--1.0.sql
@@ -19,3 +19,10 @@ CREATE OR REPLACE FUNCTION test_slru_page_truncate(bigint) RETURNS VOID
AS 'MODULE_PATHNAME', 'test_slru_page_truncate' LANGUAGE C;
CREATE OR REPLACE FUNCTION test_slru_delete_all() RETURNS VOID
AS 'MODULE_PATHNAME', 'test_slru_delete_all' LANGUAGE C;
+
+CREATE OR REPLACE FUNCTION test_create_multixact() RETURNS xid
+ AS 'MODULE_PATHNAME', 'test_create_multixact' LANGUAGE C;
+CREATE OR REPLACE FUNCTION test_create_multixacts(int) RETURNS xid
+ AS 'MODULE_PATHNAME', 'test_create_multixacts' LANGUAGE C;
+CREATE OR REPLACE FUNCTION test_multixact_write_truncate_wal(xid) RETURNS pg_lsn
+ AS 'MODULE_PATHNAME', 'test_multixact_write_truncate_wal' LANGUAGE C;
--
2.51.2