From: Andrey Borodin <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Ayush Tiwari <[email protected]>
Cc: Radim Marek <[email protected]>
Cc: Marko Tiikkaja <[email protected]>
Cc: PostgreSQL mailing lists <[email protected]>
Subject: Re: BUG #19490: Streaming standby on 16.14 stops applying WAL on MultiXactOffsetSLRU when primary is 16.8
Date: Tue, 26 May 2026 23:29:58 +0500
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<CAL9smLBMxKBCmsA9UGcmf93bT2_MsZ+POH-oHREuwKdmMU7jfQ@mail.gmail.com>
<[email protected]>
<CAJgoLkJfFgL-V+pYB7=R81AbURTE6sMhzVHDQDhVGnfXRSJ9Wg@mail.gmail.com>
<CAJgoLkKCu0wCwPQZSo5no=XATU-4LMK4QfKBwV928o2uKcxe=g@mail.gmail.com>
<CAJTYsWU6tdEvVh4YKLxz7+amZ7+Wb7_s-FBjsMMeLNj1fKeSNg@mail.gmail.com>
<[email protected]>
<CAJTYsWWXvbBJe+WYJZcnoSTyVz9vk5ro3x2qAq_uvXvK2KwaMQ@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
> On 26 May 2026, at 17:28, Heikki Linnakangas <[email protected]> wrote:
>
> looks correct
I tested that change as follows.
I set up REL_16_0 as the primary and REL_16_STABLE as the standby, then
generated multixacts in a single session using savepoints:
BEGIN;
SELECT * FROM t WHERE i = 1 FOR NO KEY UPDATE;
-- repeat 2500 times:
SAVEPOINT a; SELECT * FROM t WHERE i = 1 FOR UPDATE; ROLLBACK TO a;
COMMIT;
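Typing the savepoint line 2500 times by hand is impractical, so here is a minimal sketch of a generator for the transaction above. It is an illustration rather than the script used in the test; the function name build_workload and its parameters are hypothetical, and the table t with column i is taken from the statements above.

```python
def build_workload(iterations: int = 2500, table: str = "t", key: int = 1) -> str:
    """Return the SQL text of one workload transaction.

    The transaction takes a FOR NO KEY UPDATE lock once, then repeats the
    SAVEPOINT / FOR UPDATE / ROLLBACK TO step `iterations` times.
    """
    lines = [
        "BEGIN;",
        f"SELECT * FROM {table} WHERE i = {key} FOR NO KEY UPDATE;",
    ]
    # One line per iteration, identical to the statement shown in the email.
    step = (
        f"SAVEPOINT a; SELECT * FROM {table} WHERE i = {key} FOR UPDATE; "
        "ROLLBACK TO a;"
    )
    lines.extend([step] * iterations)
    lines.append("COMMIT;")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file and running it with psql -f in a single session should reproduce the workload, assuming the table t already exists and contains a row with i = 1.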
Each iteration creates a new MultiXactId. 2500 iterations cross the SLRU page
boundary at multixact 2048 with some spare multis (we'll pickle the excess ones in
jars when all is fixed, toying with 2048 wasted dev cycles for no reason).
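The 2048 figure follows from the offsets SLRU layout: each multixact takes one 4-byte offset entry, so a page holds BLCKSZ / 4 entries. A small sketch of that arithmetic, assuming the default 8 kB block size and multixact IDs consumed sequentially on a fresh cluster (wraparound is ignored; the helper names are hypothetical):

```python
BLCKSZ = 8192        # default PostgreSQL block size (assumed, it is a build option)
OFFSET_SIZE = 4      # size of one MultiXactOffset entry, a 32-bit value
OFFSETS_PER_PAGE = BLCKSZ // OFFSET_SIZE


def offsets_page(multi: int) -> int:
    """Offsets SLRU page that holds the entry for the given multixact ID."""
    return multi // OFFSETS_PER_PAGE


def crosses_page(first: int, last: int) -> bool:
    """True if the range of multixact IDs first..last spans a page boundary."""
    return offsets_page(first) != offsets_page(last)
```

With these assumptions, the first 2500 multixacts end on page 1 and the next 2500 end on page 2, so each run of the workload crosses exactly one page boundary.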
Test:
0. Run the workload on REL_16_0 primary (2500 multixacts, crossing page 0->1)
1. Take pg_basebackup
2. Run the workload again (2500 more, crossing page 1->2)
3. Start the standby
I observe:
Without the change, the startup process deadlocks.
With the change, the standby catches up, and the DEBUG1 message "next offsets page is not
initialized, initializing it now" confirms that the compat block fires correctly.
I packaged this test into a buildfarm module (TestReplayXversion) [0] that
builds REL_x_0 and runs this check against the REL_x_STABLE build. It reproduces the deadlock
on 14, 15, and 16; 17 and 18 pass. I'm currently struggling to inject a regress WAL trace
into it, with no success so far. On the bright side, I managed to get PR number 42 in the
buildfarm client repo.
Best regards, Andrey Borodin.
[0] https://github.com/PGBuildFarm/client-code/pull/42