From: 段坤仁(刻韧) <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: x4mmm <[email protected]>
Subject: Re: Bug in MultiXact replay compat logic for older minor version after crash-recovery
Date: Sun, 22 Mar 2026 21:09:05 +0800
Message-ID: <76f70088-e2ee-4b17-9e12-fa89f2a08393.duankunren.dkr@alibaba-inc.com>
In-Reply-To: <[email protected]>
References: <[email protected]>

Thanks for the v2 patch.
    
On 20/03/2026 16:19, Heikki Linnakangas wrote:
> it means that tracking the latest page we have zeroed is not merely
> an optimization to avoid excessive SimpleLruDoesPhysicalPageExist()
> calls, it's needed for correctness.
    
Agreed.
    
On 20/03/2026 18:14, Heikki Linnakangas wrote:
> I also added another safety measure: before calling
> SimpleLruDoesPhysicalPageExist(), flush all the SLRU buffers.
    
This is more robust than scanning the SLRU buffers first and only
calling SimpleLruDoesPhysicalPageExist() on a miss, which would
rely on the SLRU eviction invariant.
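
To illustrate why the flush matters, here is a toy model, not PostgreSQL
code. It assumes the physical-existence check looks only at what is on
disk, so a page that was zeroed in a buffer but not yet written would be
reported as missing. All Toy* and toy_* names are hypothetical.

```c
#include <stdbool.h>

#define TOY_PAGES 8

/* Toy stand-in for an SLRU: a page can be dirty in a buffer, on disk, or both. */
typedef struct
{
    bool on_disk[TOY_PAGES];
    bool dirty_in_buffer[TOY_PAGES];
} ToySlru;

/* Zeroing a page only touches the buffer; nothing reaches disk yet. */
static void
toy_zero_page(ToySlru *slru, int page)
{
    slru->dirty_in_buffer[page] = true;
}

/* Write every dirty buffer out, as the flush step in the patch is meant to do. */
static void
toy_flush_all(ToySlru *slru)
{
    for (int page = 0; page < TOY_PAGES; page++)
    {
        if (slru->dirty_in_buffer[page])
        {
            slru->on_disk[page] = true;
            slru->dirty_in_buffer[page] = false;
        }
    }
}

/* Assumed behaviour of the physical check: it sees disk state only. */
static bool
toy_physical_page_exists(const ToySlru *slru, int page)
{
    return slru->on_disk[page];
}

/* Flush first, then ask: the answer no longer depends on eviction timing. */
static bool
toy_page_exists_after_flush(ToySlru *slru, int page)
{
    toy_flush_all(slru);
    return toy_physical_page_exists(slru, page);
}
```

In this model, a page zeroed but not yet flushed is invisible to
toy_physical_page_exists(), while toy_page_exists_after_flush() reports
it, so replay would not zero it a second time.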
    
I walked through the scenarios I could think of. Let N be the last
multixid on offset page P, so N+1 falls on page P+1.
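
For concreteness, the page arithmetic behind "N is the last multixid on
page P" can be sketched as below. The constant 2048 is an assumption
(8 kB pages holding 4-byte offsets), and the helper names are
hypothetical rather than PostgreSQL's own macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8192-byte pages with 4-byte offset entries. */
#define OFFSETS_PER_PAGE 2048u

/* Offsets page that holds the entry for a given multixid. */
static int64_t
offsets_page_of(uint32_t multi)
{
    return (int64_t) (multi / OFFSETS_PER_PAGE);
}

/* True when multi occupies the last slot of its page. */
static bool
is_last_on_page(uint32_t multi)
{
    return (multi % OFFSETS_PER_PAGE) == OFFSETS_PER_PAGE - 1;
}

/*
 * Page that multi + 1 falls on, for a multi that is last on its page.
 * Multixid wraparound is ignored in this sketch.
 */
static int64_t
page_started_by_successor(uint32_t multi)
{
    assert(is_last_on_page(multi));
    return offsets_page_of(multi) + 1;
}
```

Under these assumptions, N = 2047 is the last multixid on page P = 0,
and N+1 = 2048 is the first entry on page P+1 = 1.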
    
(a) Old-version WAL (CREATE_ID:N before ZERO_OFF_PAGE:P+1):
    last_initialized_offsets_page = P from earlier ZERO_OFF_PAGE.
    init_needed = (P == P) = true -> init P+1. Correct.
    Later ZERO_OFF_PAGE:P+1 is skipped via pre_initialized_offsets_page.
    
(b) Crash-restart, page P+1 not on disk (the original bug):
    last_initialized_offsets_page = -1, fallback path fires.
    SimpleLruDoesPhysicalPageExist(P+1) = false -> init. Correct.
    
(c) Crash-restart, page P+1 already on disk:
    Same fallback, SimpleLruDoesPhysicalPageExist(P+1) = true -> skip.
    last_initialized_offsets_page stays -1 until the next
    ZERO_OFF_PAGE record, which switches replay back to the fast path.
    
(d) Out-of-order CREATE_IDs (ZERO_OFF_PAGE:P+1 -> CREATE_ID:N+1 ->
    CREATE_ID:N+2 -> CREATE_ID:N):
    N+1 and N+2 don't cross a page boundary, compat logic not entered.
    CREATE_ID:N: init_needed = (P+1 == P) = false -> skip.
    Page P+1 is not re-zeroed, data from N+1/N+2 preserved.
    
(e) Consecutive page crossings (N last on page P, later M last on page P+1):
    After init of P+1: last_initialized_offsets_page = P+1.
    CREATE_ID:M: init_needed = (P+1 == P+1) = true -> init P+2.
    Tracking advances monotonically across page boundaries.
    
The logic looks correct to me in all the cases above.
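
The decision rule exercised by scenarios (a) through (e) can be modelled
roughly as below. This is a sketch of the walkthrough, not the patch
itself: ReplayState and both functions are hypothetical names, the
caller is assumed to have already flushed the SLRU buffers before
supplying next_page_on_disk, and the pre_initialized_offsets_page
skip from (a) is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* -1 means "unknown", e.g. right after a crash-restart. */
typedef struct
{
    int64_t last_initialized_offsets_page;
} ReplayState;

/* Replaying ZERO_OFF_PAGE:page records that page as the latest one zeroed. */
static void
replay_zero_off_page(ReplayState *state, int64_t page)
{
    state->last_initialized_offsets_page = page;
}

/*
 * Replaying CREATE_ID for the last multixid on 'page': decide whether
 * page + 1 must be initialized here. next_page_on_disk stands in for the
 * result of SimpleLruDoesPhysicalPageExist(page + 1) and is consulted
 * only on the fallback path.
 */
static bool
replay_page_crossing(ReplayState *state, int64_t page, bool next_page_on_disk)
{
    bool init_needed;

    if (state->last_initialized_offsets_page == -1)
        init_needed = !next_page_on_disk;       /* fallback path: (b), (c) */
    else
        init_needed = (state->last_initialized_offsets_page == page);

    if (init_needed)
        state->last_initialized_offsets_page = page + 1;

    return init_needed;
}
```

With P = 5, for instance, starting from -1 with page 6 missing gives
init_needed = true and advances the tracking to 6, after which a
crossing on page 6 initializes page 7, matching (b) followed by (e).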
    
Regards,
Duan

