public inbox for [email protected]  
From: Heikki Linnakangas <[email protected]>
To: Maxim Orlov <[email protected]>
Cc: wenhui qiu <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Cc: Postgres hackers <[email protected]>
Subject: Re: POC: make mxidoff 64 bits
Date: Thu, 30 Oct 2025 11:10:43 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <CACG=ezaABYDepYf24MUNxc2oHRERxXbXHNMP+i-Pr1AXu26x0A@mail.gmail.com>
References: <CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com>
	<CACG=ezYiLzCSo43uTPzAeq8ZCnGSkAsw061=oMyw5J1NUZ9Jwg@mail.gmail.com>
	<CAExHW5soKc9mhLhroi__yrPD-ymkFbz=e5hyZ34iqjM-cdK9_g@mail.gmail.com>
	<CACG=ezaLdhpFY9_qr7B6s7kkg_=s5S_ZD2=dsSeBNgdWWWuKbg@mail.gmail.com>
	<CACG=ezaFoCx4XEGi4gQWrusYD81LTLTKBUdgs-mEjNwzVarRnw@mail.gmail.com>
	<CACG=ezarNdLaTdr3fdWGyVJEzRAa=Vd5LLmu7VN2r7XL6LH8xA@mail.gmail.com>
	<CAPpHfdtPybyMYBj-x3-Z5=4bj_vhYk2R0nezfy=Vjcz4QBMDgw@mail.gmail.com>
	<CACG=ezaCc4bFfua-VA1NB6wppMPwuMmZiGdrkb-iYK9ZmQa6gg@mail.gmail.com>
	<CAGjGUAJUmSFMunCcK8DXcjLrs2Hfk2kFiaWDTc6ti03S8Echmw@mail.gmail.com>
	<CACG=ezaCKR+--O2TZrm3jYGgpDjdpLXdz3marG6t_=nzP5+Gog@mail.gmail.com>
	<CAGjGUA+uxpbkRaxkZmcSNiGXJ_G3Zj4-gzaiy=94DH7rvE8tig@mail.gmail.com>
	<CACG=ezbPUASDL1eJ+c-ZkJMwRPukvp3EL0q1vSUa1h+fnX8y3g@mail.gmail.com>
	<[email protected]>
	<CACG=ezaABYDepYf24MUNxc2oHRERxXbXHNMP+i-Pr1AXu26x0A@mail.gmail.com>

On 30/10/2025 08:13, Maxim Orlov wrote:
> On Tue, 28 Oct 2025 at 17:17, Heikki Linnakangas <[email protected]> wrote:
> 
>     On 27/10/2025 17:54, Maxim Orlov wrote:
> 
> 
>     If backend C looks up multixid 101 in between steps 3 and 4, it would
>     read the offset incorrectly, because 'base' isn't set yet.
> 
> Hmm, maybe I miss something? We set page base on first write of any
> offset on the page, not only the first one. In other words, there
> should never be a case when we read an offset without a previously
> defined page base. Correct me if I'm wrong:
> 1. Backend A assigned mxact=100, offset=1000.
> 2. Backend B assigned mxact=101, offset=1010.
> 3. Backend B calls RecordNewMultiXact()/MXOffsetWrite() and
>      set page base=1010, offset plus 0^0x80000000 bit while
>      holding lock on the page.
> 4. Backend C looks up for the mxact=101 by calling MXOffsetRead()
>      and should get exactly what he's looking for:
>      base (1010) + offset (0) minus 0x80000000 bit.
> 5. Backend A calls RecordNewMultiXact() and sets his offset using
>      existing base from step 3.

Oh I see, the 'base' is not necessarily the base offset of the first 
multixact on the page, it's the base offset of the first multixid that 
is written to the page. And the (short) offsets can be negative. That's 
a frighteningly clever encoding scheme. One upshot of that is that WAL 
redo might construct the page with a different 'base'. I guess that 
works, but it scares me. Could we come up with a more deterministic scheme?
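
A minimal sketch of the scheme as described above, to make the concern
concrete. This is not the patch's code: the OffsetPage struct, the
page_write()/page_read() helpers, the tiny page size and the exact bit
layout (bit 31 as a "slot written" flag, low 31 bits holding a signed
delta from the base) are all assumptions made for illustration. Only
the idea comes from the thread: the base is taken from whichever offset
is written to the page first, so later entries can be negative relative
to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ENTRIES_PER_PAGE 8            /* hypothetical, tiny on purpose */
#define ENTRY_SET_BIT    0x80000000u  /* assumed "slot written" flag */

/* Hypothetical page: one 64-bit base plus 32-bit per-multixid entries. */
typedef struct OffsetPage
{
    bool     base_set;
    uint64_t base;
    uint32_t entry[ENTRIES_PER_PAGE];
} OffsetPage;

/* Store a 64-bit offset; the first write on the page fixes the base. */
static void
page_write(OffsetPage *page, int slot, uint64_t offset)
{
    int64_t delta;

    assert(slot >= 0 && slot < ENTRIES_PER_PAGE);
    if (!page->base_set)
    {
        page->base = offset;
        page->base_set = true;
    }
    /* May be negative if an older multixid is written later. */
    delta = (int64_t) offset - (int64_t) page->base;
    /* Keep the low 31 bits (two's complement) and set the flag bit. */
    page->entry[slot] = ((uint32_t) delta & ~ENTRY_SET_BIT) | ENTRY_SET_BIT;
}

/* Read a slot back; returns false if the slot was never written. */
static bool
page_read(const OffsetPage *page, int slot, uint64_t *offset)
{
    uint32_t raw;
    uint32_t low;
    int64_t  delta;

    assert(slot >= 0 && slot < ENTRIES_PER_PAGE);
    raw = page->entry[slot];
    if ((raw & ENTRY_SET_BIT) == 0)
        return false;
    low = raw & ~ENTRY_SET_BIT;
    /* Sign-extend the 31-bit delta. */
    delta = (low & 0x40000000u) ? (int64_t) low - ((int64_t) 1 << 31)
                                : (int64_t) low;
    *offset = page->base + (uint64_t) delta;
    return true;
}
```

With the numbers from the example: if mxact 101 (offset 1010) is written
before mxact 100 (offset 1000), the page gets base 1010 and a delta of
-10 for mxact 100. If the same two records are applied in multixid
order instead, the page gets base 1000 and a delta of +10 for mxact
101. Both pages decode to the same offsets, but their bytes differ,
which is the non-determinism in question.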

- Heikki





