From: Heikki Linnakangas <[email protected]>
To: Maxim Orlov <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: wenhui qiu <[email protected]>
Cc: Postgres hackers <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Subject: Re: POC: make mxidoff 64 bits
Date: Thu, 4 Dec 2025 12:39:43 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CACG=ezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com>
<CACG=ezZwdvsijzuXE3hex3xHcoz75EQYBXRTsQJVwbo5J5sS3g@mail.gmail.com>
<CACG=ezbs912S58=uR17b4w8uuWv1=OcCRaTW_OWdFm4+tXZA6w@mail.gmail.com>
<CAGjGUA+BfcWyccNN4=tHsW_E-koRxbg8h8ut6hjvPsHMgmek6w@mail.gmail.com>
<CACG=ezYbYO_KHWdeDedbDcY0tOS0JfaqBxG3=bG5+DdsDK4MpQ@mail.gmail.com>
<CACG=ezYpZRPwoRCz_h3Qerd3XJNdpTHCpwGbZphNdy26tA4_qQ@mail.gmail.com>
<[email protected]>
<[email protected]>
<CACG=ezYUJSvnuxntkURNWo_1vZ+AtmcQfqd_h6WgDzGaudfw+Q@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAExHW5tUEkiQrvm9hgccjKUNkWBnJ5_HDUrAwiHBTxu+Vuj29Q@mail.gmail.com>
<[email protected]>
<[email protected]>
<CACG=ezY0=ri8A0duXbpd1XNUc1jnngaPnmB0-+UZpxAv7-fNtw@mail.gmail.com>
<[email protected]>
On 26/11/2025 17:50, Heikki Linnakangas wrote:
> On 26/11/2025 17:23, Maxim Orlov wrote:
>> On Tue, 25 Nov 2025 at 13:07, Heikki Linnakangas <[email protected]> wrote:
>>> GetOldMultiXactIdSingleMember() currently asserts that the offset is
>>> never zero, but it should try to do something sensible in that case
>>> instead of just failing.
>>
>> Correct me if I'm wrong, but we added the assertion that offsets are
>> never 0, based on the idea that case #2 will never take place during an
>> update. If this isn't the case, this assertion could be removed.
>> The rest of the function appears to work correctly.
>>
>> I even think that, as an experiment, we could randomly reset some of the
>> offsets to zero and nothing would happen, except that some data would
>> be lost.
>
> +1
>
>> The most sensible thing we can do is give the user a warning, right?
>> Something like, "During the update, we encountered some weird offset
>> that shouldn't have been there, but there's nothing we can do about it,
>> just take note."
>
> Yep, makes sense.

I read through the SLRU reading codepath, looking for all the things
that could go wrong (not sure I got them all):

1. An SLRU file does not exist
2. An SLRU file is too short, i.e. a page does not exist
3. The offset in 'offsets' page is 0
4. The offset in 'offsets' page looks invalid, i.e. it's greater than
nextOffset or smaller than oldestOffset.
5. The offset is out of order compared to its neighbors
6. The multixid has no members
7. The multixid has an invalid (0) member
8. A multixid has more than one updating member
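
To make cases 3-8 concrete, here is a rough model in Python. It is only a
sketch, not the pg_upgrade code: the function name and arguments are made
up for illustration, it assumes offsets have not wrapped around, and for
case 5 it only compares against the previous multixid's offset. The status
values mirror the MultiXactStatus enum, and is_update() follows the same
rule as ISUPDATE_from_mxstatus().

```python
from enum import IntEnum


class Status(IntEnum):
    # Mirrors the MultiXactStatus values in multixact.h
    FOR_KEY_SHARE = 0
    FOR_SHARE = 1
    FOR_NO_KEY_UPDATE = 2
    FOR_UPDATE = 3
    NO_KEY_UPDATE = 4
    UPDATE = 5


def is_update(status):
    # Same rule as ISUPDATE_from_mxstatus(): anything above FOR_UPDATE
    return status > Status.FOR_UPDATE


def check_multixid(offset, prev_offset, oldest_offset, next_offset, members):
    """Return the case numbers (3..8) that apply to one multixid.

    members is a list of (xid, Status) pairs. prev_offset is the offset of
    the preceding multixid, or None if there is none.
    Assumption: offsets have not wrapped around.
    """
    problems = []
    if offset == 0:
        problems.append(3)
    else:
        if offset < oldest_offset or offset > next_offset:
            problems.append(4)
        # Only the previous neighbor is checked here, and a zero
        # neighbor is skipped because it carries no ordering information.
        if prev_offset is not None and prev_offset != 0 and offset < prev_offset:
            problems.append(5)
    if not members:
        problems.append(6)
    if any(xid == 0 for xid, _ in members):
        problems.append(7)
    if sum(1 for _, status in members if is_update(status)) > 1:
        problems.append(8)
    return problems
```

Cases 1 and 2 are about the files themselves rather than a single
multixid, so they are left out of this function.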

Some of those situations are theoretically possible if there was a
crash. We don't follow the WAL-before-data rule for these SLRUs.
Instead, we piggyback on the WAL-before-data of the heap page that would
reference the multixid. In other words, we rely on the fact that if a
multixid write is missed or torn because of a crash, that multixid will
not be referenced from anywhere and will never be read.

However, that doesn't hold for pg_upgrade. pg_upgrade will try to read
all the multixids. So we need to make the multixact reading code
tolerant of the situations that could be present after a crash. I think
the right philosophy here is that we try to read all the old multixids,
and do our best to interpret them the same way that the old server
would. For those situations that can legitimately be present if the old
server crashed at some point, be silent. For cases that should not
happen, even if there was a crash, print a warning. For example, I think
an SLRU file should never be missing (1) or truncated (2). But a zero
offset (3) and a multixid with no members (6) can happen.
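
Written down as a table, that policy might look roughly like the following
Python sketch. Only cases 1, 2, 3 and 6 are classified above, so the rest
are marked "undecided" here. The names are hypothetical, and reporting the
undecided cases as warnings is just a conservative placeholder, not a
decision.

```python
SILENT = "silent"
WARN = "warn"
UNDECIDED = "undecided"

# Only the cases classified in the text above are listed.
POLICY = {
    1: WARN,    # SLRU file missing: should never happen
    2: WARN,    # SLRU file truncated: should never happen
    3: SILENT,  # zero offset: possible after a crash
    6: SILENT,  # no members: possible after a crash
}


def action_for(case):
    """Look up what to do for a case number from the list (1..8)."""
    return POLICY.get(case, UNDECIDED)


def warnings_for(mxid, cases):
    """Build warning messages for every case that is not silent.

    Placeholder assumption: undecided cases are reported as well.
    """
    return [
        f"WARNING: multixid {mxid}: unexpected condition (case {case})"
        for case in cases
        if action_for(case) != SILENT
    ]
```

For example, warnings_for(42, [3, 6]) returns an empty list, because both
conditions can be explained by a crash.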

Perhaps we should check that all the files exist and have the correct
sizes in the pre-check stage, and abort the upgrade early if anything is
missing. That would be pretty cheap to check.
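
A minimal sketch of such a pre-check, in Python for brevity. It is an
illustration under assumptions, not the proposed pg_upgrade code: it
assumes the default 8 kB block size, 32 pages per segment, 4-hex-digit
segment file names, and a page range that has not wrapped around. The
function name and the page-range arguments are hypothetical; in practice
the range would be derived from the oldest and next multixid.

```python
import os

BLCKSZ = 8192                # assumption: default block size
SLRU_PAGES_PER_SEGMENT = 32  # assumption: value from slru.h


def precheck_slru(slru_dir, first_page, last_page):
    """Return a list of problems for pages first_page..last_page.

    An empty list means every needed segment file exists (case 1) and is
    long enough to contain the pages we will read from it (case 2).
    Assumes first_page <= last_page, i.e. no wraparound.
    """
    problems = []
    first_seg = first_page // SLRU_PAGES_PER_SEGMENT
    last_seg = last_page // SLRU_PAGES_PER_SEGMENT
    for segno in range(first_seg, last_seg + 1):
        path = os.path.join(slru_dir, "%04X" % segno)
        # Highest page we need from this segment
        seg_last_page = segno * SLRU_PAGES_PER_SEGMENT + SLRU_PAGES_PER_SEGMENT - 1
        last_needed = min(last_page, seg_last_page)
        need = (last_needed % SLRU_PAGES_PER_SEGMENT + 1) * BLCKSZ
        try:
            size = os.path.getsize(path)
        except FileNotFoundError:
            problems.append(f"{path}: file does not exist")
            continue
        if size < need:
            problems.append(f"{path}: {size} bytes, expected at least {need}")
    return problems
```

This only calls stat on each segment file once, so it should stay cheap
even with many segments.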
- Heikki