public inbox for [email protected]  
From: Melanie Plageman <[email protected]>
To: Kirill Reshke <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Chao Li <[email protected]>
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date: Thu, 18 Dec 2025 15:04:34 -0500
Message-ID: <CAAKRu_Y=hpfgwq=H+R9M7Q=t7FEgMGCuhojTO0WU4n=kf=ZAyg@mail.gmail.com> (raw)
In-Reply-To: <CALdSSPhHswwOtaS92kTnvgraGyOT3mGcUOUGAnAL0nUWusUkiA@mail.gmail.com>
References: <CAAKRu_ZP-3=SaZykpwDBMJOdUKyW3Wm5JZfPFRR3L5Ac8ouq4w@mail.gmail.com>
	<CAAKRu_bgkOQqu3K5n4YLRsNBZqJ9Rjg80ROqgKSr2UGz4b5hUg@mail.gmail.com>
	<2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl@mdyyqpqzxjek>
	<CAAKRu_YR-COJ9aGnMUQqt5yoWmUBjikqrd4jGNZYouHaXpis9g@mail.gmail.com>
	<CALdSSPhiCwJwWwgJP1NmqRmnp9RS2tGOBY0gQrfLCbB+OS5_KQ@mail.gmail.com>
	<CAAKRu_YS+Ocm=OzMaZnG4egFiE9v4VYfZ25DXd6jbwegqmGYbQ@mail.gmail.com>
	<CAAKRu_ZGZSqhGt-RcmmfiSheC+1fjQdxy6_+oM-1jMn8hyVptQ@mail.gmail.com>
	<CALdSSPg+B8RTzTXhJvCcKJBqgzhPZkq0E2oqxQdv74ZNZOMVzg@mail.gmail.com>
	<CAAKRu_Zha7HcdQBv8tTtQrcry5332J6kHnOc1X=TT03LzUXDow@mail.gmail.com>
	<CAAKRu_amF00f2T_H8N6pbbe75C22EeX1OqA=svpj8LNO1sdUuw@mail.gmail.com>
	<mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y@sy4ymcdtdklo>
	<CAAKRu_agCLAQ9OjmrTdJe-X=Xr7QnU4d=cfxdQGwc9jNx9w31w@mail.gmail.com>
	<CAAKRu_ayWLg=WDGZZfSPWf0KjPM8u=LBb0D6XaEWyx2_YFFwAQ@mail.gmail.com>
	<CALdSSPhtPK36_sr9yFo3cN9TXQmjvSX3BZXCF8fVBLX-GETD0Q@mail.gmail.com>
	<CAAKRu_Yt6EH5aFSJBm-k7PrNM4bTt56fTRbyU7gqYXe4cW+F9g@mail.gmail.com>
	<CALdSSPhH7hi+EzYqq0=eMCthi7iNpY_YyECAC1qxPb7rd0TLrw@mail.gmail.com>
	<CAAKRu_baqEkdJ0_1vokvgbaN52ycR65HM2-DR-VsRmYVVQLV8w@mail.gmail.com>
	<CALdSSPhHswwOtaS92kTnvgraGyOT3mGcUOUGAnAL0nUWusUkiA@mail.gmail.com>

On Thu, Dec 18, 2025 at 10:46 AM Kirill Reshke <[email protected]> wrote:
>
> On Thu, 18 Dec 2025 at 20:18, Melanie Plageman
> <[email protected]> wrote:
> > > Also, after the whole set is committed, we should then never
> > > experience discrepancy between  PD_ALL_VISIBLE and VM bits? Because
> > > they will be set in a single WAL record. The only cases when heap and
> > > VM disagrees on all-visibility then are corruption,
> > > pg_visibilitymap_truncate and old data (data before v19+ upgrade?)
> > > If my understanding is correct, should we add document this?
> >
> > Even on current master, I don't see a scenario other than VM
> > corruption or truncation where PD_ALL_VISIBLE can be set but not the
> > VM (or vice versa). The only way would be if you error out after
> > setting PD_ALL_VISIBLE before setting the VM. Setting PD_ALL_VISIBLE
> > is not in a critical section in lazy_scan_prune(), so it won't panic
> > and dump shared memory, so the buffer with PD_ALL_VISIBLE set may
> > later get written out. But the only obvious way I see to error out of
> > MarkBufferDirty() is if the buffer is not valid -- which would have
> > kept us from doing previous operations on the buffer, I would think.
>
> Well... I may be missing something, but on current HEAD,
> XLOG_HEAP2_PRUNE_VACUUM_SCAN and XLOG_HEAP2_VISIBLE are two different
> record, XLOG_HEAP2_PRUNE_VACUUM_SCAN being always emitted first. So,
> WAL writer may end up kill-9-ed just after
> XLOG_HEAP2_PRUNE_VACUUM_SCAN makes it to the disk, and
> XLOG_HEAP2_VISIBLE never. Crash recovery then, and we have
> discrepancy. This does not happen with a single WAL record.
> Another simple reproducer here: standby streaming, receiving
> XLOG_HEAP2_PRUNE_VACUUM_SCAN from primary, Then network becomes bad,
> and we never get XLOG_HEAP2_VISIBLE from primary. Then we promoted by
> the admin. And again, VM bit vs PD_ALL_VISIBLE discrepancy. Am I
> missing something?

Well, currently XLOG_HEAP2_PRUNE_VACUUM_SCAN doesn't set
PD_ALL_VISIBLE. PD_ALL_VISIBLE is WAL-logged in the XLOG_HEAP2_VISIBLE
record: in lazy_scan_prune() we call PageSetAllVisible() and then
visibilitymap_set() -> log_heap_visible(), which registers the heap
buffer with that WAL record (with XLogRegisterBuffer()).

And notice that when XLOG_HEAP2_VISIBLE is replayed in
heap_xlog_visible(), that is where we do PageSetAllVisible() on the
heap page.
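
To make the replay argument concrete, here is a toy model in C. It is
only a sketch, not PostgreSQL code: the Toy* types and toy_* functions
are made up for illustration, and the only assumption carried over from
above is that the prune record leaves both flags alone while the
visible record sets both. Under that assumption, cutting the WAL off
after any record leaves the heap flag and the VM bit in agreement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record types; they only loosely mirror the real ones. */
typedef enum { TOY_PRUNE, TOY_VISIBLE } ToyRecordType;

typedef struct {
    bool pd_all_visible; /* stands in for the heap page header flag */
    bool vm_bit;         /* stands in for the visibility map bit */
} ToyState;

/* Replay one record. Only TOY_VISIBLE touches the flags, and it sets both. */
static void toy_replay(ToyState *s, ToyRecordType rec)
{
    assert(s != NULL);
    if (rec == TOY_VISIBLE) {
        s->pd_all_visible = true;
        s->vm_bit = true;
    }
    /* TOY_PRUNE is assumed to change tuples only, not either flag. */
}

/* Replay the first n records, as if the WAL ended right there. */
static ToyState toy_replay_prefix(const ToyRecordType *wal, size_t n)
{
    ToyState s = { false, false };
    assert(wal != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        toy_replay(&s, wal[i]);
    return s;
}

/* True if the two flags agree no matter where the WAL is cut off. */
static bool toy_all_prefixes_consistent(const ToyRecordType *wal, size_t len)
{
    for (size_t n = 0; n <= len; n++) {
        ToyState s = toy_replay_prefix(wal, n);
        if (s.pd_all_visible != s.vm_bit)
            return false;
    }
    return true;
}
```

With a WAL of { TOY_PRUNE, TOY_VISIBLE }, stopping after the first
record leaves both flags unset, and replaying both records sets both.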

So I think you can end up with PD_ALL_VISIBLE set but the VM bit not
set if you error out precisely between setting it and WAL-logging it,
because we don't set it in a critical section. But you can't end up
with one WAL record that sets PD_ALL_VISIBLE and a separate one that
sets the VM.

Once we have my code changes, you can never end up with PD_ALL_VISIBLE
set and the VM not set, because they are set in the same critical
section; if we error out there, it will cause a panic, which will
purge shared memory.
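
The difference a critical section makes can be sketched with another
toy model in C. Again, this is not PostgreSQL code: ToyBuffers,
ToyOutcome and toy_set_all_visible() are hypothetical names, and the
sketch assumes the on-disk state equals the buffer contents at entry
and that nothing was WAL-logged before the failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool pd_all_visible; /* heap page flag in the shared buffer */
    bool vm_bit;         /* visibility map bit in the shared buffer */
} ToyBuffers;

typedef enum { TOY_OK, TOY_ERROR, TOY_PANIC } ToyOutcome;

/*
 * Set the heap flag, then the VM bit. If fail_between is true, an error
 * is raised after the first step. Outside a critical section that is a
 * plain ERROR and the modified buffer stays in shared memory. Inside one,
 * the error becomes a PANIC and shared memory is thrown away, modeled
 * here by restoring the state seen at entry.
 */
static ToyOutcome toy_set_all_visible(ToyBuffers *shared,
                                      bool in_critical_section,
                                      bool fail_between)
{
    assert(shared != NULL);
    ToyBuffers at_entry = *shared; /* assumed to match what is on disk */

    shared->pd_all_visible = true;

    if (fail_between) {
        if (in_critical_section) {
            *shared = at_entry; /* PANIC: in-memory changes are lost */
            return TOY_PANIC;
        }
        return TOY_ERROR; /* heap flag set, VM bit not set */
    }

    shared->vm_bit = true;
    return TOY_OK;
}
```

Only the non-critical-section failure path leaves the heap flag set
without the VM bit; the other paths end with the two in agreement.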

- Melanie




