public inbox for [email protected]
help / color / mirror / Atom feedFrom: Kirill Reshke <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Chao Li <[email protected]>
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date: Thu, 18 Dec 2025 13:55:46 +0500
Message-ID: <CALdSSPhH7hi+EzYqq0=eMCthi7iNpY_YyECAC1qxPb7rd0TLrw@mail.gmail.com> (raw)
In-Reply-To: <CAAKRu_Yt6EH5aFSJBm-k7PrNM4bTt56fTRbyU7gqYXe4cW+F9g@mail.gmail.com>
References: <CAAKRu_ZP-3=SaZykpwDBMJOdUKyW3Wm5JZfPFRR3L5Ac8ouq4w@mail.gmail.com>
<CAAKRu_bgkOQqu3K5n4YLRsNBZqJ9Rjg80ROqgKSr2UGz4b5hUg@mail.gmail.com>
<2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl@mdyyqpqzxjek>
<CAAKRu_YR-COJ9aGnMUQqt5yoWmUBjikqrd4jGNZYouHaXpis9g@mail.gmail.com>
<CALdSSPhiCwJwWwgJP1NmqRmnp9RS2tGOBY0gQrfLCbB+OS5_KQ@mail.gmail.com>
<CAAKRu_YS+Ocm=OzMaZnG4egFiE9v4VYfZ25DXd6jbwegqmGYbQ@mail.gmail.com>
<CAAKRu_ZGZSqhGt-RcmmfiSheC+1fjQdxy6_+oM-1jMn8hyVptQ@mail.gmail.com>
<CALdSSPg+B8RTzTXhJvCcKJBqgzhPZkq0E2oqxQdv74ZNZOMVzg@mail.gmail.com>
<CAAKRu_Zha7HcdQBv8tTtQrcry5332J6kHnOc1X=TT03LzUXDow@mail.gmail.com>
<CAAKRu_amF00f2T_H8N6pbbe75C22EeX1OqA=svpj8LNO1sdUuw@mail.gmail.com>
<mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y@sy4ymcdtdklo>
<CAAKRu_agCLAQ9OjmrTdJe-X=Xr7QnU4d=cfxdQGwc9jNx9w31w@mail.gmail.com>
<CAAKRu_ayWLg=WDGZZfSPWf0KjPM8u=LBb0D6XaEWyx2_YFFwAQ@mail.gmail.com>
<CALdSSPhtPK36_sr9yFo3cN9TXQmjvSX3BZXCF8fVBLX-GETD0Q@mail.gmail.com>
<CAAKRu_Yt6EH5aFSJBm-k7PrNM4bTt56fTRbyU7gqYXe4cW+F9g@mail.gmail.com>
On Thu, 18 Dec 2025 at 05:30, Melanie Plageman
<[email protected]> wrote:
>
> > in v27-0005. This patch changes code which is not exercised in
> > tests[0]. I spent some time understanding the conditions when we
> > entered this. There is a comment about non-finished relation
> > extension, but I got no success trying to reproduce this. I ended up
> > modifying code to lose PageSetAllVisible in proper places and running
> > vacuum. Looks like everything works as expected. I will spend some
> > more time on this, maybe I will be successful in writing an
> > injection-point-based TAP test which hits this...
>
> Based on the coverage report link you provided, that code is changed
> by v27 0007, not 0005. 0005 is about moving an assertion out of
> lazy_scan_prune(). 0007 changes lazy_scan_new_or_empty() (the code in
> question).
>
> Regarding 0007, it looks like what is uncovered (the orange bits in
> the coverage report are uncovered, I assume) is empty pages _without_
> PD_ALL_VISIBLE set. I don't see anywhere where PageSetAllVisible() is
> called except vacuum and COPY FREEZE.
Sure, I meant 0007.
> If I was trying to guess how empty pages with PD_ALL_VISIBLE set are
> getting vacuumed, I would think it is due to SKIP_PAGES_THRESHOLD
> causing us to vacuum an all-frozen empty page.
Yes, vacuum (disable_page_skipping);
> Then the question is, why wouldn't we have coverage of the empty page
> first being set all-visible/all-frozen? It can't be COPY FREEZE
> because the page is empty. And it can't be vacuum, because then we
> would have coverage. It's very mysterious.
>
> It would be good to have coverage for this case. I don't think you'll
> need an injection point for the main case of "empty page not yet set
> all-visible is vacuumed for the first time" (unless I'm
> misunderstanding something).
>
> I'm not sure how you'll test the "vacuuming an empty, previously
> uninitialized page" case described in this comment, though.
>
> * It's possible that another backend has extended the heap,
> * initialized the page, and then failed to WAL-log the page due
> * to an ERROR. Since heap extension is not WAL-logged, recovery
> * might try to replay our record setting the page all-visible and
> * find that the page isn't initialized, which will cause a PANIC.
> * To prevent that, check whether the page has been previously
> * WAL-logged, and if not, do that now.
>
> You'd want to force an error during relation extension and then vacuum
> the page. I don't know if you need an injection point to force the
> error -- depends on what kind of error, I think.
I did small archeology and this "if (PageIsEmpty(page)) { if
(!PageIsAllVisible(page)) { .... }}" code originates back to
608195a3a365. Comment about not WAL-logged relation extension is from
a6370fd9ed3d, and I don't think we need to think about this case.
I am currently inclined to think that we cannot see an empty page that
has PD_ALL_VISIBLE not-set. This is because when we make a page empty,
we are in a critical section, and we WAL-log everything we do, so our
changes should not be half-made. Maybe as of 608195a3a365, there was a
case with empry-page-without-PD_ALL_VISIBLE, but I dont think this
happens on HEAD.
> So that I know for attribution, did you review 0003-0005?
yes, but I did not have any valuable review points for them.
Also, after the whole set is committed, we should then never
experience discrepancy between PD_ALL_VISIBLE and VM bits? Because
they will be set in a single WAL record. The only cases when heap and
VM disagrees on all-visibility then are corruption,
pg_visibilitymap_truncate and old data (data before v19+ upgrade?)
If my understanding is correct, should we add document this?
--
Best regards,
Kirill Reshke
view thread (143+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
In-Reply-To: <CALdSSPhH7hi+EzYqq0=eMCthi7iNpY_YyECAC1qxPb7rd0TLrw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox