From: Melanie Plageman <[email protected]>
To: Andres Freund <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Xuneng Zhou <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date: Mon, 2 Mar 2026 18:38:07 -0500
Message-ID: <CAAKRu_Y1MuANdm1p47Ev13Y9EQz8z+pw-vHOh=3DVdahUTjgXg@mail.gmail.com>
In-Reply-To: <bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y@4ez66il7ebvk>
References: <[email protected]>
	<CAAKRu_ZCjHoRPfQ8AbMrFY8TOMCPAvZ0_m9SX7yg0edfTk45-g@mail.gmail.com>
	<[email protected]>
	<CAAKRu_a04jbDACwzRYwzDND31aPyf7Yvz9TAZrTr=+F5bK1aVA@mail.gmail.com>
	<CALdSSPjcv25jmXm29X-MRWZBae6+HwcWfVH1PE8NfD=EMTnkAg@mail.gmail.com>
	<CAAKRu_bwtBEzDwemyim1r6yYonw7FTyFr1HXG8vywCe-MdbPBQ@mail.gmail.com>
	<[email protected]>
	<CAAKRu_YQd=2KvomM+RHcpeDKj0bq+peJ=3W-fip+pkvzA-Jq9w@mail.gmail.com>
	<7ib3sa55sapwjlaz4sijbiq7iezna27kjvvvar4dpgkmadml6t@gfpkkwmdnepx>
	<CAAKRu_bs+gZ83QDacmBxunPvCGnXJ05hxP2BDPJ3BGwdbGRXzg@mail.gmail.com>
	<bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y@4ez66il7ebvk>

On Fri, Feb 20, 2026 at 12:59 PM Andres Freund <[email protected]> wrote:
>
> On 2026-01-28 18:16:10 -0500, Melanie Plageman wrote:
>
> > I could see an argument for moving identify_and_fix_vm_corruption()
> > out of the helper and into heap_page_prune_and_freeze() but then we'd
> > have to move visibilitymap_get_status() out too. And that takes away a
> > lot of the benefit of encapsulating all that logic.
>
> I was wondering about that option. Relatedly, I also was wondering if we ought
> to do identify_and_fix_vm_corruption() regardless of ->attempt_update_vm.

Attached v35 does this. I always pin the vmbuffer if we are going to
prune in heap_page_prune_opt(). In many cases, because it's saved in
the scan descriptor, it won't actually need to take a new pin. During
pruning, I check for VM corruption even if I am not considering
setting the VM.

> > Well, after this patch set, clearing the VM does happen before we emit
> > WAL for pruning.
>
> That I think is a substantial improvement, the current (i.e. before your
> series) placement really is pretty insane due to the guaranteed divergence it
> causes.
>
> I wonder if we actually should just force an FPI whenever we detect such
> corruption, that way it would be reliably fixed on the standby as well.

Only problem is we would have to do an FPI of the VM page as well if
we wanted the corruption to be reliably fixed on the standby.

> > It wouldn't be hard to move the corruption fixups to the beginning of
> > heap_page_prune_and_freeze() in the new code structure.
>
> As identify_and_fix_vm_corruption() needs lpdead_items, I'm not sure that's
> true?
>
> I wonder if at least the warning for the "(PageIsAllVisible(heap_page) &&
> nlpdead_items > 0)" test should be moved to
> heap_prune_record_dead_or_unused(). That way the WARNING could include the
> offset number and it'd also work in the mark_unused_now case.
>
> Perhaps it also should trigger for RECENTLY_DEAD, INSERT_IN_PROGRESS,
> DELETE_IN_PROGRESS?
>
> At that point the !page_all_visible && vm_all_visible part could indeed be
> moved to the start of heap_page_prune_and_freeze()

I've done all this. There is now a heap page/VM corruption check at the
beginning of heap_page_prune_and_freeze(). During pruning, I also check
for corruption in the previously covered case (lpdead items), in the
mark_unused_now case, and for
RECENTLY_DEAD/INSERT_IN_PROGRESS/DELETE_IN_PROGRESS tuples.

> > Would it be worth it? What benefit would we get? Do you just feel that it
> > should logically come first?
>
> One insanity is that right now we will process all frozen pages over and over
> due to the skip pages threshold, wasting a *lot* of CPU and memory bandwidth.
> It'd be quite defensible to just skip processing the page once we determined
> it's already all frozen.  But for that we'd probably want to do the
> "page_all_visible && vm_all_visible" check before returning...

I've added a fast path to bypass pruning/freezing when the page is
already all-visible. And I check for pg_all_visible && vm_all_visible
beforehand. The one downside this has is if there is a page marked
all-frozen but has dead tuples on it, we'll never get to fix that
corruption nor clean up the dead tuples. But the fast path kind of
seems worth it to me.

> > > Do we actually foresee a case where only one of HEAP_PAGE_PRUNE_FREEZE |
> > > HEAP_PAGE_PRUNE_UPDATE_VM would be set?
> >
> > Yes, when setting the VM on-access, it is too expensive to call
> > heap_prepare_freeze_tuple() on each tuple. I could work on trying to
> > optimize it, but it isn't currently viable.
>
> Is it too expensive to do so even when we already decided to do some pruning?
> I am not surprised it's too expensive when there's not even a dead tuple on
> the page.  But I am mildly surprised if it's too expensive to do when we'd WAL
> log anyway?

It's not really possible in the current code structure to only call
heap_prepare_freeze_tuple() when there are at least some prunable
tuples. We go through the line pointers and record them as prunable at
the same time we call heap_prepare_freeze_tuple(), so we won't know
until we've examined all line pointers that there are no prunable
tuples, at which point we will have called heap_prepare_freeze_tuple()
for every tuple.

> > I think using all_frozen_except_dead while maintaining
> > visibility_cutoff_xid (in heap_prune_record_unchanged_lp_normal()) has
> > the potential to be confusing, though. We'd need to keep updating
> > visibility_cutoff_xid when all_visible is false but
> > all_frozen_except_dead is true as well as when all_visible is true.
> > And because we don't care about all_visible_except_dead, it gets even
> > more confusing to make sure we are maintaining the right variables in
> > the right situations.
>
> I suspect we should just track all of the horizons/cutoffs all the time. This
> whole stuff about optimizing out a few conditional assignments complicates the
> code substantially and feels extremely error prone to me.

I've done this in v35. I posted the freeze horizon tracking patch
separately in [1] but it is in v35 as 0004. Tracking the newest live
xid is in 0009. This also always tracks all_visible for all callers
since I unconditionally pass the vmbuffer now. I still don't set the
VM if the query is modifying the relation, though.

> I probably complained about this before, and it's not this patch's fault, but
> PruneState->{all_visible,all_frozen} are imo confusingly named, due to
> sounding like they describe the current state, rather than the possible state
> after pruning.  It's not helped by this comment:
>
>          * NOTE: all_visible and all_frozen initially don't include LP_DEAD items.
>          * That's convenient for heap_page_prune_and_freeze() to use them to
>          * decide whether to opportunistically freeze the page or not.  The
>          * all_visible and all_frozen values ultimately used to set the VM are
>          * adjusted to include LP_DEAD items after we determine whether or not to
>          * opportunistically freeze.
>
> "all-visible ... are adjusted to include LP_DEAD" ... - just reading that it's
> hard to know what it means.

0003 does the rename.

> The first thing to improve pruning performance that I would do is to introduce
> a fastpath for pages that a) are already frozen b) do not have dead items (if
> we're not freezing). Iterating through HOT chains is far from cheap, and if
> all rows are live, there's not really a point in doing so.  This is
> particularly important for VACUUMs where we end up freezing a ton of pages that
> are already frozen, due to the silly skip_pages_threshold thing.

0007 adds a fast path.

> > +static TransactionId
> > +get_conflict_xid(bool do_prune, bool do_freeze, bool do_set_vm,
> > +                              uint8 old_vmbits, uint8 new_vmbits,
> > +                              TransactionId latest_xid_removed, TransactionId frz_conflict_horizon,
> > +                              TransactionId visibility_cutoff_xid)
> > +{
> > +     TransactionId conflict_xid;
> > +
> > +     /*
> > +      * We can omit the snapshot conflict horizon if we are not pruning or
> > +      * freezing any tuples and are setting an already all-visible page
> > +      * all-frozen in the VM.
>
> Maybe mention when this can happen, because it's not immediately obvious.

I've added this to my TODO. I honestly can't think of a scenario where
it can happen. But I remember spending quite a bit of time thinking
about it on another occasion. The current code (in master) does
specifically account for this scenario, which is why I kept the logic,
but I'm not sure how it can happen.

I made all the other changes to specific comments you mentioned in
your mail but I won't bore you with itemization.

> >       if (do_set_vm)
> >               conflict_xid = visibility_cutoff_xid;
> >       else if (do_freeze)
> >               conflict_xid = frz_conflict_horizon;
> >       else
> >               conflict_xid = InvalidTransactionId;
>
> Could it be worth checking that if (do_set_vm && do_freeze) the
> frz_conflict_horizon won't be "violated" by using visibility_cutoff_xid instead?

Yes, as you mentioned off-list, this wasn't right. The new code looks like this:

TransactionId conflict_xid = InvalidTransactionId;
...
    if (do_set_vm)
        conflict_xid = newest_live_xid;
    if (do_freeze && TransactionIdFollows(newest_frozen_xid, conflict_xid))
        conflict_xid = newest_frozen_xid;
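
A minimal standalone sketch of the selection logic above, for trying it
outside the tree. It assumes 32-bit XIDs compared modulo 2^32.
xid_is_normal() and xid_follows() are simplified local stand-ins modeled on
TransactionIdIsNormal() and TransactionIdFollows(), and
choose_conflict_xid() is a hypothetical name. Whatever the "..." elides
(for example, XIDs removed by pruning) is not modeled. The assertion
encodes the cross-check Andres asked about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)

/* Stand-in for TransactionIdIsNormal(): excludes the special XIDs 0..2. */
static bool
xid_is_normal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Stand-in for TransactionIdFollows(): is a logically newer than b? */
static bool
xid_follows(TransactionId a, TransactionId b)
{
    /* Special XIDs sort before all normal XIDs. */
    if (!xid_is_normal(a) || !xid_is_normal(b))
        return a > b;
    /* Normal XIDs are compared circularly, modulo 2^32. */
    return (int32_t) (a - b) > 0;
}

/* Hypothetical helper: pick the newer of the two candidate horizons. */
static TransactionId
choose_conflict_xid(bool do_set_vm, bool do_freeze,
                    TransactionId newest_live_xid,
                    TransactionId newest_frozen_xid)
{
    TransactionId conflict_xid = InvalidTransactionId;

    if (do_set_vm)
        conflict_xid = newest_live_xid;
    if (do_freeze && xid_follows(newest_frozen_xid, conflict_xid))
        conflict_xid = newest_frozen_xid;

    /* When freezing, the result must not be older than the freeze horizon. */
    assert(!do_freeze || !xid_follows(newest_frozen_xid, conflict_xid));

    return conflict_xid;
}
```

With both flags set, the result is whichever XID is logically newer, so the
freeze horizon can no longer be undercut by an older newest_live_xid.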

> > From 8d350868206456f631883a40a955dff480e408d3 Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <[email protected]>
> > Date: Wed, 17 Dec 2025 16:51:05 -0500
> > Subject: [PATCH v34 09/14] Use GlobalVisState in vacuum to determine page
> >  level visibility
> >
> > [...]
> >
> > Because comparing a transaction ID against GlobalVisState is more
> > expensive than comparing against a single XID, we defer this check until
> > after scanning all tuples on the page.
>
> Curious, is this a precaution or was this a measurable bottleneck?

I did see GlobalVisTestXidMaybeRunning() in a profile I did when it
was still called for every HEAPTUPLE_LIVE tuple in
heap_prune_record_unchanged_lp_normal(), but I don't have the profile
or test case around anymore.

However, since I now unconditionally maintain the newest_live_xid,
moving GlobalVisTestXidMaybeRunning() back into
heap_prune_record_unchanged_lp_normal() wouldn't help us avoid any
work. It would just make the values of prstate.set_all_visible and
prstate.set_all_frozen more accurate sooner. But I don't think it's
worth the extra function call since set_all_frozen and set_all_visible
won't be totally "done" until after we decide whether or not to
opportunistically freeze anyway.
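
To illustrate the shape of the deferred check, here is a minimal sketch
under stated assumptions. VisTest and vis_maybe_running() are hypothetical
stand-ins for GlobalVisState and GlobalVisTestXidMaybeRunning(), reduced to
a single horizon with a call counter. The XID helpers are simplified local
copies, not the real macros. The point is only that the cheap comparison
runs per tuple while the expensive test runs at most once per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define FirstNormalTransactionId ((TransactionId) 3)

static bool
xid_is_normal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Circular comparison: is a logically newer than b? */
static bool
xid_follows(TransactionId a, TransactionId b)
{
    if (!xid_is_normal(a) || !xid_is_normal(b))
        return a > b;
    return (int32_t) (a - b) > 0;
}

/* Hypothetical stand-in for GlobalVisState: one horizon plus a counter. */
typedef struct VisTest
{
    TransactionId oldest_running;   /* XIDs at or after this may be running */
    int           ncalls;           /* how often the expensive test ran */
} VisTest;

/* Hypothetical stand-in for the expensive per-XID visibility test. */
static bool
vis_maybe_running(VisTest *vt, TransactionId xid)
{
    vt->ncalls++;
    return !xid_follows(vt->oldest_running, xid);
}

/* Track the newest xmin per tuple; test it once after the scan. */
static bool
page_all_visible_deferred(const TransactionId *xmins, int n, VisTest *vt)
{
    TransactionId newest_live_xid = InvalidTransactionId;

    assert(n >= 0);

    for (int i = 0; i < n; i++)
    {
        if (xid_is_normal(xmins[i]) &&
            xid_follows(xmins[i], newest_live_xid))
            newest_live_xid = xmins[i];
    }

    /* One expensive check per page instead of one per live tuple. */
    if (xid_is_normal(newest_live_xid) &&
        vis_maybe_running(vt, newest_live_xid))
        return false;

    return true;
}
```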

> > @@ -1077,6 +1078,24 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
> >       prune_freeze_plan(RelationGetRelid(params->relation),
> >                                         buffer, &prstate, off_loc);
> >
> > +     /*
> > +      * After processing all the live tuples on the page, if the newest xmin
> > +      * amongst them may be considered running by any snapshot, the page cannot
> > +      * be all-visible.
> > +      */
> > +     if (prstate.all_visible &&
> > +             TransactionIdIsNormal(prstate.visibility_cutoff_xid) &&
>
> Any reason to test IsNormal rather than just IsValid()?  There should never be
> a reason it's a valid but not "normal" xid, right?

Well the reason I did this was that the existing code in master
tracking visibility_cutoff_xid only advances it if
TransactionIdIsNormal(). I'm a bit confused about it too because it
seems like we would still want to do it for bootstrap mode xids. But I
see PageSetPrunable() only allows normal xids.
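
For reference, a small sketch of the distinction in question. The special
XID values (0 = invalid, 1 = bootstrap, 2 = frozen, 3 = first normal) match
access/transam.h. The helper functions are simplified local stand-ins for
TransactionIdIsValid(), TransactionIdIsNormal() and TransactionIdFollows(),
and advance_newest_xmin() is a hypothetical name for the "only advance on
normal XIDs" pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId     ((TransactionId) 0)
#define BootstrapTransactionId   ((TransactionId) 1)
#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

/* Valid: anything except InvalidTransactionId. */
static bool
xid_is_valid(TransactionId xid)
{
    return xid != InvalidTransactionId;
}

/* Normal: additionally excludes the bootstrap and frozen XIDs. */
static bool
xid_is_normal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Circular comparison: is a logically newer than b? */
static bool
xid_follows(TransactionId a, TransactionId b)
{
    if (!xid_is_normal(a) || !xid_is_normal(b))
        return a > b;
    return (int32_t) (a - b) > 0;
}

/*
 * Hypothetical helper: advance a tracked "newest xmin" only for normal
 * XIDs. Under that rule the tracked value is always either invalid or
 * normal, so IsValid and IsNormal agree on it.
 */
static TransactionId
advance_newest_xmin(TransactionId current, TransactionId xmin)
{
    assert(!xid_is_valid(current) || xid_is_normal(current));

    if (xid_is_normal(xmin) && xid_follows(xmin, current))
        return xmin;
    return current;
}
```

As long as the tracked value is only ever advanced this way, the two tests
are interchangeable on it; they differ only for bootstrap and frozen XIDs.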

> > @@ -1794,28 +1812,15 @@ heap_prune_record_unchanged_lp_normal(Page page, PruneState *prstate, OffsetNumb
> >                               }
> >
> >                               /*
> > -                              * The inserter definitely committed.  But is it old enough
> > -                              * that everyone sees it as committed?  A FrozenTransactionId
> > -                              * is seen as committed to everyone.  Otherwise, we check if
> > -                              * there is a snapshot that considers this xid to still be
> > -                              * running, and if so, we don't consider the page all-visible.
> > +                              * The inserter definitely committed. But we don't know if it
> > +                              * is old enough that everyone sees it as committed. Later,
> > +                              * after processing all the tuples on the page, we'll check if
> > +                              * there is any snapshot that still considers the newest xid
> > +                              * on the page to be running. If so, we don't consider the
> > +                              * page all-visible.
> >                                */
> >                               xmin = HeapTupleHeaderGetXmin(htup);
> >
> > -                             /*
> > -                              * For now always use prstate->cutoffs for this test, because
> > -                              * we only update 'all_visible' and 'all_frozen' when freezing
> > -                              * is requested. We could use GlobalVisTestIsRemovableXid
> > -                              * instead, if a non-freezing caller wanted to set the VM bit.
> > -                              */
> > -                             Assert(prstate->cutoffs);
> > -                             if (!TransactionIdPrecedes(xmin, prstate->cutoffs->OldestXmin))
> > -                             {
> > -                                     prstate->all_visible = false;
> > -                                     prstate->all_frozen = false;
> > -                                     break;
> > -                             }
> > -
> >                               /* Track newest xmin on page. */
> >                               if (TransactionIdFollows(xmin, prstate->visibility_cutoff_xid) &&
> >                                       TransactionIdIsNormal(xmin))
>
> Kinda wonder if this code should be in something like
> heap_prune_record_freezable() or such, rather than be inside
> heap_prune_record_unchanged_lp_normal().

I played around with it, but it all felt a bit awkward. I wrote it
down for a future enhancement idea.

> > Subject: [PATCH v34 10/14] Unset all_visible sooner if not freezing
> >
> > In the prune/freeze path, we currently delay clearing all_visible and
> > all_frozen in the presence of dead items to allow opportunistic
> > freezing.
> >
> > However, if no freezing will be attempted, there’s no need to delay.
> > Clearing the flags earlier avoids extra bookkeeping in
> > heap_prune_record_unchanged_lp_normal(). This currently has no runtime
> > effect because all callers that consider setting the VM also prepare
> > freeze plans, but upcoming changes will allow on-access pruning to set
> > the VM without freezing. The extra bookkeeping was noticeable in a
> > profile of on-access VM setting.
>
> What workload was that?

It was a select * offset all query with a few fat tuples on each page
and none of them prunable. I'm planning on digging up the
case/creating a new one to see if it is reproducible. This was with an
older version of the code that had more conditionals as well. This
commit is actually dropped in v35 because I now always keep
newest_live_xid up-to-date (0009) which means unsetting
set_all_visible sooner has no benefit.

> Theoretically, even if we don't freeze, the page still may be all-visible or
> all frozen after the removal of dead items, no? Practically that won't happen,
> because we don't remove dead items in any of the relevant paths, but from the
> commit message and comments that's not entirely clear.

Yea, it's clearer with the commit dropped.

> > @@ -678,6 +678,12 @@ typedef struct EState
> >                                                                        * ExecDoInitialPruning() */
> >       const char *es_sourceText;      /* Source text from QueryDesc */
> >
> > +     /*
> > +      * RT indexes of relations modified by the query through a
> > +      * UPDATE/DELETE/INSERT/MERGE or targeted by a SELECT FOR UPDATE.
> > +      */
> > +     Bitmapset  *es_modified_relids;
> > +
>
> Other EState fields are initialized in CreateExecutorState, this isn't afaict?

Oops, yes. I based it on es_unpruned_relids which wasn't initialized
there either. I've added a commit (0013) to initialize a few EState
fields that weren't initialized in CreateExecutorState() as well.

> Wonder if it's worth adding a crosscheck somewhere, verifying that if a
> relation is modified, it's in es_modified_relids. Otherwise this could very
> well silently get out of date.

Done in v35 (0014).

> Also, there's some overlap between the information collected this way, and
> AcquireExecutorLocks(), ScanQueryForLocks(), which determine the needed lock
> modes via rte->rellockmode.

Those are in parser/planner, so it doesn't seem like a good fit. I
populate es_modified_relids in the executor.

I don't know exactly what the overlap would be between RTEs with an
exclusive rellockmode and es_modified_relids. It seems like you could
have RTEs which don't end up getting modified that have a lock level
that would have made you think that they would be modified.

But were you imagining a substitution or a cross-check?

> > From 8205b2d7da0c3ad3cbc5cead336ced677996b37d Mon Sep 17 00:00:00 2001
> > From: Melanie Plageman <[email protected]>
> > Date: Wed, 3 Dec 2025 15:12:18 -0500
> > Subject: [PATCH v34 12/14] Pass down information on table modification to scan
> >  node
>
> Perhaps worth splitting up, so the addition of the 0 flag is separate from the
> read only hint aspect.

Done.

[1] https://www.postgresql.org/message-id/CAAKRu_bbaUV8OUjAfVa_iALgKnTSfB4gO3jnkfpcFgrxEpSGJQ%40mail.gma...


Attachments:

  [text/x-patch] v35-0001-Move-commonly-used-context-into-PruneState-and-s.patch (16.4K, 2-v35-0001-Move-commonly-used-context-into-PruneState-and-s.patch)
From 7526e2a0e7d1a013cb9f4d95dff8a4feabd7035b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Thu, 26 Feb 2026 10:09:55 -0500
Subject: [PATCH v35 01/18] Move commonly used context into PruneState and
 simplify helpers

heap_page_prune_and_freeze() and many of its helpers use the heap
buffer, block number, and page. Other helpers took the heap page and
didn't use it. Initializing these values once during
prune_freeze_setup() simplifies the helpers' interfaces and avoids any
repeated calls to BufferGetBlockNumber() and BufferGetPage().

While updating PruneState, also reorganize its fields to make layout and
documentation more consistent
---
 src/backend/access/heap/pruneheap.c | 136 +++++++++++++++-------------
 1 file changed, 72 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 632c2427952..3c5d33834fc 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -45,6 +45,16 @@ typedef struct
 	/* whether to attempt freezing tuples */
 	bool		attempt_freeze;
 	struct VacuumCutoffs *cutoffs;
+	Relation	relation;
+
+	/*
+	 * Keep the buffer, block, and page handy so that helpers needing to
+	 * access them don't need to make repeated calls to BufferGetBlockNumber()
+	 * and BufferGetPage().
+	 */
+	BlockNumber block;
+	Buffer		buffer;
+	Page		page;
 
 	/*-------------------------------------------------------
 	 * Fields describing what to do to the page
@@ -98,11 +108,19 @@ typedef struct
 	 */
 	int8		htsv[MaxHeapTuplesPerPage + 1];
 
-	/*
-	 * Freezing-related state.
+	/*-------------------------------------------------------
+	 * Working state for freezing
+	 *-------------------------------------------------------
 	 */
 	HeapPageFreeze pagefrz;
 
+	/*
+	 * The snapshot conflict horizon used when freezing tuples. The final
+	 * snapshot conflict horizon for the record may be newer if pruning
+	 * removes newer transaction IDs.
+	 */
+	TransactionId frz_conflict_horizon;
+
 	/*-------------------------------------------------------
 	 * Information about what was done
 	 *
@@ -129,13 +147,6 @@ typedef struct
 	int			lpdead_items;	/* number of items in the array */
 	OffsetNumber *deadoffsets;	/* points directly to presult->deadoffsets */
 
-	/*
-	 * The snapshot conflict horizon used when freezing tuples. The final
-	 * snapshot conflict horizon for the record may be newer if pruning
-	 * removes newer transaction IDs.
-	 */
-	TransactionId frz_conflict_horizon;
-
 	/*
 	 * all_visible and all_frozen indicate if the all-visible and all-frozen
 	 * bits in the visibility map can be set for this page after pruning.
@@ -162,14 +173,12 @@ static void prune_freeze_setup(PruneFreezeParams *params,
 							   MultiXactId *new_relmin_mxid,
 							   PruneFreezeResult *presult,
 							   PruneState *prstate);
-static void prune_freeze_plan(Oid reloid, Buffer buffer,
-							  PruneState *prstate,
+static void prune_freeze_plan(PruneState *prstate,
 							  OffsetNumber *off_loc);
 static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
-											   HeapTuple tup,
-											   Buffer buffer);
+											   HeapTuple tup);
 static inline HTSV_Result htsv_get_valid_status(int status);
-static void heap_prune_chain(Page page, BlockNumber blockno, OffsetNumber maxoff,
+static void heap_prune_chain(OffsetNumber maxoff,
 							 OffsetNumber rootoffnum, PruneState *prstate);
 static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid);
 static void heap_prune_record_redirect(PruneState *prstate,
@@ -181,15 +190,14 @@ static void heap_prune_record_dead_or_unused(PruneState *prstate, OffsetNumber o
 											 bool was_normal);
 static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum, bool was_normal);
 
-static void heap_prune_record_unchanged_lp_unused(Page page, PruneState *prstate, OffsetNumber offnum);
-static void heap_prune_record_unchanged_lp_normal(Page page, PruneState *prstate, OffsetNumber offnum);
-static void heap_prune_record_unchanged_lp_dead(Page page, PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_record_unchanged_lp_unused(PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum);
 static void heap_prune_record_unchanged_lp_redirect(PruneState *prstate, OffsetNumber offnum);
 
 static void page_verify_redirects(Page page);
 
-static bool heap_page_will_freeze(Relation relation, Buffer buffer,
-								  bool did_tuple_hint_fpi, bool do_prune, bool do_hint_prune,
+static bool heap_page_will_freeze(bool did_tuple_hint_fpi, bool do_prune, bool do_hint_prune,
 								  PruneState *prstate);
 
 
@@ -342,6 +350,10 @@ prune_freeze_setup(PruneFreezeParams *params,
 	Assert(!(params->options & HEAP_PAGE_PRUNE_FREEZE) || params->cutoffs);
 	prstate->attempt_freeze = (params->options & HEAP_PAGE_PRUNE_FREEZE) != 0;
 	prstate->cutoffs = params->cutoffs;
+	prstate->relation = params->relation;
+	prstate->block = BufferGetBlockNumber(params->buffer);
+	prstate->buffer = params->buffer;
+	prstate->page = BufferGetPage(params->buffer);
 
 	/*
 	 * Our strategy is to scan the page and make lists of items to change,
@@ -455,16 +467,15 @@ prune_freeze_setup(PruneFreezeParams *params,
  * *off_loc is used for error callback and cleared before returning.
  */
 static void
-prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
-				  OffsetNumber *off_loc)
+prune_freeze_plan(PruneState *prstate, OffsetNumber *off_loc)
 {
-	Page		page = BufferGetPage(buffer);
-	BlockNumber blockno = BufferGetBlockNumber(buffer);
-	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	Page		page = prstate->page;
+	BlockNumber blockno = prstate->block;
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(prstate->page);
 	OffsetNumber offnum;
 	HeapTupleData tup;
 
-	tup.t_tableOid = reloid;
+	tup.t_tableOid = RelationGetRelid(prstate->relation);
 
 	/*
 	 * Determine HTSV for all tuples, and queue them up for processing as HOT
@@ -505,7 +516,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 		/* Nothing to do if slot doesn't contain a tuple */
 		if (!ItemIdIsUsed(itemid))
 		{
-			heap_prune_record_unchanged_lp_unused(page, prstate, offnum);
+			heap_prune_record_unchanged_lp_unused(prstate, offnum);
 			continue;
 		}
 
@@ -518,7 +529,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 			if (unlikely(prstate->mark_unused_now))
 				heap_prune_record_unused(prstate, offnum, false);
 			else
-				heap_prune_record_unchanged_lp_dead(page, prstate, offnum);
+				heap_prune_record_unchanged_lp_dead(prstate, offnum);
 			continue;
 		}
 
@@ -539,8 +550,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 		tup.t_len = ItemIdGetLength(itemid);
 		ItemPointerSet(&tup.t_self, blockno, offnum);
 
-		prstate->htsv[offnum] = heap_prune_satisfies_vacuum(prstate, &tup,
-															buffer);
+		prstate->htsv[offnum] = heap_prune_satisfies_vacuum(prstate, &tup);
 
 		if (!HeapTupleHeaderIsHeapOnly(htup))
 			prstate->root_items[prstate->nroot_items++] = offnum;
@@ -571,7 +581,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 		*off_loc = offnum;
 
 		/* Process this item or chain of items */
-		heap_prune_chain(page, blockno, maxoff, offnum, prstate);
+		heap_prune_chain(maxoff, offnum, prstate);
 	}
 
 	/*
@@ -627,7 +637,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 			}
 		}
 		else
-			heap_prune_record_unchanged_lp_normal(page, prstate, offnum);
+			heap_prune_record_unchanged_lp_normal(prstate, offnum);
 	}
 
 	/* We should now have processed every tuple exactly once  */
@@ -648,7 +658,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
 
 /*
  * Decide whether to proceed with freezing according to the freeze plans
- * prepared for the given heap buffer. If freezing is chosen, this function
+ * prepared for the current heap buffer. If freezing is chosen, this function
  * performs several pre-freeze checks.
  *
  * The values of do_prune, do_hint_prune, and did_tuple_hint_fpi must be
@@ -660,8 +670,7 @@ prune_freeze_plan(Oid reloid, Buffer buffer, PruneState *prstate,
  * page, and false otherwise.
  */
 static bool
-heap_page_will_freeze(Relation relation, Buffer buffer,
-					  bool did_tuple_hint_fpi,
+heap_page_will_freeze(bool did_tuple_hint_fpi,
 					  bool do_prune,
 					  bool do_hint_prune,
 					  PruneState *prstate)
@@ -709,18 +718,19 @@ heap_page_will_freeze(Relation relation, Buffer buffer,
 			 * Freezing would make the page all-frozen.  Have already emitted
 			 * an FPI or will do so anyway?
 			 */
-			if (RelationNeedsWAL(relation))
+			if (RelationNeedsWAL(prstate->relation))
 			{
 				if (did_tuple_hint_fpi)
 					do_freeze = true;
 				else if (do_prune)
 				{
-					if (XLogCheckBufferNeedsBackup(buffer))
+					if (XLogCheckBufferNeedsBackup(prstate->buffer))
 						do_freeze = true;
 				}
 				else if (do_hint_prune)
 				{
-					if (XLogHintBitIsNeeded() && XLogCheckBufferNeedsBackup(buffer))
+					if (XLogHintBitIsNeeded() &&
+						XLogCheckBufferNeedsBackup(prstate->buffer))
 						do_freeze = true;
 				}
 			}
@@ -733,7 +743,7 @@ heap_page_will_freeze(Relation relation, Buffer buffer,
 		 * Validate the tuples we will be freezing before entering the
 		 * critical section.
 		 */
-		heap_pre_freeze_checks(buffer, prstate->frozen, prstate->nfrozen);
+		heap_pre_freeze_checks(prstate->buffer, prstate->frozen, prstate->nfrozen);
 
 		/*
 		 * Calculate what the snapshot conflict horizon should be for a record
@@ -822,8 +832,6 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 						   TransactionId *new_relfrozen_xid,
 						   MultiXactId *new_relmin_mxid)
 {
-	Buffer		buffer = params->buffer;
-	Page		page = BufferGetPage(buffer);
 	PruneState	prstate;
 	bool		do_freeze;
 	bool		do_prune;
@@ -842,8 +850,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 * Prepare queue of state changes to later be executed in a critical
 	 * section.
 	 */
-	prune_freeze_plan(RelationGetRelid(params->relation),
-					  buffer, &prstate, off_loc);
+	prune_freeze_plan(&prstate, off_loc);
 
 	/*
 	 * If checksums are enabled, calling heap_prune_satisfies_vacuum() while
@@ -861,15 +868,14 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 * pd_prune_xid field or the page was marked full, we will update the hint
 	 * bit.
 	 */
-	do_hint_prune = ((PageHeader) page)->pd_prune_xid != prstate.new_prune_xid ||
-		PageIsFull(page);
+	do_hint_prune = ((PageHeader) prstate.page)->pd_prune_xid != prstate.new_prune_xid ||
+		PageIsFull(prstate.page);
 
 	/*
 	 * Decide if we want to go ahead with freezing according to the freeze
 	 * plans we prepared, or not.
 	 */
-	do_freeze = heap_page_will_freeze(params->relation, buffer,
-									  did_tuple_hint_fpi,
+	do_freeze = heap_page_will_freeze(did_tuple_hint_fpi,
 									  do_prune,
 									  do_hint_prune,
 									  &prstate);
@@ -901,14 +907,14 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		 * Update the page's pd_prune_xid field to either zero, or the lowest
 		 * XID of any soon-prunable tuple.
 		 */
-		((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
+		((PageHeader) prstate.page)->pd_prune_xid = prstate.new_prune_xid;
 
 		/*
 		 * Also clear the "page is full" flag, since there's no point in
 		 * repeating the prune/defrag process until something else happens to
 		 * the page.
 		 */
-		PageClearFull(page);
+		PageClearFull(prstate.page);
 
 		/*
 		 * If that's all we had to do to the page, this is a non-WAL-logged
@@ -916,7 +922,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		 * the buffer dirty below.
 		 */
 		if (!do_freeze && !do_prune)
-			MarkBufferDirtyHint(buffer, true);
+			MarkBufferDirtyHint(prstate.buffer, true);
 	}
 
 	if (do_prune || do_freeze)
@@ -924,21 +930,21 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		/* Apply the planned item changes and repair page fragmentation. */
 		if (do_prune)
 		{
-			heap_page_prune_execute(buffer, false,
+			heap_page_prune_execute(prstate.buffer, false,
 									prstate.redirected, prstate.nredirected,
 									prstate.nowdead, prstate.ndead,
 									prstate.nowunused, prstate.nunused);
 		}
 
 		if (do_freeze)
-			heap_freeze_prepared_tuples(buffer, prstate.frozen, prstate.nfrozen);
+			heap_freeze_prepared_tuples(prstate.buffer, prstate.frozen, prstate.nfrozen);
 
-		MarkBufferDirty(buffer);
+		MarkBufferDirty(prstate.buffer);
 
 		/*
 		 * Emit a WAL XLOG_HEAP2_PRUNE* record showing what we did
 		 */
-		if (RelationNeedsWAL(params->relation))
+		if (RelationNeedsWAL(prstate.relation))
 		{
 			/*
 			 * The snapshotConflictHorizon for the whole record should be the
@@ -958,7 +964,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 			else
 				conflict_xid = prstate.latest_xid_removed;
 
-			log_heap_prune_and_freeze(params->relation, buffer,
+			log_heap_prune_and_freeze(prstate.relation, prstate.buffer,
 									  InvalidBuffer,	/* vmbuffer */
 									  0,	/* vmflags */
 									  conflict_xid,
@@ -1018,12 +1024,12 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
  * Perform visibility checks for heap pruning.
  */
 static HTSV_Result
-heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
+heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup)
 {
 	HTSV_Result res;
 	TransactionId dead_after;
 
-	res = HeapTupleSatisfiesVacuumHorizon(tup, buffer, &dead_after);
+	res = HeapTupleSatisfiesVacuumHorizon(tup, prstate->buffer, &dead_after);
 
 	if (res != HEAPTUPLE_RECENTLY_DEAD)
 		return res;
@@ -1100,13 +1106,14 @@ htsv_get_valid_status(int status)
  * based on that outcome.
  */
 static void
-heap_prune_chain(Page page, BlockNumber blockno, OffsetNumber maxoff,
-				 OffsetNumber rootoffnum, PruneState *prstate)
+heap_prune_chain(OffsetNumber maxoff, OffsetNumber rootoffnum,
+				 PruneState *prstate)
 {
 	TransactionId priorXmax = InvalidTransactionId;
 	ItemId		rootlp;
 	OffsetNumber offnum;
 	OffsetNumber chainitems[MaxHeapTuplesPerPage];
+	Page		page = prstate->page;
 
 	/*
 	 * After traversing the HOT chain, ndeadchain is the index in chainitems
@@ -1235,7 +1242,7 @@ heap_prune_chain(Page page, BlockNumber blockno, OffsetNumber maxoff,
 		/*
 		 * Advance to next chain member.
 		 */
-		Assert(ItemPointerGetBlockNumber(&htup->t_ctid) == blockno);
+		Assert(ItemPointerGetBlockNumber(&htup->t_ctid) == prstate->block);
 		offnum = ItemPointerGetOffsetNumber(&htup->t_ctid);
 		priorXmax = HeapTupleHeaderGetUpdateXid(htup);
 	}
@@ -1270,7 +1277,7 @@ process_chain:
 			i++;
 		}
 		for (; i < nchain; i++)
-			heap_prune_record_unchanged_lp_normal(page, prstate, chainitems[i]);
+			heap_prune_record_unchanged_lp_normal(prstate, chainitems[i]);
 	}
 	else if (ndeadchain == nchain)
 	{
@@ -1296,7 +1303,7 @@ process_chain:
 
 		/* the rest of tuples in the chain are normal, unchanged tuples */
 		for (int i = ndeadchain; i < nchain; i++)
-			heap_prune_record_unchanged_lp_normal(page, prstate, chainitems[i]);
+			heap_prune_record_unchanged_lp_normal(prstate, chainitems[i]);
 	}
 }
 
@@ -1421,7 +1428,7 @@ heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum, bool was_norm
  * Record an unused line pointer that is left unchanged.
  */
 static void
-heap_prune_record_unchanged_lp_unused(Page page, PruneState *prstate, OffsetNumber offnum)
+heap_prune_record_unchanged_lp_unused(PruneState *prstate, OffsetNumber offnum)
 {
 	Assert(!prstate->processed[offnum]);
 	prstate->processed[offnum] = true;
@@ -1432,9 +1439,10 @@ heap_prune_record_unchanged_lp_unused(Page page, PruneState *prstate, OffsetNumb
  * update bookkeeping of tuple counts and page visibility.
  */
 static void
-heap_prune_record_unchanged_lp_normal(Page page, PruneState *prstate, OffsetNumber offnum)
+heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 {
 	HeapTupleHeader htup;
+	Page		page = prstate->page;
 
 	Assert(!prstate->processed[offnum]);
 	prstate->processed[offnum] = true;
@@ -1615,7 +1623,7 @@ heap_prune_record_unchanged_lp_normal(Page page, PruneState *prstate, OffsetNumb
  * Record line pointer that was already LP_DEAD and is left unchanged.
  */
 static void
-heap_prune_record_unchanged_lp_dead(Page page, PruneState *prstate, OffsetNumber offnum)
+heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum)
 {
 	Assert(!prstate->processed[offnum]);
 	prstate->processed[offnum] = true;
-- 
2.43.0



  [text/x-patch] v35-0002-Add-PageGetPruneXid-helper.patch (1.9K, 3-v35-0002-Add-PageGetPruneXid-helper.patch)
From aad49496321243eaab94d288da021c537b96f652 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 25 Feb 2026 14:09:11 -0500
Subject: [PATCH v35 02/18] Add PageGetPruneXid helper

This is in line with the other page header accessors. It improves
readability and avoids long lines.
---
 src/backend/access/heap/pruneheap.c | 4 ++--
 src/include/storage/bufpage.h       | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 3c5d33834fc..1d61b336193 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -234,7 +234,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 	 * determining the appropriate horizon is a waste if there's no prune_xid
 	 * (i.e. no updates/deletes left potentially dead tuples around).
 	 */
-	prune_xid = ((PageHeader) page)->pd_prune_xid;
+	prune_xid = PageGetPruneXid(page);
 	if (!TransactionIdIsValid(prune_xid))
 		return;
 
@@ -868,7 +868,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 * pd_prune_xid field or the page was marked full, we will update the hint
 	 * bit.
 	 */
-	do_hint_prune = ((PageHeader) prstate.page)->pd_prune_xid != prstate.new_prune_xid ||
+	do_hint_prune = PageGetPruneXid(prstate.page) != prstate.new_prune_xid ||
 		PageIsFull(prstate.page);
 
 	/*
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index ae3725b3b81..92a6bb9b0c0 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -441,6 +441,12 @@ PageClearAllVisible(Page page)
 	((PageHeader) page)->pd_flags &= ~PD_ALL_VISIBLE;
 }
 
+static inline TransactionId
+PageGetPruneXid(const PageData *page)
+{
+	return ((const PageHeaderData *) page)->pd_prune_xid;
+}
+
 /*
  * These two require "access/transam.h", so left as macros.
  */
-- 
2.43.0



  [text/x-patch] v35-0003-Rename-PruneState-all_visible-all_frozen.patch (13.7K, 4-v35-0003-Rename-PruneState-all_visible-all_frozen.patch)
From 7038ae8d57ff2d5f63c2a306e34703a4b54c047a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Sun, 1 Mar 2026 15:59:04 -0500
Subject: [PATCH v35 03/18] Rename PruneState->all_visible/all_frozen

Rename all_visible and all_frozen to set_all_visible and set_all_frozen to
clarify that they hold the proposed state of the all-visible and
all-frozen bits for a heap page in the visibility map, not the current
state.

Author: Melanie Plageman <[email protected]>
Suggested-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk
---
 src/backend/access/heap/pruneheap.c | 144 ++++++++++++++--------------
 1 file changed, 74 insertions(+), 70 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 1d61b336193..fa5aa2a63f2 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -148,22 +148,24 @@ typedef struct
 	OffsetNumber *deadoffsets;	/* points directly to presult->deadoffsets */
 
 	/*
-	 * all_visible and all_frozen indicate if the all-visible and all-frozen
-	 * bits in the visibility map can be set for this page after pruning.
+	 * set_all_visible and set_all_frozen indicate if the all-visible and
+	 * all-frozen bits in the visibility map can be set for this page after
+	 * pruning.
 	 *
 	 * visibility_cutoff_xid is the newest xmin of live tuples on the page.
 	 * The caller can use it as the conflict horizon, when setting the VM
-	 * bits.  It is only valid if we froze some tuples, and all_frozen is
+	 * bits.  It is only valid if we froze some tuples, and set_all_frozen is
 	 * true.
 	 *
-	 * NOTE: all_visible and all_frozen initially don't include LP_DEAD items.
-	 * That's convenient for heap_page_prune_and_freeze() to use them to
-	 * decide whether to freeze the page or not.  The all_visible and
-	 * all_frozen values returned to the caller are adjusted to include
-	 * LP_DEAD items after we determine whether to opportunistically freeze.
+	 * NOTE: set_all_visible and set_all_frozen initially don't include
+	 * LP_DEAD items. That's convenient for heap_page_prune_and_freeze() to
+	 * use them to decide whether to freeze the page or not.  The
+	 * set_all_visible and set_all_frozen values returned to the caller are
+	 * adjusted to include LP_DEAD items after we determine whether to
+	 * opportunistically freeze.
 	 */
-	bool		all_visible;
-	bool		all_frozen;
+	bool		set_all_visible;
+	bool		set_all_frozen;
 	TransactionId visibility_cutoff_xid;
 } PruneState;
 
@@ -419,22 +421,22 @@ prune_freeze_setup(PruneFreezeParams *params,
 	 * setting the VM bits.
 	 *
 	 * In addition to telling the caller whether it can set the VM bit, we
-	 * also use 'all_visible' and 'all_frozen' for our own decision-making. If
-	 * the whole page would become frozen, we consider opportunistically
-	 * freezing tuples.  We will not be able to freeze the whole page if there
-	 * are tuples present that are not visible to everyone or if there are
-	 * dead tuples which are not yet removable.  However, dead tuples which
-	 * will be removed by the end of vacuuming should not preclude us from
-	 * opportunistically freezing.  Because of that, we do not immediately
-	 * clear all_visible and all_frozen when we see LP_DEAD items.  We fix
-	 * that after scanning the line pointers. We must correct all_visible and
-	 * all_frozen before we return them to the caller, so that the caller
-	 * doesn't set the VM bits incorrectly.
+	 * also use 'set_all_visible' and 'set_all_frozen' for our own
+	 * decision-making. If the whole page would become frozen, we consider
+	 * opportunistically freezing tuples.  We will not be able to freeze the
+	 * whole page if there are tuples present that are not visible to everyone
+	 * or if there are dead tuples which are not yet removable.  However, dead
+	 * tuples which will be removed by the end of vacuuming should not
+	 * preclude us from opportunistically freezing.  Because of that, we do
+	 * not immediately clear set_all_visible and set_all_frozen when we see
+	 * LP_DEAD items.  We fix that after scanning the line pointers. We must
+	 * correct set_all_visible and set_all_frozen before we return them to the
+	 * caller, so that the caller doesn't set the VM bits incorrectly.
 	 */
 	if (prstate->attempt_freeze)
 	{
-		prstate->all_visible = true;
-		prstate->all_frozen = true;
+		prstate->set_all_visible = true;
+		prstate->set_all_frozen = true;
 	}
 	else
 	{
@@ -442,8 +444,8 @@ prune_freeze_setup(PruneFreezeParams *params,
 		 * Initializing to false allows skipping the work to update them in
 		 * heap_prune_record_unchanged_lp_normal().
 		 */
-		prstate->all_visible = false;
-		prstate->all_frozen = false;
+		prstate->set_all_visible = false;
+		prstate->set_all_frozen = false;
 	}
 
 	/*
@@ -683,8 +685,8 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 	 */
 	if (!prstate->attempt_freeze)
 	{
-		Assert(!prstate->all_frozen && prstate->nfrozen == 0);
-		Assert(prstate->lpdead_items == 0 || !prstate->all_visible);
+		Assert(!prstate->set_all_frozen && prstate->nfrozen == 0);
+		Assert(prstate->lpdead_items == 0 || !prstate->set_all_visible);
 		return false;
 	}
 
@@ -710,9 +712,9 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 		 * anymore.  The opportunistic freeze heuristic must be improved;
 		 * however, for now, try to approximate the old logic.
 		 */
-		if (prstate->all_frozen && prstate->nfrozen > 0)
+		if (prstate->set_all_frozen && prstate->nfrozen > 0)
 		{
-			Assert(prstate->all_visible);
+			Assert(prstate->set_all_visible);
 
 			/*
 			 * Freezing would make the page all-frozen.  Have already emitted
@@ -752,7 +754,7 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 		 * in the VM once we're done with it. Otherwise, we generate a
 		 * conservative cutoff by stepping back from OldestXmin.
 		 */
-		if (prstate->all_frozen)
+		if (prstate->set_all_frozen)
 			prstate->frz_conflict_horizon = prstate->visibility_cutoff_xid;
 		else
 		{
@@ -769,7 +771,7 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 		 */
 		Assert(!prstate->pagefrz.freeze_required);
 
-		prstate->all_frozen = false;
+		prstate->set_all_frozen = false;
 		prstate->nfrozen = 0;	/* avoid miscounts in instrumentation */
 	}
 	else
@@ -804,11 +806,12 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
  * if it's considered advantageous for overall system performance to do so
  * now.  The 'params.cutoffs', 'presult', 'new_relfrozen_xid' and
  * 'new_relmin_mxid' arguments are required when freezing.  When
- * HEAP_PAGE_PRUNE_FREEZE option is passed, we also set presult->all_visible
- * and presult->all_frozen after determining whether or not to
- * opportunistically freeze, to indicate if the VM bits can be set.  They are
- * always set to false when the HEAP_PAGE_PRUNE_FREEZE option is not passed,
- * because at the moment only callers that also freeze need that information.
+ * HEAP_PAGE_PRUNE_FREEZE option is passed, we also set
+ * presult->all_visible and presult->all_frozen after determining
+ * whether or not to opportunistically freeze, to indicate if the VM bits can
+ * be set.  They are always set to false when the HEAP_PAGE_PRUNE_FREEZE
+ * option is not passed, because at the moment only callers that also freeze
+ * need that information.
  *
  * presult contains output parameters needed by callers, such as the number of
  * tuples removed and the offsets of dead items on the page after pruning.
@@ -882,21 +885,21 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 
 	/*
 	 * While scanning the line pointers, we did not clear
-	 * all_visible/all_frozen when encountering LP_DEAD items because we
-	 * wanted the decision whether or not to freeze the page to be unaffected
-	 * by the short-term presence of LP_DEAD items.  These LP_DEAD items are
-	 * effectively assumed to be LP_UNUSED items in the making.  It doesn't
-	 * matter which vacuum heap pass (initial pass or final pass) ends up
-	 * setting the page all-frozen, as long as the ongoing VACUUM does it.
+	 * set_all_visible/set_all_frozen when encountering LP_DEAD items because
+	 * we wanted the decision whether or not to freeze the page to be
+	 * unaffected by the short-term presence of LP_DEAD items.  These LP_DEAD
+	 * items are effectively assumed to be LP_UNUSED items in the making.  It
+	 * doesn't matter which vacuum heap pass (initial pass or final pass) ends
+	 * up setting the page all-frozen, as long as the ongoing VACUUM does it.
 	 *
 	 * Now that we finished determining whether or not to freeze the page,
-	 * update all_visible and all_frozen so that they reflect the true state
-	 * of the page for setting PD_ALL_VISIBLE and VM bits.
+	 * update set_all_visible and set_all_frozen so that they reflect the true
+	 * state of the page for setting PD_ALL_VISIBLE and VM bits.
 	 */
 	if (prstate.lpdead_items > 0)
-		prstate.all_visible = prstate.all_frozen = false;
+		prstate.set_all_visible = prstate.set_all_frozen = false;
 
-	Assert(!prstate.all_frozen || prstate.all_visible);
+	Assert(!prstate.set_all_frozen || prstate.set_all_visible);
 
 	/* Any error while applying the changes is critical */
 	START_CRIT_SECTION();
@@ -984,8 +987,8 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	presult->nfrozen = prstate.nfrozen;
 	presult->live_tuples = prstate.live_tuples;
 	presult->recently_dead_tuples = prstate.recently_dead_tuples;
-	presult->all_visible = prstate.all_visible;
-	presult->all_frozen = prstate.all_frozen;
+	presult->all_visible = prstate.set_all_visible;
+	presult->all_frozen = prstate.set_all_frozen;
 	presult->hastup = prstate.hastup;
 
 	/*
@@ -1365,9 +1368,9 @@ heap_prune_record_dead(PruneState *prstate, OffsetNumber offnum,
 	prstate->ndead++;
 
 	/*
-	 * Deliberately delay unsetting all_visible and all_frozen until later
-	 * during pruning. Removable dead tuples shouldn't preclude freezing the
-	 * page.
+	 * Deliberately delay unsetting set_all_visible and set_all_frozen until
+	 * later during pruning. Removable dead tuples shouldn't preclude freezing
+	 * the page.
 	 */
 
 	/* Record the dead offset for vacuum */
@@ -1489,14 +1492,14 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * See SetHintBits for more info.  Check that the tuple is hinted
 			 * xmin-committed because of that.
 			 */
-			if (prstate->all_visible)
+			if (prstate->set_all_visible)
 			{
 				TransactionId xmin;
 
 				if (!HeapTupleHeaderXminCommitted(htup))
 				{
-					prstate->all_visible = false;
-					prstate->all_frozen = false;
+					prstate->set_all_visible = false;
+					prstate->set_all_frozen = false;
 					break;
 				}
 
@@ -1511,15 +1514,16 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 
 				/*
 				 * For now always use prstate->cutoffs for this test, because
-				 * we only update 'all_visible' and 'all_frozen' when freezing
-				 * is requested. We could use GlobalVisTestIsRemovableXid
-				 * instead, if a non-freezing caller wanted to set the VM bit.
+				 * we only update 'set_all_visible' and 'set_all_frozen' when
+				 * freezing is requested. We could use
+				 * GlobalVisTestIsRemovableXid instead, if a non-freezing
+				 * caller wanted to set the VM bit.
 				 */
 				Assert(prstate->cutoffs);
 				if (!TransactionIdPrecedes(xmin, prstate->cutoffs->OldestXmin))
 				{
-					prstate->all_visible = false;
-					prstate->all_frozen = false;
+					prstate->set_all_visible = false;
+					prstate->set_all_frozen = false;
 					break;
 				}
 
@@ -1532,8 +1536,8 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 
 		case HEAPTUPLE_RECENTLY_DEAD:
 			prstate->recently_dead_tuples++;
-			prstate->all_visible = false;
-			prstate->all_frozen = false;
+			prstate->set_all_visible = false;
+			prstate->set_all_frozen = false;
 
 			/*
 			 * This tuple will soon become DEAD.  Update the hint field so
@@ -1552,8 +1556,8 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * assumption is a bit shaky, but it is what acquire_sample_rows()
 			 * does, so be consistent.
 			 */
-			prstate->all_visible = false;
-			prstate->all_frozen = false;
+			prstate->set_all_visible = false;
+			prstate->set_all_frozen = false;
 
 			/*
 			 * If we wanted to optimize for aborts, we might consider marking
@@ -1571,8 +1575,8 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * will commit and update the counters after we report.
 			 */
 			prstate->live_tuples++;
-			prstate->all_visible = false;
-			prstate->all_frozen = false;
+			prstate->set_all_visible = false;
+			prstate->set_all_frozen = false;
 
 			/*
 			 * This tuple may soon become DEAD.  Update the hint field so that
@@ -1614,7 +1618,7 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 		 * definitely cannot be set all-frozen in the visibility map later on.
 		 */
 		if (!totally_frozen)
-			prstate->all_frozen = false;
+			prstate->set_all_frozen = false;
 	}
 }
 
@@ -1637,10 +1641,10 @@ heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum)
 	 * hastup/nonempty_pages as provisional no matter how LP_DEAD items are
 	 * handled (handled here, or handled later on).
 	 *
-	 * Similarly, don't unset all_visible and all_frozen until later, at the
-	 * end of heap_page_prune_and_freeze().  This will allow us to attempt to
-	 * freeze the page after pruning.  As long as we unset it before updating
-	 * the visibility map, this will be correct.
+	 * Similarly, don't unset set_all_visible and set_all_frozen until later,
+	 * at the end of heap_page_prune_and_freeze().  This will allow us to
+	 * attempt to freeze the page after pruning.  As long as we unset it
+	 * before updating the visibility map, this will be correct.
 	 */
 
 	/* Record the dead offset for vacuum */
-- 
2.43.0



  [text/x-patch] v35-0004-Use-the-newest-to-be-frozen-xid-as-the-conflict-.patch (8.0K, 5-v35-0004-Use-the-newest-to-be-frozen-xid-as-the-conflict-.patch)
From a3b91ab430e7af8b459c169181c1dc3f0f04c8bf Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 25 Feb 2026 13:55:45 -0500
Subject: [PATCH v35 04/18] Use the newest to-be-frozen xid as the conflict
 horizon for freezing

Previously, WAL records that froze tuples used OldestXmin (stepped back by
one XID) as the snapshot conflict horizon, unless the whole page was
eligible to become all-frozen, in which case they used
visibility_cutoff_xid. However, OldestXmin is newer than the newest frozen
tuple's xid. By tracking the newest to-be-frozen xid and using it as the
snapshot conflict horizon instead, we end up with an older horizon that
will result in fewer query cancellations on the standby.
---
 src/backend/access/heap/heapam.c    | 16 +++++++++++
 src/backend/access/heap/pruneheap.c | 44 ++++++++---------------------
 src/include/access/heapam.h         |  8 ++++++
 3 files changed, 36 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a231563f0df..76f94fdfa5b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6781,6 +6781,10 @@ heap_inplace_unlock(Relation relation,
  * NB: Caller should avoid needlessly calling heap_tuple_should_freeze when we
  * have already forced page-level freezing, since that might incur the same
  * SLRU buffer misses that we specifically intended to avoid by freezing.
+ *
+ * We won't update the FreezePageConflictXid because any lockers don't affect
+ * visibility on the standby, and we don't have to worry about the update XID
+ * because the only way it can be older than OldestXmin is if it aborted.
  */
 static TransactionId
 FreezeMultiXactId(MultiXactId multi, uint16 t_infomask,
@@ -7173,7 +7177,11 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 
 		/* Verify that xmin committed if and when freeze plan is executed */
 		if (freeze_xmin)
+		{
 			frz->checkflags |= HEAP_FREEZE_CHECK_XMIN_COMMITTED;
+			if (TransactionIdFollows(xid, pagefrz->FreezePageConflictXid))
+				pagefrz->FreezePageConflictXid = xid;
+		}
 	}
 
 	/*
@@ -7192,6 +7200,9 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 		 */
 		replace_xvac = pagefrz->freeze_required = true;
 
+		if (TransactionIdFollows(xid, pagefrz->FreezePageConflictXid))
+			pagefrz->FreezePageConflictXid = xid;
+
 		/* Will set replace_xvac flags in freeze plan below */
 	}
 
@@ -7316,7 +7327,11 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 		 * independent of this, since the lock is released at xact end.)
 		 */
 		if (freeze_xmax && !HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
+		{
 			frz->checkflags |= HEAP_FREEZE_CHECK_XMAX_ABORTED;
+			if (TransactionIdFollows(xid, pagefrz->FreezePageConflictXid))
+				pagefrz->FreezePageConflictXid = xid;
+		}
 	}
 	else if (!TransactionIdIsValid(xid))
 	{
@@ -7499,6 +7514,7 @@ heap_freeze_tuple(HeapTupleHeader tuple,
 	cutoffs.MultiXactCutoff = MultiXactCutoff;
 
 	pagefrz.freeze_required = true;
+	pagefrz.FreezePageConflictXid = InvalidTransactionId;
 	pagefrz.FreezePageRelfrozenXid = FreezeLimit;
 	pagefrz.FreezePageRelminMxid = MultiXactCutoff;
 	pagefrz.NoFreezePageRelfrozenXid = FreezeLimit;
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index fa5aa2a63f2..07868dbcc17 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -114,13 +114,6 @@ typedef struct
 	 */
 	HeapPageFreeze pagefrz;
 
-	/*
-	 * The snapshot conflict horizon used when freezing tuples. The final
-	 * snapshot conflict horizon for the record may be newer if pruning
-	 * removes newer transaction IDs.
-	 */
-	TransactionId frz_conflict_horizon;
-
 	/*-------------------------------------------------------
 	 * Information about what was done
 	 *
@@ -377,6 +370,7 @@ prune_freeze_setup(PruneFreezeParams *params,
 
 	/* initialize page freezing working state */
 	prstate->pagefrz.freeze_required = false;
+	prstate->pagefrz.FreezePageConflictXid = InvalidTransactionId;
 	if (prstate->attempt_freeze)
 	{
 		Assert(new_relfrozen_xid && new_relmin_mxid);
@@ -407,7 +401,6 @@ prune_freeze_setup(PruneFreezeParams *params,
 	 * PruneState.
 	 */
 	prstate->deadoffsets = presult->deadoffsets;
-	prstate->frz_conflict_horizon = InvalidTransactionId;
 
 	/*
 	 * Vacuum may update the VM after we're done.  We can keep track of
@@ -746,22 +739,8 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 		 * critical section.
 		 */
 		heap_pre_freeze_checks(prstate->buffer, prstate->frozen, prstate->nfrozen);
-
-		/*
-		 * Calculate what the snapshot conflict horizon should be for a record
-		 * freezing tuples. We can use the visibility_cutoff_xid as our cutoff
-		 * for conflicts when the whole page is eligible to become all-frozen
-		 * in the VM once we're done with it. Otherwise, we generate a
-		 * conservative cutoff by stepping back from OldestXmin.
-		 */
-		if (prstate->set_all_frozen)
-			prstate->frz_conflict_horizon = prstate->visibility_cutoff_xid;
-		else
-		{
-			/* Avoids false conflicts when hot_standby_feedback in use */
-			prstate->frz_conflict_horizon = prstate->cutoffs->OldestXmin;
-			TransactionIdRetreat(prstate->frz_conflict_horizon);
-		}
+		Assert(TransactionIdPrecedesOrEquals(prstate->pagefrz.FreezePageConflictXid,
+											 prstate->cutoffs->OldestXmin));
 	}
 	else if (prstate->nfrozen > 0)
 	{
@@ -886,11 +865,12 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	/*
 	 * While scanning the line pointers, we did not clear
 	 * set_all_visible/set_all_frozen when encountering LP_DEAD items because
-	 * we wanted the decision whether or not to freeze the page to be
-	 * unaffected by the short-term presence of LP_DEAD items.  These LP_DEAD
-	 * items are effectively assumed to be LP_UNUSED items in the making.  It
-	 * doesn't matter which vacuum heap pass (initial pass or final pass) ends
-	 * up setting the page all-frozen, as long as the ongoing VACUUM does it.
+	 * we wanted the decision whether or not to opportunistically freeze the
+	 * page to be unaffected by the short-term presence of LP_DEAD items.
+	 * These LP_DEAD items are effectively assumed to be LP_UNUSED items in
+	 * the making. It doesn't matter which vacuum heap pass (initial pass or
+	 * final pass) ends up setting the page all-frozen, as long as the ongoing
+	 * VACUUM does it.
 	 *
 	 * Now that we finished determining whether or not to freeze the page,
 	 * update set_all_visible and set_all_frozen so that they reflect the true
@@ -953,7 +933,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 			 * The snapshotConflictHorizon for the whole record should be the
 			 * most conservative of all the horizons calculated for any of the
 			 * possible modifications.  If this record will prune tuples, any
-			 * transactions on the standby older than the youngest xmax of the
+			 * transactions on the standby older than the youngest xid of the
 			 * most recently removed tuple this record will prune will
 			 * conflict.  If this record will freeze tuples, any transactions
 			 * on the standby with xids older than the youngest tuple this
@@ -961,9 +941,9 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 			 */
 			TransactionId conflict_xid;
 
-			if (TransactionIdFollows(prstate.frz_conflict_horizon,
+			if (TransactionIdFollows(prstate.pagefrz.FreezePageConflictXid,
 									 prstate.latest_xid_removed))
-				conflict_xid = prstate.frz_conflict_horizon;
+				conflict_xid = prstate.pagefrz.FreezePageConflictXid;
 			else
 				conflict_xid = prstate.latest_xid_removed;
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 3c0961ab36b..fae79b37f0d 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -208,6 +208,14 @@ typedef struct HeapPageFreeze
 	TransactionId FreezePageRelfrozenXid;
 	MultiXactId FreezePageRelminMxid;
 
+	/*
+	 * The youngest XID that will be frozen or removed during freezing. It is
+	 * used to calculate the snapshot conflict horizon for a WAL record
+	 * freezing tuples. Because it is only used if we do end up freezing
+	 * tuples, there is no need for a "no freeze" version.
+	 */
+	TransactionId FreezePageConflictXid;
+
 	/*
 	 * "No freeze" NewRelfrozenXid/NewRelminMxid trackers.
 	 *
-- 
2.43.0
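
For readers following along without a PostgreSQL tree at hand, here is a
minimal standalone sketch of the bookkeeping 0004 introduces: remember the
newest XID that will be frozen, then take the newer of that and the newest
removed XID as the record's conflict horizon. It is not the patch's code.
FreezeTracker, xid_follows() and the other function names are hypothetical
stand-ins; xid_follows() is assumed to behave like TransactionIdFollows()
(modulo-2^32 comparison for normal XIDs, plain comparison for permanent ones).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId		((TransactionId) 0)
#define FirstNormalTransactionId	((TransactionId) 3)

/* Hypothetical stand-in for TransactionIdFollows(): is a newer than b? */
static inline bool
xid_follows(TransactionId a, TransactionId b)
{
	/* Permanent (non-normal) XIDs compare as plain unsigned integers. */
	if (a < FirstNormalTransactionId || b < FirstNormalTransactionId)
		return a > b;

	/* Normal XIDs live on a circle, so compare the signed difference. */
	return (int32_t) (a - b) > 0;
}

/* Hypothetical tracker playing the role of FreezePageConflictXid. */
typedef struct FreezeTracker
{
	TransactionId conflict_xid; /* newest XID that will be frozen */
} FreezeTracker;

static inline void
tracker_init(FreezeTracker *t)
{
	t->conflict_xid = InvalidTransactionId;
}

/* Call once for each xmin/xmax/xvac that the freeze plan will replace. */
static inline void
tracker_note_frozen_xid(FreezeTracker *t, TransactionId xid)
{
	assert(xid >= FirstNormalTransactionId);
	if (xid_follows(xid, t->conflict_xid))
		t->conflict_xid = xid;
}

/* The record's horizon is the newer of the freeze and prune horizons. */
static inline TransactionId
record_conflict_horizon(const FreezeTracker *t,
						TransactionId latest_xid_removed)
{
	if (xid_follows(t->conflict_xid, latest_xid_removed))
		return t->conflict_xid;
	return latest_xid_removed;
}
```

Starting from InvalidTransactionId means a page on which nothing is frozen
contributes no freeze horizon at all, so the prune horizon wins by default.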



  [text/x-patch] v35-0005-Save-vmbuffer-in-heap-specific-scan-descriptors-.patch (5.9K, 6-v35-0005-Save-vmbuffer-in-heap-specific-scan-descriptors-.patch)
From 09b9cc477d8d9b689888566b9d4dced5eefea208 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Fri, 27 Feb 2026 16:23:57 -0500
Subject: [PATCH v35 05/18] Save vmbuffer in heap-specific scan descriptors for
 on-access pruning

Future commits will use the visibility map in on-access pruning to avoid
pruning when a page is all-visible, fix VM corruption, and set the VM if
the page is all-visible.

Saving the vmbuffer in the scan descriptor reduces the number of times
it would need to be pinned and unpinned, making the overhead of doing so
negligible.
---
 src/backend/access/heap/heapam.c         | 12 +++++++++++-
 src/backend/access/heap/heapam_handler.c | 12 ++++++++++--
 src/backend/access/heap/pruneheap.c      |  2 +-
 src/include/access/heapam.h              | 19 ++++++++++++++++---
 4 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 76f94fdfa5b..e19209f180d 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -633,7 +633,7 @@ heap_prepare_pagescan(TableScanDesc sscan)
 	/*
 	 * Prune and repair fragmentation for the whole page, if possible.
 	 */
-	heap_page_prune_opt(scan->rs_base.rs_rd, buffer);
+	heap_page_prune_opt(scan->rs_base.rs_rd, buffer, &scan->rs_vmbuffer);
 
 	/*
 	 * We must hold share lock on the buffer content while examining tuple
@@ -1310,6 +1310,7 @@ heap_beginscan(Relation relation, Snapshot snapshot,
 														  sizeof(TBMIterateResult));
 	}
 
+	scan->rs_vmbuffer = InvalidBuffer;
 
 	return (TableScanDesc) scan;
 }
@@ -1348,6 +1349,12 @@ heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
 		scan->rs_cbuf = InvalidBuffer;
 	}
 
+	if (BufferIsValid(scan->rs_vmbuffer))
+	{
+		ReleaseBuffer(scan->rs_vmbuffer);
+		scan->rs_vmbuffer = InvalidBuffer;
+	}
+
 	/*
 	 * SO_TYPE_BITMAPSCAN would be cleaned up here, but it does not hold any
 	 * additional data vs a normal HeapScan
@@ -1380,6 +1387,9 @@ heap_endscan(TableScanDesc sscan)
 	if (BufferIsValid(scan->rs_cbuf))
 		ReleaseBuffer(scan->rs_cbuf);
 
+	if (BufferIsValid(scan->rs_vmbuffer))
+		ReleaseBuffer(scan->rs_vmbuffer);
+
 	/*
 	 * Must free the read stream before freeing the BufferAccessStrategy.
 	 */
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3ff36f59bf8..47624194f93 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -85,6 +85,7 @@ heapam_index_fetch_begin(Relation rel)
 
 	hscan->xs_base.rel = rel;
 	hscan->xs_cbuf = InvalidBuffer;
+	hscan->xs_vmbuffer = InvalidBuffer;
 
 	return &hscan->xs_base;
 }
@@ -99,6 +100,12 @@ heapam_index_fetch_reset(IndexFetchTableData *scan)
 		ReleaseBuffer(hscan->xs_cbuf);
 		hscan->xs_cbuf = InvalidBuffer;
 	}
+
+	if (BufferIsValid(hscan->xs_vmbuffer))
+	{
+		ReleaseBuffer(hscan->xs_vmbuffer);
+		hscan->xs_vmbuffer = InvalidBuffer;
+	}
 }
 
 static void
@@ -138,7 +145,8 @@ heapam_index_fetch_tuple(struct IndexFetchTableData *scan,
 		 * Prune page, but only if we weren't already on this page
 		 */
 		if (prev_buf != hscan->xs_cbuf)
-			heap_page_prune_opt(hscan->xs_base.rel, hscan->xs_cbuf);
+			heap_page_prune_opt(hscan->xs_base.rel, hscan->xs_cbuf,
+								&hscan->xs_vmbuffer);
 	}
 
 	/* Obtain share-lock on the buffer so we can examine visibility */
@@ -2533,7 +2541,7 @@ BitmapHeapScanNextBlock(TableScanDesc scan,
 	/*
 	 * Prune and repair fragmentation for the whole page, if possible.
 	 */
-	heap_page_prune_opt(scan->rs_rd, buffer);
+	heap_page_prune_opt(scan->rs_rd, buffer, &hscan->rs_vmbuffer);
 
 	/*
 	 * We must hold share lock on the buffer content while examining tuple
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 07868dbcc17..5ce3e54a036 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -209,7 +209,7 @@ static bool heap_page_will_freeze(bool did_tuple_hint_fpi, bool do_prune, bool d
  * Caller must have pin on the buffer, and must *not* have a lock on it.
  */
 void
-heap_page_prune_opt(Relation relation, Buffer buffer)
+heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
 {
 	Page		page = BufferGetPage(buffer);
 	TransactionId prune_xid;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index fae79b37f0d..4e2e71be558 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -94,6 +94,12 @@ typedef struct HeapScanDescData
 	 */
 	ParallelBlockTableScanWorkerData *rs_parallelworkerdata;
 
+	/*
+	 * For sequential scans and bitmap heap scans. The current heap block's
+	 * corresponding page in the visibility map.
+	 */
+	Buffer		rs_vmbuffer;
+
 	/* these fields only used in page-at-a-time mode and for bitmap scans */
 	uint32		rs_cindex;		/* current tuple's index in vistuples */
 	uint32		rs_ntuples;		/* number of visible tuples on page */
@@ -116,8 +122,14 @@ typedef struct IndexFetchHeapData
 {
 	IndexFetchTableData xs_base;	/* AM independent part of the descriptor */
 
-	Buffer		xs_cbuf;		/* current heap buffer in scan, if any */
-	/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
+	/*
+	 * Current heap buffer in scan, if any. NB: if xs_cbuf is not
+	 * InvalidBuffer, we hold a pin on that buffer.
+	 */
+	Buffer		xs_cbuf;
+
+	/* Current heap block's corresponding page in the visibility map */
+	Buffer		xs_vmbuffer;
 } IndexFetchHeapData;
 
 /* Result codes for HeapTupleSatisfiesVacuum */
@@ -417,7 +429,8 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
 											  TM_IndexDeleteOp *delstate);
 
 /* in heap/pruneheap.c */
-extern void heap_page_prune_opt(Relation relation, Buffer buffer);
+extern void heap_page_prune_opt(Relation relation, Buffer buffer,
+								Buffer *vmbuffer);
 extern void heap_page_prune_and_freeze(PruneFreezeParams *params,
 									   PruneFreezeResult *presult,
 									   OffsetNumber *off_loc,
-- 
2.43.0



  [text/x-patch] v35-0006-Fix-visibility-map-corruption-in-more-cases.patch (18.3K, 7-v35-0006-Fix-visibility-map-corruption-in-more-cases.patch)
From c6a1fa5c8319779b800f903e24d3f239e16c1cc1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 25 Feb 2026 16:23:09 -0500
Subject: [PATCH v35 06/18] Fix visibility map corruption in more cases

Move VM corruption detection and repair into pruning. This allows VM
repair during on-access pruning, not only during vacuum.

Also, expand corruption detection to cover pages marked all-visible that
contain dead tuples or tuples inserted or updated by in-progress
transactions. Previously, only all-visible pages with LP_DEAD items were
detected.

Pinning the correct VM page before on-access pruning is cheap compared to
the cost of pruning itself. Because the vmbuffer is saved in the scan
descriptor and a single VM page covers a large number of heap pages, a
query should only need to pin each VM page once.
---
 src/backend/access/heap/pruneheap.c  | 174 +++++++++++++++++++++++++--
 src/backend/access/heap/vacuumlazy.c |  89 +-------------
 src/include/access/heapam.h          |  12 ++
 3 files changed, 175 insertions(+), 100 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 5ce3e54a036..fa470f663b7 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -19,7 +19,7 @@
 #include "access/htup_details.h"
 #include "access/multixact.h"
 #include "access/transam.h"
-#include "access/visibilitymapdefs.h"
+#include "access/visibilitymap.h"
 #include "access/xlog.h"
 #include "access/xloginsert.h"
 #include "commands/vacuum.h"
@@ -114,6 +114,21 @@ typedef struct
 	 */
 	HeapPageFreeze pagefrz;
 
+	/*-------------------------------------------------------
+	 * Working state for visibility map processing
+	 *-------------------------------------------------------
+	 */
+
+	/*
+	 * Caller must provide a pinned vmbuffer corresponding to the heap block
+	 * passed to heap_page_prune_and_freeze(). We will fix any corruption
+	 * found in the VM.
+	 */
+	Buffer		vmbuffer;
+
+	/* Bits in the vmbuffer for this heap page */
+	uint8		vmbits;
+
 	/*-------------------------------------------------------
 	 * Information about what was done
 	 *
@@ -168,6 +183,7 @@ static void prune_freeze_setup(PruneFreezeParams *params,
 							   MultiXactId *new_relmin_mxid,
 							   PruneFreezeResult *presult,
 							   PruneState *prstate);
+static void heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum);
 static void prune_freeze_plan(PruneState *prstate,
 							  OffsetNumber *off_loc);
 static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
@@ -175,7 +191,8 @@ static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
 static inline HTSV_Result htsv_get_valid_status(int status);
 static void heap_prune_chain(OffsetNumber maxoff,
 							 OffsetNumber rootoffnum, PruneState *prstate);
-static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid);
+static void heap_prune_record_prunable(PruneState *prstate, TransactionId xid,
+									   OffsetNumber offnum);
 static void heap_prune_record_redirect(PruneState *prstate,
 									   OffsetNumber offnum, OffsetNumber rdoffnum,
 									   bool was_normal);
@@ -207,6 +224,9 @@ static bool heap_page_will_freeze(bool did_tuple_hint_fpi, bool do_prune, bool d
  * if there's not any use in pruning.
  *
  * Caller must have pin on the buffer, and must *not* have a lock on it.
+ *
+ * If vmbuffer is not yet pinned and pruning is performed, vmbuffer will be
+ * pinned. If we find VM corruption during pruning, we will fix it.
  */
 void
 heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
@@ -273,6 +293,16 @@ heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
 		{
 			OffsetNumber dummy_off_loc;
 			PruneFreezeResult presult;
+			PruneFreezeParams params;
+
+			visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
+
+			params.relation = relation;
+			params.buffer = buffer;
+			params.vmbuffer = *vmbuffer;
+			params.reason = PRUNE_ON_ACCESS;
+			params.vistest = vistest;
+			params.cutoffs = NULL;
 
 			/*
 			 * We don't pass the HEAP_PAGE_PRUNE_MARK_UNUSED_NOW option
@@ -280,14 +310,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
 			 * cannot safely determine that during on-access pruning with the
 			 * current implementation.
 			 */
-			PruneFreezeParams params = {
-				.relation = relation,
-				.buffer = buffer,
-				.reason = PRUNE_ON_ACCESS,
-				.options = 0,
-				.vistest = vistest,
-				.cutoffs = NULL,
-			};
+			params.options = 0;
 
 			heap_page_prune_and_freeze(&params, &presult, &dummy_off_loc,
 									   NULL, NULL);
@@ -350,6 +373,12 @@ prune_freeze_setup(PruneFreezeParams *params,
 	prstate->buffer = params->buffer;
 	prstate->page = BufferGetPage(params->buffer);
 
+	Assert(BufferIsValid(params->vmbuffer));
+	prstate->vmbuffer = params->vmbuffer;
+	prstate->vmbits = visibilitymap_get_status(prstate->relation,
+											   prstate->block,
+											   &prstate->vmbuffer);
+
 	/*
 	 * Our strategy is to scan the page and make lists of items to change,
 	 * then apply the changes within a critical section.  This keeps as much
@@ -766,6 +795,90 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 	return do_freeze;
 }
 
+/*
+ * Helper to fix visibility-related corruption on a heap page and its
+ * corresponding VM page. An all-visible page can have neither dead items nor
+ * tuples that are not visible to all running transactions. This helper clears
+ * the incorrect VM bits and resets the vmbits used during pruning.
+ *
+ * This function must be called while holding an exclusive lock on the heap
+ * buffer, and any dead items must have been discovered under that same lock.
+ * Although we do not hold a lock on the VM buffer, it is pinned, and the heap
+ * buffer is exclusively locked, ensuring that no other backend can update the
+ * VM bits corresponding to this heap page.
+ *
+ * heap_fix_vm_corruption() makes changes to the VM and, potentially, the heap
+ * page, but it does not need to be done in a critical section because
+ * clearing the VM is not WAL-logged.
+ */
+static void
+heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
+{
+	Assert(BufferIsLockedByMeInMode(prstate->buffer, BUFFER_LOCK_EXCLUSIVE));
+
+	if (PageIsAllVisible(prstate->page))
+	{
+		/*
+		 * It's possible for the value returned by
+		 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
+		 * wrong for us to see tuples that appear to not be visible to
+		 * everyone yet, while PD_ALL_VISIBLE is already set. The real safe
+		 * xmin value never moves backwards, but
+		 * GetOldestNonRemovableTransactionId() is conservative and sometimes
+		 * returns a value that's unnecessarily small, so if we see that
+		 * contradiction it just means that the tuples that we think are not
+		 * visible to everyone yet actually are, and the PD_ALL_VISIBLE flag
+		 * is correct.
+		 *
+		 * However, there should never be LP_DEAD items, dead tuple versions,
+		 * or tuples inserted by an in-progress transaction on a page with
+		 * PD_ALL_VISIBLE set.
+		 */
+		if (prstate->lpdead_items > 0)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("LP_DEAD item found on page marked as all-visible"),
+					 errdetail("relation \"%s\", page %u, tuple %u",
+							   RelationGetRelationName(prstate->relation),
+							   prstate->block, offnum)));
+		}
+		else
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("tuple not visible to all found on page marked as all-visible"),
+					 errdetail("relation \"%s\", page %u, tuple %u",
+							   RelationGetRelationName(prstate->relation),
+							   prstate->block, offnum)));
+		}
+
+		/*
+		 * Mark the buffer dirty now in case we make no further changes and
+		 * therefore would not mark it dirty later.
+		 */
+		PageClearAllVisible(prstate->page);
+		MarkBufferDirtyHint(prstate->buffer, true);
+	}
+	else if (prstate->vmbits & VISIBILITYMAP_VALID_BITS)
+	{
+		/*
+		 * As of PostgreSQL 9.2, the visibility map bit should never be set if
+		 * the page-level bit is clear.  However, it's possible that the bit
+		 * got cleared after heap_vac_scan_next_block() was called, so we must
+		 * recheck with buffer lock before concluding that the VM is corrupt.
+		 */
+		ereport(WARNING,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("page %u in \"%s\" is not marked all-visible but visibility map bit is set",
+						prstate->block,
+						RelationGetRelationName(prstate->relation))));
+	}
+
+	visibilitymap_clear(prstate->relation, prstate->block, prstate->vmbuffer,
+						VISIBILITYMAP_VALID_BITS);
+	prstate->vmbits = 0;
+}
 
 /*
  * Prune and repair fragmentation and potentially freeze tuples on the
@@ -826,6 +939,10 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 					   new_relfrozen_xid, new_relmin_mxid,
 					   presult, &prstate);
 
+	if ((prstate.vmbits & VISIBILITYMAP_VALID_BITS) &&
+		!PageIsAllVisible(prstate.page))
+		heap_fix_vm_corruption(&prstate, InvalidOffsetNumber);
+
 	/*
 	 * Examine all line pointers and tuple visibility information to determine
 	 * which line pointers should change state and which tuples may be frozen.
@@ -970,6 +1087,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	presult->all_visible = prstate.set_all_visible;
 	presult->all_frozen = prstate.set_all_frozen;
 	presult->hastup = prstate.hastup;
+	presult->vmbits = prstate.vmbits;
 
 	/*
 	 * For callers planning to update the visibility map, the conflict horizon
@@ -1292,7 +1410,8 @@ process_chain:
 
 /* Record lowest soon-prunable XID */
 static void
-heap_prune_record_prunable(PruneState *prstate, TransactionId xid)
+heap_prune_record_prunable(PruneState *prstate, TransactionId xid,
+						   OffsetNumber offnum)
 {
 	/*
 	 * This should exactly match the PageSetPrunable macro.  We can't store
@@ -1302,6 +1421,13 @@ heap_prune_record_prunable(PruneState *prstate, TransactionId xid)
 	if (!TransactionIdIsValid(prstate->new_prune_xid) ||
 		TransactionIdPrecedes(xid, prstate->new_prune_xid))
 		prstate->new_prune_xid = xid;
+
+	/*
+	 * It's incorrect for a page to be marked all-visible if it contains
+	 * prunable items.
+	 */
+	if (PageIsAllVisible(prstate->page))
+		heap_fix_vm_corruption(prstate, offnum);
 }
 
 /* Record line pointer to be redirected */
@@ -1385,6 +1511,15 @@ heap_prune_record_dead_or_unused(PruneState *prstate, OffsetNumber offnum,
 		heap_prune_record_unused(prstate, offnum, was_normal);
 	else
 		heap_prune_record_dead(prstate, offnum, was_normal);
+
+	/*
+	 * It's incorrect for the page to be set all-visible if it contains dead
+	 * items. Fix that on the heap page and check the VM for corruption as
+	 * well. Do that here rather than in heap_prune_record_dead() so we also
+	 * cover tuples that are directly marked LP_UNUSED via mark_unused_now.
+	 */
+	if (PageIsAllVisible(prstate->page))
+		heap_fix_vm_corruption(prstate, offnum);
 }
 
 /* Record line pointer to be marked unused */
@@ -1524,7 +1659,8 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * that the page is reconsidered for pruning in future.
 			 */
 			heap_prune_record_prunable(prstate,
-									   HeapTupleHeaderGetUpdateXid(htup));
+									   HeapTupleHeaderGetUpdateXid(htup),
+									   offnum);
 			break;
 
 		case HEAPTUPLE_INSERT_IN_PROGRESS:
@@ -1539,6 +1675,10 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			prstate->set_all_visible = false;
 			prstate->set_all_frozen = false;
 
+			/* The page should not be marked all-visible */
+			if (PageIsAllVisible(page))
+				heap_fix_vm_corruption(prstate, offnum);
+
 			/*
 			 * If we wanted to optimize for aborts, we might consider marking
 			 * the page prunable when we see INSERT_IN_PROGRESS.  But we
@@ -1563,7 +1703,8 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * the page is reconsidered for pruning in future.
 			 */
 			heap_prune_record_prunable(prstate,
-									   HeapTupleHeaderGetUpdateXid(htup));
+									   HeapTupleHeaderGetUpdateXid(htup),
+									   offnum);
 			break;
 
 		default:
@@ -1629,6 +1770,13 @@ heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum)
 
 	/* Record the dead offset for vacuum */
 	prstate->deadoffsets[prstate->lpdead_items++] = offnum;
+
+	/*
+	 * It's incorrect for a page to be marked all-visible if it contains dead
+	 * items.
+	 */
+	if (PageIsAllVisible(prstate->page))
+		heap_fix_vm_corruption(prstate, offnum);
 }
 
 /*
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5b6f2441f6b..0a0aa8e5a9e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -424,11 +424,6 @@ static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
-static void identify_and_fix_vm_corruption(Relation rel, Buffer heap_buffer,
-										   BlockNumber heap_blk, Page heap_page,
-										   int nlpdead_items,
-										   Buffer vmbuffer,
-										   uint8 *vmbits);
 static int	lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
 							Buffer vmbuffer,
@@ -1963,81 +1958,6 @@ cmpOffsetNumbers(const void *a, const void *b)
 	return pg_cmp_u16(*(const OffsetNumber *) a, *(const OffsetNumber *) b);
 }
 
-/*
- * Helper to correct any corruption detected on a heap page and its
- * corresponding visibility map page after pruning but before setting the
- * visibility map. It examines the heap page, the associated VM page, and the
- * number of dead items previously identified.
- *
- * This function must be called while holding an exclusive lock on the heap
- * buffer, and the dead items must have been discovered under that same lock.
-
- * The provided vmbits must reflect the current state of the VM block
- * referenced by vmbuffer. Although we do not hold a lock on the VM buffer, it
- * is pinned, and the heap buffer is exclusively locked, ensuring that no
- * other backend can update the VM bits corresponding to this heap page.
- *
- * If it clears corruption, it will zero out vmbits.
- */
-static void
-identify_and_fix_vm_corruption(Relation rel, Buffer heap_buffer,
-							   BlockNumber heap_blk, Page heap_page,
-							   int nlpdead_items,
-							   Buffer vmbuffer,
-							   uint8 *vmbits)
-{
-	Assert(visibilitymap_get_status(rel, heap_blk, &vmbuffer) == *vmbits);
-
-	Assert(BufferIsLockedByMeInMode(heap_buffer, BUFFER_LOCK_EXCLUSIVE));
-
-	/*
-	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
-	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
-	 * with buffer lock before concluding that the VM is corrupt.
-	 */
-	if (!PageIsAllVisible(heap_page) &&
-		((*vmbits & VISIBILITYMAP_VALID_BITS) != 0))
-	{
-		ereport(WARNING,
-				(errcode(ERRCODE_DATA_CORRUPTED),
-				 errmsg("page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
-						RelationGetRelationName(rel), heap_blk)));
-
-		visibilitymap_clear(rel, heap_blk, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-		*vmbits = 0;
-	}
-
-	/*
-	 * It's possible for the value returned by
-	 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
-	 * wrong for us to see tuples that appear to not be visible to everyone
-	 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
-	 * never moves backwards, but GetOldestNonRemovableTransactionId() is
-	 * conservative and sometimes returns a value that's unnecessarily small,
-	 * so if we see that contradiction it just means that the tuples that we
-	 * think are not visible to everyone yet actually are, and the
-	 * PD_ALL_VISIBLE flag is correct.
-	 *
-	 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
-	 * however.
-	 */
-	else if (PageIsAllVisible(heap_page) && nlpdead_items > 0)
-	{
-		ereport(WARNING,
-				(errcode(ERRCODE_DATA_CORRUPTED),
-				 errmsg("page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
-						RelationGetRelationName(rel), heap_blk)));
-
-		PageClearAllVisible(heap_page);
-		MarkBufferDirty(heap_buffer);
-		visibilitymap_clear(rel, heap_blk, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-		*vmbits = 0;
-	}
-}
-
 /*
  *	lazy_scan_prune() -- lazy_scan_heap() pruning and freezing.
  *
@@ -2069,6 +1989,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	PruneFreezeParams params = {
 		.relation = rel,
 		.buffer = buf,
+		.vmbuffer = vmbuffer,
 		.reason = PRUNE_VACUUM_SCAN,
 		.options = HEAP_PAGE_PRUNE_FREEZE,
 		.vistest = vacrel->vistest,
@@ -2178,18 +2099,12 @@ lazy_scan_prune(LVRelState *vacrel,
 	Assert(!presult.all_visible || !(*has_lpdead_items));
 	Assert(!presult.all_frozen || presult.all_visible);
 
-	old_vmbits = visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer);
-
-	identify_and_fix_vm_corruption(vacrel->rel, buf, blkno, page,
-								   presult.lpdead_items, vmbuffer,
-								   &old_vmbits);
-
 	if (!presult.all_visible)
 		return presult.ndeleted;
 
 	/* Set the visibility map and page visibility hint */
+	old_vmbits = presult.vmbits;
 	new_vmbits = VISIBILITYMAP_ALL_VISIBLE;
-
 	if (presult.all_frozen)
 		new_vmbits |= VISIBILITYMAP_ALL_FROZEN;
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 4e2e71be558..9db92c7db8a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -258,6 +258,12 @@ typedef struct PruneFreezeParams
 	Relation	relation;		/* relation containing buffer to be pruned */
 	Buffer		buffer;			/* buffer to be pruned */
 
+	/*
+	 * Callers should provide a pinned vmbuffer corresponding to the heap
+	 * block in buffer. We will check for and repair any corruption in the VM.
+	 */
+	Buffer		vmbuffer;
+
 	/*
 	 * The reason pruning was performed.  It is used to set the WAL record
 	 * opcode which is used for debugging and analysis purposes.
@@ -319,6 +325,12 @@ typedef struct PruneFreezeResult
 	bool		all_frozen;
 	TransactionId vm_conflict_horizon;
 
+	/*
+	 * vmbits holds this heap page's VM bits as of the beginning of pruning.
+	 * It is cleared if VM corruption is found and corrected.
+	 */
+	uint8		vmbits;
+
 	/*
 	 * Whether or not the page makes rel truncation unsafe.  This is set to
 	 * 'true', even if the page contains LP_DEAD items.  VACUUM will remove
-- 
2.43.0



  [text/x-patch] v35-0007-Add-pruning-fast-path-for-all-visible-and-all-fr.patch (4.3K, 8-v35-0007-Add-pruning-fast-path-for-all-visible-and-all-fr.patch)
From 7e8ea684a4c6ee5d4b7169ec3195be75e76172e9 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 25 Feb 2026 16:48:19 -0500
Subject: [PATCH v35 07/18] Add pruning fast path for all-visible and
 all-frozen pages

Because of the SKIP_PAGES_THRESHOLD optimization or a stale prune XID,
heap_page_prune_and_freeze() can be invoked for pages with no pruning or
freezing work. To avoid this, if a page is already all-frozen or it is
all-visible and no freezing will be attempted, we can exit early.
---
 src/backend/access/heap/pruneheap.c | 73 +++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index fa470f663b7..73db45f8dfd 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -184,6 +184,7 @@ static void prune_freeze_setup(PruneFreezeParams *params,
 							   PruneFreezeResult *presult,
 							   PruneState *prstate);
 static void heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum);
+static void heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult);
 static void prune_freeze_plan(PruneState *prstate,
 							  OffsetNumber *off_loc);
 static HTSV_Result heap_prune_satisfies_vacuum(PruneState *prstate,
@@ -880,6 +881,66 @@ heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
 	prstate->vmbits = 0;
 }
 
+/*
+ * If the page is already all-frozen, or already all-visible and freezing
+ * is not being attempted, there is no remaining work and we can bypass the
+ * expensive overhead of heap_page_prune_and_freeze().
+ *
+ * This can happen when the page has a stale prune hint, or if VACUUM is
+ * scanning an already all-frozen page due to SKIP_PAGES_THRESHOLD.
+ *
+ * The caller must already have examined the visibility map and saved the
+ * status for the page's VM bits in prstate->vmbits. Caller must hold a
+ * content lock on the heap page since it will examine line pointers.
+ *
+ * Before calling heap_page_bypass_prune_freeze(), the caller should first
+ * check for and fix any discrepancy between the page-level visibility hint
+ * and the visibility map. Otherwise, the fast path will always prevent us
+ * from getting them in sync. Note that if there are tuples on the page that
+ * are not visible to all but the VM is incorrectly marked
+ * all-visible/all-frozen, we will not get the chance to fix that corruption
+ * when using the fast path.
+ */
+static void
+heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(prstate->page);
+	Page		page = prstate->page;
+
+	Assert(prstate->vmbits & VISIBILITYMAP_ALL_FROZEN ||
+		   (prstate->vmbits & VISIBILITYMAP_ALL_VISIBLE &&
+			!prstate->attempt_freeze));
+
+	/* We'll fill in presult for the caller */
+	memset(presult, 0, sizeof(PruneFreezeResult));
+
+	/*
+	 * Since the page is all-visible, a count of the normal ItemIds on the
+	 * page should be sufficient for vacuum's live tuple count.
+	 */
+	for (OffsetNumber off = FirstOffsetNumber;
+		 off <= maxoff;
+		 off = OffsetNumberNext(off))
+	{
+		if (ItemIdIsNormal(PageGetItemId(page, off)))
+			prstate->live_tuples++;
+	}
+
+	presult->live_tuples = prstate->live_tuples;
+
+	/* Clear any stale prune hint */
+	if (TransactionIdIsValid(PageGetPruneXid(page)))
+	{
+		PageClearPrunable(page);
+		MarkBufferDirtyHint(prstate->buffer, true);
+	}
+
+	presult->vmbits = prstate->vmbits;
+
+	if (!PageIsEmpty(page))
+		presult->hastup = true;
+}
+
 /*
  * Prune and repair fragmentation and potentially freeze tuples on the
  * specified page.
@@ -943,6 +1004,18 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		!PageIsAllVisible(prstate.page))
 		heap_fix_vm_corruption(&prstate, InvalidOffsetNumber);
 
+	/*
+	 * If the page is already all-frozen, or already all-visible when freezing
+	 * is not being attempted, we can exit early. Do this after fixing any
+	 * discrepancy between the page-level visibility hint and the VM.
+	 */
+	if (prstate.vmbits & VISIBILITYMAP_ALL_FROZEN ||
+		(prstate.vmbits & VISIBILITYMAP_ALL_VISIBLE && !prstate.attempt_freeze))
+	{
+		heap_page_bypass_prune_freeze(&prstate, presult);
+		return;
+	}
+
 	/*
 	 * Examine all line pointers and tuple visibility information to determine
 	 * which line pointers should change state and which tuples may be frozen.
-- 
2.43.0



  [text/x-patch] v35-0008-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch (11.4K, 9-v35-0008-Use-GlobalVisState-in-vacuum-to-determine-page-l.patch)
From f271209e3feb75f79e94b83c3d564e5d14d1b9bf Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 17 Dec 2025 16:51:05 -0500
Subject: [PATCH v35 08/18] Use GlobalVisState in vacuum to determine page
 level visibility

During vacuum's first and third phases, we examine tuples' visibility
to determine if we can set the page all-visible in the visibility map.

Previously, this check compared tuple xmins against a single XID chosen at
the start of vacuum (OldestXmin). We now use GlobalVisState, which also
enables future work to set the VM during on-access pruning, since ordinary
queries have access to GlobalVisState but not OldestXmin.

This also benefits vacuum: in some cases, GlobalVisState may advance
during a vacuum, allowing more pages to be considered all-visible. In the
future, we could also add a heuristic to update GlobalVisState more
frequently during vacuums of large tables.

OldestXmin is still used for freezing and as a backstop to ensure we
don't freeze a dead tuple that wasn't yet prunable according to
GlobalVisState in the rare occurrences where GlobalVisState moves
backwards.

Because comparing a transaction ID against GlobalVisState is more
expensive than comparing against a single XID, we defer this check until
after scanning all tuples on the page. Therefore, we perform the
GlobalVisState check only once per page. This is safe because
visibility_cutoff_xid records the newest live xmin on the page;
if it is globally visible, then the entire page is all-visible.

Using GlobalVisState means on-access pruning can also maintain
visibility_cutoff_xid. This approach will result in examining more tuple
xmins than before; however, the additional cost should not be
significant. And doing so will enable us to set the visibility map on
access in the future.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Discussion: https://postgr.es/m/flat/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk#c755ef151507aba58471ffaca607e493
---
 src/backend/access/heap/heapam_visibility.c | 22 +++++++++
 src/backend/access/heap/pruneheap.c         | 37 +++++++--------
 src/backend/access/heap/vacuumlazy.c        | 51 +++++++++++++--------
 src/include/access/heapam.h                 |  2 +
 4 files changed, 72 insertions(+), 40 deletions(-)

diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 75ae268d753..aee88947393 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -1060,6 +1060,28 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 	return res;
 }
 
+/*
+ * Wrapper around GlobalVisTestIsRemovableXid() for use when examining live
+ * tuples. Returns true if the given XID may be considered running by at least
+ * one snapshot.
+ *
+ * This function alone is insufficient to determine tuple visibility; callers
+ * must also consider the XID's commit status. Its purpose is purely semantic:
+ * when applied to live tuples, GlobalVisTestIsRemovableXid() is checking
+ * whether the inserting transaction is still considered running, not whether
+ * the tuple is removable. Live tuples are, by definition, not removable, but
+ * the snapshot criteria for "transaction still running" are identical to
+ * those used for removal XIDs.
+ *
+ * See the comment above GlobalVisTestIsRemovable[Full]Xid() for details on the
+ * required preconditions for calling this function.
+ */
+bool
+GlobalVisTestXidMaybeRunning(GlobalVisState *state, TransactionId xid)
+{
+	return !GlobalVisTestIsRemovableXid(state, xid);
+}
+
 /*
  * Work horse for HeapTupleSatisfiesVacuum and similar routines.
  *
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 73db45f8dfd..7b72804a3e5 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -1024,6 +1024,17 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 */
 	prune_freeze_plan(&prstate, off_loc);
 
+	/*
+	 * After processing all the live tuples on the page, if the newest xmin
+	 * amongst them may be considered running by any snapshot, the page cannot
+	 * be all-visible.
+	 */
+	if (prstate.set_all_visible &&
+		TransactionIdIsNormal(prstate.visibility_cutoff_xid) &&
+		GlobalVisTestXidMaybeRunning(prstate.vistest,
+									 prstate.visibility_cutoff_xid))
+		prstate.set_all_visible = prstate.set_all_frozen = false;
+
 	/*
 	 * If checksums are enabled, calling heap_prune_satisfies_vacuum() while
 	 * checking tuple visibility information in prune_freeze_plan() may have
@@ -1692,29 +1703,15 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 				}
 
 				/*
-				 * The inserter definitely committed.  But is it old enough
-				 * that everyone sees it as committed?  A FrozenTransactionId
-				 * is seen as committed to everyone.  Otherwise, we check if
-				 * there is a snapshot that considers this xid to still be
-				 * running, and if so, we don't consider the page all-visible.
+				 * The inserter definitely committed. But we don't know if it
+				 * is old enough that everyone sees it as committed. Later,
+				 * after processing all the tuples on the page, we'll check if
+				 * there is any snapshot that still considers the newest xid
+				 * on the page to be running. If so, we don't consider the
+				 * page all-visible.
 				 */
 				xmin = HeapTupleHeaderGetXmin(htup);
 
-				/*
-				 * For now always use prstate->cutoffs for this test, because
-				 * we only update 'set_all_visible' and 'set_all_frozen' when
-				 * freezing is requested. We could use
-				 * GlobalVisTestIsRemovableXid instead, if a non-freezing
-				 * caller wanted to set the VM bit.
-				 */
-				Assert(prstate->cutoffs);
-				if (!TransactionIdPrecedes(xmin, prstate->cutoffs->OldestXmin))
-				{
-					prstate->set_all_visible = false;
-					prstate->set_all_frozen = false;
-					break;
-				}
-
 				/* Track newest xmin on page. */
 				if (TransactionIdFollows(xmin, prstate->visibility_cutoff_xid) &&
 					TransactionIdIsNormal(xmin))
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0a0aa8e5a9e..6c7807d5bd3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -460,13 +460,13 @@ static void dead_items_cleanup(LVRelState *vacrel);
 
 #ifdef USE_ASSERT_CHECKING
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
-									 TransactionId OldestXmin,
+									 GlobalVisState *vistest,
 									 bool *all_frozen,
 									 TransactionId *visibility_cutoff_xid,
 									 OffsetNumber *logging_offnum);
 #endif
 static bool heap_page_would_be_all_visible(Relation rel, Buffer buf,
-										   TransactionId OldestXmin,
+										   GlobalVisState *vistest,
 										   OffsetNumber *deadoffsets,
 										   int ndeadoffsets,
 										   bool *all_frozen,
@@ -2053,13 +2053,10 @@ lazy_scan_prune(LVRelState *vacrel,
 		Assert(presult.lpdead_items == 0);
 
 		Assert(heap_page_is_all_visible(vacrel->rel, buf,
-										vacrel->cutoffs.OldestXmin, &debug_all_frozen,
+										vacrel->vistest, &debug_all_frozen,
 										&debug_cutoff, &vacrel->offnum));
 
 		Assert(presult.all_frozen == debug_all_frozen);
-
-		Assert(!TransactionIdIsValid(debug_cutoff) ||
-			   debug_cutoff == presult.vm_conflict_horizon);
 	}
 #endif
 
@@ -2815,7 +2812,7 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	 * done outside the critical section.
 	 */
 	if (heap_page_would_be_all_visible(vacrel->rel, buffer,
-									   vacrel->cutoffs.OldestXmin,
+									   vacrel->vistest,
 									   deadoffsets, num_offsets,
 									   &all_frozen, &visibility_cutoff_xid,
 									   &vacrel->offnum))
@@ -3576,14 +3573,14 @@ dead_items_cleanup(LVRelState *vacrel)
  */
 static bool
 heap_page_is_all_visible(Relation rel, Buffer buf,
-						 TransactionId OldestXmin,
+						 GlobalVisState *vistest,
 						 bool *all_frozen,
 						 TransactionId *visibility_cutoff_xid,
 						 OffsetNumber *logging_offnum)
 {
 
 	return heap_page_would_be_all_visible(rel, buf,
-										  OldestXmin,
+										  vistest,
 										  NULL, 0,
 										  all_frozen,
 										  visibility_cutoff_xid,
@@ -3604,7 +3601,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
  * Returns true if the page is all-visible other than the provided
  * deadoffsets and false otherwise.
  *
- * OldestXmin is used to determine visibility.
+ * vistest is used to determine visibility.
  *
  * Output parameters:
  *
@@ -3623,7 +3620,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
  */
 static bool
 heap_page_would_be_all_visible(Relation rel, Buffer buf,
-							   TransactionId OldestXmin,
+							   GlobalVisState *vistest,
 							   OffsetNumber *deadoffsets,
 							   int ndeadoffsets,
 							   bool *all_frozen,
@@ -3704,7 +3701,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 				{
 					TransactionId xmin;
 
-					/* Check comments in lazy_scan_prune. */
+					/* Check heap_prune_record_unchanged_lp_normal comments */
 					if (!HeapTupleHeaderXminCommitted(tuple.t_data))
 					{
 						all_visible = false;
@@ -3713,16 +3710,17 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 					}
 
 					/*
-					 * The inserter definitely committed. But is it old enough
-					 * that everyone sees it as committed?
+					 * The inserter definitely committed. But we don't know if
+					 * it is old enough that everyone sees it as committed.
+					 * Don't check that now.
+					 *
+					 * If we scan all tuples without finding one that prevents
+					 * the page from being all-visible, we then check whether
+					 * any snapshot still considers the newest XID on the page
+					 * to be running. In that case, the page is not considered
+					 * all-visible.
 					 */
 					xmin = HeapTupleHeaderGetXmin(tuple.t_data);
-					if (!TransactionIdPrecedes(xmin, OldestXmin))
-					{
-						all_visible = false;
-						*all_frozen = false;
-						break;
-					}
 
 					/* Track newest xmin on page. */
 					if (TransactionIdFollows(xmin, *visibility_cutoff_xid) &&
@@ -3751,6 +3749,19 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 		}
 	}							/* scan along page */
 
+	/*
+	 * After processing all the live tuples on the page, if the newest xmin
+	 * among them may still be considered running by any snapshot, the page
+	 * cannot be all-visible.
+	 */
+	if (all_visible &&
+		TransactionIdIsNormal(*visibility_cutoff_xid) &&
+		GlobalVisTestXidMaybeRunning(vistest, *visibility_cutoff_xid))
+	{
+		all_visible = false;
+		*all_frozen = false;
+	}
+
 	/* Clear the offset information once we have processed the given page. */
 	*logging_offnum = InvalidOffsetNumber;
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 9db92c7db8a..e401dd52e25 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -474,6 +474,8 @@ extern TM_Result HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
 										  Buffer buffer);
 extern HTSV_Result HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 											Buffer buffer);
+
+extern bool GlobalVisTestXidMaybeRunning(GlobalVisState *state, TransactionId xid);
 extern HTSV_Result HeapTupleSatisfiesVacuumHorizon(HeapTuple htup, Buffer buffer,
 												   TransactionId *dead_after);
 extern void HeapTupleSetHintBits(HeapTupleHeader tuple, Buffer buffer,
-- 
2.43.0
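
The hunks above replace the per-tuple OldestXmin comparison with a single
check after the scan. A minimal sketch of that shape follows, using
simplified stand-ins rather than the real PostgreSQL definitions. The helper
xid_maybe_running() is hypothetical: it collapses the GlobalVisState test
into one horizon XID, which is an assumption made only to keep the example
self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's TransactionId definitions. */
typedef uint32_t TransactionId;
#define InvalidTransactionId      ((TransactionId) 0)
#define FirstNormalTransactionId  ((TransactionId) 3)

static bool
xid_is_normal(TransactionId xid)
{
	return xid >= FirstNormalTransactionId;
}

/* Circular "a is newer than b" test, modeled on TransactionIdFollows(). */
static bool
xid_follows(TransactionId a, TransactionId b)
{
	if (!xid_is_normal(a) || !xid_is_normal(b))
		return a > b;
	return (int32_t) (a - b) > 0;
}

/*
 * Hypothetical stand-in for GlobalVisTestXidMaybeRunning(): every XID at or
 * after 'horizon' is treated as possibly still running.
 */
static bool
xid_maybe_running(TransactionId horizon, TransactionId xid)
{
	return !xid_follows(horizon, xid);
}

/*
 * Scan the xmins of committed live tuples, remember the newest normal one,
 * and run one horizon check at the end instead of one check per tuple.
 */
static bool
page_all_visible(const TransactionId *xmins, size_t n,
				 TransactionId horizon, TransactionId *newest)
{
	assert(newest != NULL);
	*newest = InvalidTransactionId;

	for (size_t i = 0; i < n; i++)
	{
		if (xid_is_normal(xmins[i]) && xid_follows(xmins[i], *newest))
			*newest = xmins[i];
	}

	/* Deferred check: only the newest xmin needs to be tested. */
	if (xid_is_normal(*newest) && xid_maybe_running(horizon, *newest))
		return false;

	return true;
}
```

Testing only the newest xmin is sufficient in this sketch because any older
xmin passes the horizon test whenever the newest one does.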



  [text/x-patch] v35-0009-Keep-newest-live-XID-up-to-date-even-if-page-not.patch (14.8K, 10-v35-0009-Keep-newest-live-XID-up-to-date-even-if-page-not.patch)
From f70d52103e8f665de92bd531ff3a261b0142d20d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Sat, 28 Feb 2026 16:06:51 -0500
Subject: [PATCH v35 09/18] Keep newest live XID up-to-date even if page not
 all-visible

During pruning, we keep track of the newest xmin of live tuples on the
page visible to all running and future transactions so that we can use
it later as the snapshot conflict horizon when setting the VM if the
page turns out to be all-visible.

Previously, we stopped updating this value once we determined the page
was not all-visible. However, maintaining it even when the page is not
all-visible is inexpensive and makes the snapshot conflict horizon
calculation clearer. This guarantees it won't contain a stale value.

Since we'll keep it up to date all the time now anyway, there's no
reason not to maintain all_visible for on-access pruning. This will
allow us to set the VM on-access in the future.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Earlier version reviewed-by: Chao Li <[email protected]>
Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk
---
 src/backend/access/heap/pruneheap.c  | 127 +++++++++++----------------
 src/backend/access/heap/vacuumlazy.c |  30 +++----
 2 files changed, 65 insertions(+), 92 deletions(-)
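
As an illustration of the commit message, here is a small sketch of why the
tracked value can no longer go stale. LiveTuple, PageSummary and
summarize_live_tuples() are hypothetical names invented for this example.
The sketch assumes every xmin is a normal XID and uses plain '>' in place of
TransactionIdFollows(), so wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, simplified view of one live tuple. */
typedef struct
{
	TransactionId xmin;
	bool		xmin_committed;	/* hint bit says the inserter committed */
} LiveTuple;

/* Hypothetical subset of the state PruneState carries. */
typedef struct
{
	bool		set_all_visible;
	TransactionId newest_live_xid;
} PageSummary;

/*
 * Summarize the live tuples on a page. The newest xmin is updated for every
 * committed tuple, whether or not set_all_visible has already been cleared.
 */
static PageSummary
summarize_live_tuples(const LiveTuple *tuples, size_t n)
{
	PageSummary s = {true, InvalidTransactionId};

	for (size_t i = 0; i < n; i++)
	{
		if (!tuples[i].xmin_committed)
		{
			/* Not visible to everyone; skip it but keep scanning. */
			s.set_all_visible = false;
			continue;
		}

		/* No "if (s.set_all_visible)" guard around this update. */
		if (tuples[i].xmin > s.newest_live_xid)
			s.newest_live_xid = tuples[i].xmin;
	}

	return s;
}
```

For tuples with xmins 100 (committed), 205 (not hinted committed) and 180
(committed), the summary ends with newest_live_xid = 180. With the guard in
place the value would have stayed at 100 once the second tuple cleared
set_all_visible.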

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 7b72804a3e5..dd731f64bc6 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -129,6 +129,9 @@ typedef struct
 	/* Bits in the vmbuffer for this heap page */
 	uint8		vmbits;
 
+	/* The newest xmin of live tuples on the page */
+	TransactionId newest_live_xid;
+
 	/*-------------------------------------------------------
 	 * Information about what was done
 	 *
@@ -160,11 +163,6 @@ typedef struct
 	 * all-frozen bits in the visibility map can be set for this page after
 	 * pruning.
 	 *
-	 * visibility_cutoff_xid is the newest xmin of live tuples on the page.
-	 * The caller can use it as the conflict horizon, when setting the VM
-	 * bits.  It is only valid if we froze some tuples, and set_all_frozen is
-	 * true.
-	 *
 	 * NOTE: set_all_visible and set_all_frozen initially don't include
 	 * LP_DEAD items. That's convenient for heap_page_prune_and_freeze() to
 	 * use them to decide whether to freeze the page or not.  The
@@ -174,7 +172,6 @@ typedef struct
 	 */
 	bool		set_all_visible;
 	bool		set_all_frozen;
-	TransactionId visibility_cutoff_xid;
 } PruneState;
 
 /* Local functions */
@@ -433,53 +430,35 @@ prune_freeze_setup(PruneFreezeParams *params,
 	prstate->deadoffsets = presult->deadoffsets;
 
 	/*
-	 * Vacuum may update the VM after we're done.  We can keep track of
-	 * whether the page will be all-visible and all-frozen after pruning and
-	 * freezing to help the caller to do that.
-	 *
-	 * Currently, only VACUUM sets the VM bits.  To save the effort, only do
-	 * the bookkeeping if the caller needs it.  Currently, that's tied to
-	 * HEAP_PAGE_PRUNE_FREEZE, but it could be a separate flag if you wanted
-	 * to update the VM bits without also freezing or freeze without also
-	 * setting the VM bits.
+	 * We track whether the page will be all-visible/all-frozen at the end of
+	 * pruning and freezing. While examining tuple visibility, we'll set
+	 * set_all_visible to false if there are tuples on the page not visible to
+	 * all running and future transactions. set_all_visible is always
+	 * maintained but only VACUUM will set the VM if the page ends up being
+	 * all-visible.
 	 *
-	 * In addition to telling the caller whether it can set the VM bit, we
-	 * also use 'set_all_visible' and 'set_all_frozen' for our own
-	 * decision-making. If the whole page would become frozen, we consider
-	 * opportunistically freezing tuples.  We will not be able to freeze the
-	 * whole page if there are tuples present that are not visible to everyone
-	 * or if there are dead tuples which are not yet removable.  However, dead
-	 * tuples which will be removed by the end of vacuuming should not
-	 * preclude us from opportunistically freezing.  Because of that, we do
-	 * not immediately clear set_all_visible and set_all_frozen when we see
-	 * LP_DEAD items.  We fix that after scanning the line pointers. We must
-	 * correct set_all_visible and set_all_frozen before we return them to the
-	 * caller, so that the caller doesn't set the VM bits incorrectly.
+	 * We also keep track of the newest live XID, which is used to calculate
+	 * the snapshot conflict horizon for a WAL record setting the VM.
 	 */
-	if (prstate->attempt_freeze)
-	{
-		prstate->set_all_visible = true;
-		prstate->set_all_frozen = true;
-	}
-	else
-	{
-		/*
-		 * Initializing to false allows skipping the work to update them in
-		 * heap_prune_record_unchanged_lp_normal().
-		 */
-		prstate->set_all_visible = false;
-		prstate->set_all_frozen = false;
-	}
+	prstate->set_all_visible = true;
+	prstate->newest_live_xid = InvalidTransactionId;
 
 	/*
-	 * The visibility cutoff xid is the newest xmin of live tuples on the
-	 * page.  In the common case, this will be set as the conflict horizon the
-	 * caller can use for updating the VM.  If, at the end of freezing and
-	 * pruning, the page is all-frozen, there is no possibility that any
-	 * running transaction on the standby does not see tuples on the page as
-	 * all-visible, so the conflict horizon remains InvalidTransactionId.
+	 * Currently, only VACUUM performs freezing, but other callers may in the
+	 * future. Other callers must initialize prstate.all_frozen to false,
+	 * since we will not call heap_prepare_freeze_tuple() for each tuple.
+	 *
+	 * We only consider opportunistic freezing if the page would become
+	 * all-frozen, or if it would be all-frozen except for dead tuples that
+	 * VACUUM will remove.
+	 *
+	 * Dead tuples that will be removed by the end of vacuum should not
+	 * prevent opportunistic freezing. Therefore, we do not clear
+	 * set_all_visible and set_all_frozen when we encounter LP_DEAD items.
+	 * Instead, we correct them after deciding whether to freeze, but before
+	 * updating the VM, to avoid setting the VM bits incorrectly.
 	 */
-	prstate->visibility_cutoff_xid = InvalidTransactionId;
+	prstate->set_all_frozen = prstate->attempt_freeze ? true : false;
 }
 
 /*
@@ -709,7 +688,6 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 	if (!prstate->attempt_freeze)
 	{
 		Assert(!prstate->set_all_frozen && prstate->nfrozen == 0);
-		Assert(prstate->lpdead_items == 0 || !prstate->set_all_visible);
 		return false;
 	}
 
@@ -962,9 +940,8 @@ heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
  * HEAP_PAGE_PRUNE_FREEZE option is passed, we also set
  * presult->set_all_visible and presult->set_all_frozen after determining
  * whether or not to opportunistically freeze, to indicate if the VM bits can
- * be set.  They are always set to false when the HEAP_PAGE_PRUNE_FREEZE
- * option is not passed, because at the moment only callers that also freeze
- * need that information.
+ * be set. 'all-frozen' is always set to false when the HEAP_PAGE_PRUNE_FREEZE
+ * option is not passed.
  *
  * presult contains output parameters needed by callers, such as the number of
  * tuples removed and the offsets of dead items on the page after pruning.
@@ -1030,9 +1007,9 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 * be all-visible.
 	 */
 	if (prstate.set_all_visible &&
-		TransactionIdIsNormal(prstate.visibility_cutoff_xid) &&
+		TransactionIdIsNormal(prstate.newest_live_xid) &&
 		GlobalVisTestXidMaybeRunning(prstate.vistest,
-									 prstate.visibility_cutoff_xid))
+									 prstate.newest_live_xid))
 		prstate.set_all_visible = prstate.set_all_frozen = false;
 
 	/*
@@ -1184,7 +1161,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	if (presult->all_frozen)
 		presult->vm_conflict_horizon = InvalidTransactionId;
 	else
-		presult->vm_conflict_horizon = prstate.visibility_cutoff_xid;
+		presult->vm_conflict_horizon = prstate.newest_live_xid;
 
 	presult->lpdead_items = prstate.lpdead_items;
 	/* the presult->deadoffsets array was already filled in */
@@ -1644,6 +1621,7 @@ static void
 heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 {
 	HeapTupleHeader htup;
+	TransactionId xmin;
 	Page		page = prstate->page;
 
 	Assert(!prstate->processed[offnum]);
@@ -1691,32 +1669,27 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			 * See SetHintBits for more info.  Check that the tuple is hinted
 			 * xmin-committed because of that.
 			 */
-			if (prstate->set_all_visible)
+			if (!HeapTupleHeaderXminCommitted(htup))
 			{
-				TransactionId xmin;
+				prstate->set_all_visible = false;
+				prstate->set_all_frozen = false;
+				break;
+			}
 
-				if (!HeapTupleHeaderXminCommitted(htup))
-				{
-					prstate->set_all_visible = false;
-					prstate->set_all_frozen = false;
-					break;
-				}
+			/*
+			 * The inserter definitely committed. But we don't know if it is
+			 * old enough that everyone sees it as committed. Later, after
+			 * processing all the tuples on the page, we'll check if there is
+			 * any snapshot that still considers the newest xid on the page to
+			 * be running. If so, we don't consider the page all-visible.
+			 */
+			xmin = HeapTupleHeaderGetXmin(htup);
 
-				/*
-				 * The inserter definitely committed. But we don't know if it
-				 * is old enough that everyone sees it as committed. Later,
-				 * after processing all the tuples on the page, we'll check if
-				 * there is any snapshot that still considers the newest xid
-				 * on the page to be running. If so, we don't consider the
-				 * page all-visible.
-				 */
-				xmin = HeapTupleHeaderGetXmin(htup);
+			/* Track newest xmin on page. */
+			if (TransactionIdFollows(xmin, prstate->newest_live_xid) &&
+				TransactionIdIsNormal(xmin))
+				prstate->newest_live_xid = xmin;
 
-				/* Track newest xmin on page. */
-				if (TransactionIdFollows(xmin, prstate->visibility_cutoff_xid) &&
-					TransactionIdIsNormal(xmin))
-					prstate->visibility_cutoff_xid = xmin;
-			}
 			break;
 
 		case HEAPTUPLE_RECENTLY_DEAD:
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6c7807d5bd3..b5370ec26da 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -462,7 +462,7 @@ static void dead_items_cleanup(LVRelState *vacrel);
 static bool heap_page_is_all_visible(Relation rel, Buffer buf,
 									 GlobalVisState *vistest,
 									 bool *all_frozen,
-									 TransactionId *visibility_cutoff_xid,
+									 TransactionId *newest_live_xid,
 									 OffsetNumber *logging_offnum);
 #endif
 static bool heap_page_would_be_all_visible(Relation rel, Buffer buf,
@@ -470,7 +470,7 @@ static bool heap_page_would_be_all_visible(Relation rel, Buffer buf,
 										   OffsetNumber *deadoffsets,
 										   int ndeadoffsets,
 										   bool *all_frozen,
-										   TransactionId *visibility_cutoff_xid,
+										   TransactionId *newest_live_xid,
 										   OffsetNumber *logging_offnum);
 static void update_relstats_all_indexes(LVRelState *vacrel);
 static void vacuum_error_callback(void *arg);
@@ -2788,7 +2788,7 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	Page		page = BufferGetPage(buffer);
 	OffsetNumber unused[MaxHeapTuplesPerPage];
 	int			nunused = 0;
-	TransactionId visibility_cutoff_xid;
+	TransactionId newest_live_xid;
 	TransactionId conflict_xid = InvalidTransactionId;
 	bool		all_frozen;
 	LVSavedErrInfo saved_err_info;
@@ -2814,14 +2814,14 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	if (heap_page_would_be_all_visible(vacrel->rel, buffer,
 									   vacrel->vistest,
 									   deadoffsets, num_offsets,
-									   &all_frozen, &visibility_cutoff_xid,
+									   &all_frozen, &newest_live_xid,
 									   &vacrel->offnum))
 	{
 		vmflags |= VISIBILITYMAP_ALL_VISIBLE;
 		if (all_frozen)
 		{
 			vmflags |= VISIBILITYMAP_ALL_FROZEN;
-			Assert(!TransactionIdIsValid(visibility_cutoff_xid));
+			Assert(!TransactionIdIsValid(newest_live_xid));
 		}
 
 		/*
@@ -2862,7 +2862,7 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 		visibilitymap_set_vmbits(blkno,
 								 vmbuffer, vmflags,
 								 vacrel->rel->rd_locator);
-		conflict_xid = visibility_cutoff_xid;
+		conflict_xid = newest_live_xid;
 	}
 
 	/*
@@ -3575,7 +3575,7 @@ static bool
 heap_page_is_all_visible(Relation rel, Buffer buf,
 						 GlobalVisState *vistest,
 						 bool *all_frozen,
-						 TransactionId *visibility_cutoff_xid,
+						 TransactionId *newest_live_xid,
 						 OffsetNumber *logging_offnum)
 {
 
@@ -3583,7 +3583,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 										  vistest,
 										  NULL, 0,
 										  all_frozen,
-										  visibility_cutoff_xid,
+										  newest_live_xid,
 										  logging_offnum);
 }
 #endif
@@ -3606,7 +3606,7 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
  * Output parameters:
  *
  *  - *all_frozen: true if every tuple on the page is frozen
- *  - *visibility_cutoff_xid: newest xmin; valid only if page is all-visible
+ *  - *newest_live_xid: newest xmin of live tuples on the page
  *  - *logging_offnum: OffsetNumber of current tuple being processed;
  *     used by vacuum's error callback system.
  *
@@ -3624,7 +3624,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 							   OffsetNumber *deadoffsets,
 							   int ndeadoffsets,
 							   bool *all_frozen,
-							   TransactionId *visibility_cutoff_xid,
+							   TransactionId *newest_live_xid,
 							   OffsetNumber *logging_offnum)
 {
 	Page		page = BufferGetPage(buf);
@@ -3634,7 +3634,7 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 	bool		all_visible = true;
 	int			matched_dead_count = 0;
 
-	*visibility_cutoff_xid = InvalidTransactionId;
+	*newest_live_xid = InvalidTransactionId;
 	*all_frozen = true;
 
 	Assert(ndeadoffsets == 0 || deadoffsets);
@@ -3723,9 +3723,9 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 					xmin = HeapTupleHeaderGetXmin(tuple.t_data);
 
 					/* Track newest xmin on page. */
-					if (TransactionIdFollows(xmin, *visibility_cutoff_xid) &&
+					if (TransactionIdFollows(xmin, *newest_live_xid) &&
 						TransactionIdIsNormal(xmin))
-						*visibility_cutoff_xid = xmin;
+						*newest_live_xid = xmin;
 
 					/* Check whether this tuple is already frozen or not */
 					if (all_visible && *all_frozen &&
@@ -3755,8 +3755,8 @@ heap_page_would_be_all_visible(Relation rel, Buffer buf,
 	 * cannot be all-visible.
 	 */
 	if (all_visible &&
-		TransactionIdIsNormal(*visibility_cutoff_xid) &&
-		GlobalVisTestXidMaybeRunning(vistest, *visibility_cutoff_xid))
+		TransactionIdIsNormal(*newest_live_xid) &&
+		GlobalVisTestXidMaybeRunning(vistest, *newest_live_xid))
 	{
 		all_visible = false;
 		*all_frozen = false;
-- 
2.43.0



  [text/x-patch] v35-0010-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch (27.3K, 11-v35-0010-Eliminate-XLOG_HEAP2_VISIBLE-from-vacuum-phase-I.patch)
From 479ab7c11c1e48c938934706acf21cff460297c0 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Tue, 2 Dec 2025 15:07:42 -0500
Subject: [PATCH v35 10/18] Eliminate XLOG_HEAP2_VISIBLE from vacuum phase I
 prune/freeze

Vacuum no longer emits a separate WAL record for each page set
all-visible or all-frozen during phase I. Instead, visibility map
updates are now included in the XLOG_HEAP2_PRUNE_VACUUM_SCAN record that
is already emitted for pruning and freezing.

Previously, heap_page_prune_and_freeze() determined whether a page was
all-visible, but the corresponding VM bits were only set later in
lazy_scan_prune(). Now the VM is updated immediately in
heap_page_prune_and_freeze(), at the same time as the heap
modifications.

This change applies only to vacuum phase I, not to pruning performed
during normal page access.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Kirill Reshke <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
---
 src/backend/access/heap/pruneheap.c  | 321 ++++++++++++++++++++-------
 src/backend/access/heap/vacuumlazy.c | 107 +--------
 src/include/access/heapam.h          |  37 ++-
 3 files changed, 266 insertions(+), 199 deletions(-)
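
Because one WAL record now covers pruning, freezing and the VM update, it
needs a single snapshot conflict horizon. The sketch below mirrors the
decision order of get_conflict_xid() from this patch in standalone form.
The VM_* macros are local stand-ins for the VISIBILITYMAP_* flags, and plain
'>' replaces TransactionIdFollows(), so XID wraparound is ignored; both are
simplifying assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Local stand-ins for the visibility map flag bits. */
#define VM_ALL_VISIBLE 0x01
#define VM_ALL_FROZEN  0x02

/*
 * Pick the newest of the horizons implied by each change in the record.
 */
static TransactionId
sketch_conflict_xid(bool do_prune, bool do_freeze, bool do_set_vm,
					uint8_t old_vmbits, uint8_t new_vmbits,
					TransactionId latest_xid_removed,
					TransactionId newest_frozen_xid,
					TransactionId newest_live_xid)
{
	TransactionId conflict_xid = InvalidTransactionId;

	/*
	 * Only marking an already all-visible page all-frozen: any conflict was
	 * handled when the tuples were frozen, so no horizon is needed.
	 */
	if (!do_prune && !do_freeze &&
		(old_vmbits & VM_ALL_VISIBLE) &&
		(new_vmbits & VM_ALL_FROZEN))
		return InvalidTransactionId;

	if (do_set_vm)
		conflict_xid = newest_live_xid;
	if (do_freeze && newest_frozen_xid > conflict_xid)
		conflict_xid = newest_frozen_xid;

	/* A removed tuple newer than everything so far wins. */
	if (latest_xid_removed > conflict_xid)
	{
		assert(do_prune);
		conflict_xid = latest_xid_removed;
	}

	return conflict_xid;
}
```

For example, a record that prunes (latest removed XID 700), freezes (newest
frozen XID 650) and sets the page all-visible (newest live XID 600) gets 700
as its horizon in this sketch.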

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index dd731f64bc6..d41e1c6fce4 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -72,6 +72,21 @@ typedef struct
 	OffsetNumber nowunused[MaxHeapTuplesPerPage];
 	HeapTupleFreeze frozen[MaxHeapTuplesPerPage];
 
+	/*
+	 * set_all_visible and set_all_frozen indicate if the all-visible and
+	 * all-frozen bits in the visibility map can be set for this page after
+	 * pruning.
+	 *
+	 * NOTE: set_all_visible and set_all_frozen initially don't include
+	 * LP_DEAD items. That's convenient for heap_page_prune_and_freeze() to
+	 * use them to decide whether to opportunistically freeze the page or not.
+	 * The set_all_visible and set_all_frozen values ultimately used to set
+	 * the VM are adjusted to include LP_DEAD items after we determine whether
+	 * or not to opportunistically freeze.
+	 */
+	bool		set_all_visible;
+	bool		set_all_frozen;
+
 	/*-------------------------------------------------------
 	 * Working state for HOT chain processing
 	 *-------------------------------------------------------
@@ -122,12 +137,16 @@ typedef struct
 	/*
 	 * Caller must provide a pinned vmbuffer corresponding to the heap block
 	 * passed to heap_page_prune_and_freeze(). We will fix any corruption
-	 * found in the VM.
+	 * found in the VM and set the VM if the page is all-visible/all-frozen.
 	 */
 	Buffer		vmbuffer;
 
-	/* Bits in the vmbuffer for this heap page */
-	uint8		vmbits;
+	/*
+	 * The state of the VM bits at the beginning of pruning and the state they
+	 * will be in at the end.
+	 */
+	uint8		old_vmbits;
+	uint8		new_vmbits;
 
 	/* The newest xmin of live tuples on the page */
 	TransactionId newest_live_xid;
@@ -157,21 +176,6 @@ typedef struct
 	 */
 	int			lpdead_items;	/* number of items in the array */
 	OffsetNumber *deadoffsets;	/* points directly to presult->deadoffsets */
-
-	/*
-	 * set_all_visible and set_all_frozen indicate if the all-visible and
-	 * all-frozen bits in the visibility map can be set for this page after
-	 * pruning.
-	 *
-	 * NOTE: set_all_visible and set_all_frozen initially don't include
-	 * LP_DEAD items. That's convenient for heap_page_prune_and_freeze() to
-	 * use them to decide whether to freeze the page or not.  The
-	 * set_all_visible and set_all_frozen values returned to the caller are
-	 * adjusted to include LP_DEAD items after we determine whether to
-	 * opportunistically freeze.
-	 */
-	bool		set_all_visible;
-	bool		set_all_frozen;
 } PruneState;
 
 /* Local functions */
@@ -209,6 +213,12 @@ static void page_verify_redirects(Page page);
 
 static bool heap_page_will_freeze(bool did_tuple_hint_fpi, bool do_prune, bool do_hint_prune,
 								  PruneState *prstate);
+static bool heap_page_will_set_vm(PruneState *prstate, PruneReason reason);
+static TransactionId get_conflict_xid(bool do_prune, bool do_freeze, bool do_set_vm,
+									  uint8 old_vmbits, uint8 new_vmbits,
+									  TransactionId latest_xid_removed,
+									  TransactionId newest_frozen_xid,
+									  TransactionId newest_live_xid);
 
 
 /*
@@ -373,9 +383,10 @@ prune_freeze_setup(PruneFreezeParams *params,
 
 	Assert(BufferIsValid(params->vmbuffer));
 	prstate->vmbuffer = params->vmbuffer;
-	prstate->vmbits = visibilitymap_get_status(prstate->relation,
-											   prstate->block,
-											   &prstate->vmbuffer);
+	prstate->new_vmbits = 0;
+	prstate->old_vmbits = visibilitymap_get_status(prstate->relation,
+												   prstate->block,
+												   &prstate->vmbuffer);
 
 	/*
 	 * Our strategy is to scan the page and make lists of items to change,
@@ -445,7 +456,7 @@ prune_freeze_setup(PruneFreezeParams *params,
 
 	/*
 	 * Currently, only VACUUM performs freezing, but other callers may in the
-	 * future. Other callers must initialize prstate.all_frozen to false,
+	 * future. Other callers must initialize prstate.set_all_frozen to false,
 	 * since we will not call heap_prepare_freeze_tuple() for each tuple.
 	 *
 	 * We only consider opportunistic freezing if the page would become
@@ -774,6 +785,66 @@ heap_page_will_freeze(bool did_tuple_hint_fpi,
 	return do_freeze;
 }
 
+/*
+ * Calculate the conflict horizon for the whole XLOG_HEAP2_PRUNE_VACUUM_SCAN
+ * or XLOG_HEAP2_PRUNE_ON_ACCESS record.
+ */
+static TransactionId
+get_conflict_xid(bool do_prune, bool do_freeze, bool do_set_vm,
+				 uint8 old_vmbits, uint8 new_vmbits,
+				 TransactionId latest_xid_removed,
+				 TransactionId newest_frozen_xid,
+				 TransactionId newest_live_xid)
+{
+	TransactionId conflict_xid = InvalidTransactionId;
+
+	/*
+	 * We can omit the snapshot conflict horizon if we are not pruning or
+	 * freezing any tuples and are setting an already all-visible page
+	 * all-frozen in the VM. In this case, all of the tuples on the page must
+	 * already be seen as frozen by all MVCC snapshots on the standby (any
+	 * conflict would have been handled in reaction to the WAL record freezing
+	 * those tuples).
+	 */
+	if (!do_prune &&
+		!do_freeze &&
+		(old_vmbits & VISIBILITYMAP_ALL_VISIBLE) &&
+		(new_vmbits & VISIBILITYMAP_ALL_FROZEN))
+		return InvalidTransactionId;
+
+	/*
+	 * The snapshot conflict horizon for the whole record should be the most
+	 * conservative (newest) of all the horizons calculated for any of the
+	 * possible modifications. If this record will prune tuples, any queries
+	 * on the standby with xmin older than the youngest XID of the most
+	 * recently removed tuple this record will prune will conflict.  If this
+	 * record will freeze tuples, any queries on the standby with xmin older
+	 * than the youngest tuple this record will freeze will conflict.
+	 *
+	 * If we are setting the VM, the conflict horizon is almost always the
+	 * newest live XID, except in the situation described above.
+	 *
+	 * By picking the newest of all of those, we can ensure that all changes
+	 * in the record have been taken into account.
+	 */
+	if (do_set_vm)
+		conflict_xid = newest_live_xid;
+	if (do_freeze && TransactionIdFollows(newest_frozen_xid, conflict_xid))
+		conflict_xid = newest_frozen_xid;
+
+	/*
+	 * If we are removing tuples with a younger XID than our so far calculated
+	 * conflict_xid, we must use this as our horizon.
+	 */
+	if (TransactionIdFollows(latest_xid_removed, conflict_xid))
+	{
+		Assert(do_prune);
+		conflict_xid = latest_xid_removed;
+	}
+
+	return conflict_xid;
+}
+
 /*
  * Helper to fix visibility-related corruption on a heap page and its
  * corresponding VM page. An all-visible page cannot have dead items nor can
@@ -839,7 +910,7 @@ heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
 		PageClearAllVisible(prstate->page);
 		MarkBufferDirtyHint(prstate->buffer, true);
 	}
-	else if (prstate->vmbits & VISIBILITYMAP_VALID_BITS)
+	else if (prstate->old_vmbits & VISIBILITYMAP_VALID_BITS)
 	{
 		/*
 		 * As of PostgreSQL 9.2, the visibility map bit should never be set if
@@ -856,7 +927,43 @@ heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
 
 	visibilitymap_clear(prstate->relation, prstate->block, prstate->vmbuffer,
 						VISIBILITYMAP_VALID_BITS);
-	prstate->vmbits = 0;
+	prstate->old_vmbits = 0;
+}
+
+/*
+ * Decide whether to set the visibility map bits (all-visible and all-frozen)
+ * for heap_blk using information from the PruneState and VM.
+ *
+ * This function does not actually set the VM bits or page-level visibility
+ * hint, PD_ALL_VISIBLE.
+ *
+ * Returns true if one or both VM bits should be set and false otherwise.
+ */
+static bool
+heap_page_will_set_vm(PruneState *prstate, PruneReason reason)
+{
+	/*
+	 * Though on-access pruning maintains prstate->set_all_visible, we don't
+	 * consider setting the VM.
+	 */
+	if (reason == PRUNE_ON_ACCESS)
+		return false;
+
+	if (!prstate->set_all_visible)
+		return false;
+
+	prstate->new_vmbits = VISIBILITYMAP_ALL_VISIBLE;
+
+	if (prstate->set_all_frozen)
+		prstate->new_vmbits |= VISIBILITYMAP_ALL_FROZEN;
+
+	if (prstate->new_vmbits == prstate->old_vmbits)
+	{
+		prstate->new_vmbits = 0;
+		return false;
+	}
+
+	return true;
 }
 
 /*
@@ -885,8 +992,8 @@ heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
 	OffsetNumber maxoff = PageGetMaxOffsetNumber(prstate->page);
 	Page		page = prstate->page;
 
-	Assert(prstate->vmbits & VISIBILITYMAP_ALL_FROZEN ||
-		   (prstate->vmbits & VISIBILITYMAP_ALL_VISIBLE &&
+	Assert(prstate->old_vmbits & VISIBILITYMAP_ALL_FROZEN ||
+		   (prstate->old_vmbits & VISIBILITYMAP_ALL_VISIBLE &&
 			!prstate->attempt_freeze));
 
 	/* We'll fill in presult for the caller */
@@ -913,15 +1020,14 @@ heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
 		MarkBufferDirtyHint(prstate->buffer, true);
 	}
 
-	presult->vmbits = prstate->vmbits;
-
 	if (!PageIsEmpty(page))
 		presult->hastup = true;
 }
 
 /*
  * Prune and repair fragmentation and potentially freeze tuples on the
- * specified page.
+ * specified page. If the page's visibility status has changed, update it in
+ * the VM.
  *
  * Caller must have pin and buffer cleanup lock on the page.  Note that we
  * don't update the FSM information for page on caller's behalf.  Caller might
@@ -936,12 +1042,10 @@ heap_page_bypass_prune_freeze(PruneState *prstate, PruneFreezeResult *presult)
  * tuples if it's required in order to advance relfrozenxid / relminmxid, or
  * if it's considered advantageous for overall system performance to do so
  * now.  The 'params.cutoffs', 'presult', 'new_relfrozen_xid' and
- * 'new_relmin_mxid' arguments are required when freezing.  When
- * HEAP_PAGE_PRUNE_FREEZE option is passed, we also set
- * presult->set_all_visible and presult->set_all_frozen after determining
- * whether or not to opportunistically freeze, to indicate if the VM bits can
- * be set. 'all-frozen' is always set to false when the HEAP_PAGE_PRUNE_FREEZE
- * option is not passed.
+ * 'new_relmin_mxid' arguments are required when freezing.
+ *
+ * A vmbuffer corresponding to the heap page is also passed and, if the page
+ * is found to be all-visible/all-frozen, we will set it in the VM.
  *
  * presult contains output parameters needed by callers, such as the number of
  * tuples removed and the offsets of dead items on the page after pruning.
@@ -969,15 +1073,17 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	bool		do_freeze;
 	bool		do_prune;
 	bool		do_hint_prune;
+	bool		do_set_vm;
 	bool		did_tuple_hint_fpi;
 	int64		fpi_before = pgWalUsage.wal_fpi;
+	TransactionId conflict_xid;
 
 	/* Initialize prstate */
 	prune_freeze_setup(params,
 					   new_relfrozen_xid, new_relmin_mxid,
 					   presult, &prstate);
 
-	if ((prstate.vmbits & VISIBILITYMAP_VALID_BITS) &&
+	if ((prstate.old_vmbits & VISIBILITYMAP_VALID_BITS) &&
 		!PageIsAllVisible(prstate.page))
 		heap_fix_vm_corruption(&prstate, InvalidOffsetNumber);
 
@@ -986,8 +1092,8 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	 * is not being attempted, we can exit early. Do this after fixing any
 	 * discrepancy between the page-level visibility hint and the VM.
 	 */
-	if (prstate.vmbits & VISIBILITYMAP_ALL_FROZEN ||
-		(prstate.vmbits & VISIBILITYMAP_ALL_VISIBLE && !prstate.attempt_freeze))
+	if (prstate.old_vmbits & VISIBILITYMAP_ALL_FROZEN ||
+		(prstate.old_vmbits & VISIBILITYMAP_ALL_VISIBLE && !prstate.attempt_freeze))
 	{
 		heap_page_bypass_prune_freeze(&prstate, presult);
 		return;
@@ -1058,6 +1164,25 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		prstate.set_all_visible = prstate.set_all_frozen = false;
 
 	Assert(!prstate.set_all_frozen || prstate.set_all_visible);
+	Assert(!prstate.set_all_visible || (prstate.lpdead_items == 0));
+
+	do_set_vm = heap_page_will_set_vm(&prstate, params->reason);
+
+	/*
+	 * new_vmbits should be 0 regardless of whether or not the page is
+	 * all-visible if we do not intend to set the VM.
+	 */
+	Assert(do_set_vm || prstate.new_vmbits == 0);
+
+	conflict_xid = get_conflict_xid(do_prune, do_freeze, do_set_vm,
+									prstate.old_vmbits, prstate.new_vmbits,
+									prstate.latest_xid_removed,
+									prstate.pagefrz.FreezePageConflictXid,
+									prstate.newest_live_xid);
+
+	/* Lock vmbuffer before entering a critical section */
+	if (do_set_vm)
+		LockBuffer(prstate.vmbuffer, BUFFER_LOCK_EXCLUSIVE);
 
 	/* Any error while applying the changes is critical */
 	START_CRIT_SECTION();
@@ -1079,14 +1204,17 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 
 		/*
 		 * If that's all we had to do to the page, this is a non-WAL-logged
-		 * hint.  If we are going to freeze or prune the page, we will mark
-		 * the buffer dirty below.
+		 * hint.  If we are going to freeze or prune the page or set
+		 * PD_ALL_VISIBLE, we will mark the buffer dirty below.
+		 *
+		 * Setting PD_ALL_VISIBLE is fully WAL-logged because it is forbidden
+		 * for the VM to be set and PD_ALL_VISIBLE to be clear.
 		 */
-		if (!do_freeze && !do_prune)
+		if (!do_freeze && !do_prune && !do_set_vm)
 			MarkBufferDirtyHint(prstate.buffer, true);
 	}
 
-	if (do_prune || do_freeze)
+	if (do_prune || do_freeze || do_set_vm)
 	{
 		/* Apply the planned item changes and repair page fragmentation. */
 		if (do_prune)
@@ -1100,6 +1228,27 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		if (do_freeze)
 			heap_freeze_prepared_tuples(prstate.buffer, prstate.frozen, prstate.nfrozen);
 
+		/* Set the visibility map and page visibility hint */
+		if (do_set_vm)
+		{
+			/*
+			 * While it is valid for PD_ALL_VISIBLE to be set when the
+			 * corresponding VM bit is clear, we strongly prefer to keep them
+			 * in sync.
+			 *
+			 * The heap buffer must be marked dirty before adding it to the
+			 * WAL chain when setting the VM. We don't worry about
+			 * unnecessarily dirtying the heap buffer if PD_ALL_VISIBLE is
+			 * already set, though. It is extremely rare to have a clean heap
+			 * buffer with PD_ALL_VISIBLE already set and the VM bits clear,
+			 * so there is no point in optimizing it.
+			 */
+			PageSetAllVisible(prstate.page);
+			PageClearPrunable(prstate.page);
+			visibilitymap_set_vmbits(prstate.block, prstate.vmbuffer, prstate.new_vmbits,
+									 prstate.relation->rd_locator);
+		}
+
 		MarkBufferDirty(prstate.buffer);
 
 		/*
@@ -1107,29 +1256,12 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 		 */
 		if (RelationNeedsWAL(prstate.relation))
 		{
-			/*
-			 * The snapshotConflictHorizon for the whole record should be the
-			 * most conservative of all the horizons calculated for any of the
-			 * possible modifications.  If this record will prune tuples, any
-			 * transactions on the standby older than the youngest xid of the
-			 * most recently removed tuple this record will prune will
-			 * conflict.  If this record will freeze tuples, any transactions
-			 * on the standby with xids older than the youngest tuple this
-			 * record will freeze will conflict.
-			 */
-			TransactionId conflict_xid;
-
-			if (TransactionIdFollows(prstate.pagefrz.FreezePageConflictXid,
-									 prstate.latest_xid_removed))
-				conflict_xid = prstate.pagefrz.FreezePageConflictXid;
-			else
-				conflict_xid = prstate.latest_xid_removed;
-
 			log_heap_prune_and_freeze(prstate.relation, prstate.buffer,
-									  InvalidBuffer,	/* vmbuffer */
-									  0,	/* vmflags */
+									  do_set_vm ? prstate.vmbuffer : InvalidBuffer,
+									  do_set_vm ? prstate.new_vmbits : 0,
 									  conflict_xid,
-									  true, params->reason,
+									  true, /* cleanup lock */
+									  params->reason,
 									  prstate.frozen, prstate.nfrozen,
 									  prstate.redirected, prstate.nredirected,
 									  prstate.nowdead, prstate.ndead,
@@ -1139,33 +1271,64 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 
 	END_CRIT_SECTION();
 
+	if (do_set_vm)
+		LockBuffer(prstate.vmbuffer, BUFFER_LOCK_UNLOCK);
+
+	/*
+	 * During its second pass over the heap, VACUUM calls
+	 * heap_page_would_be_all_visible() to determine whether a page is
+	 * all-visible and all-frozen. The logic here is similar. After completing
+	 * pruning and freezing, use an assertion to verify that our results
+	 * remain consistent with heap_page_would_be_all_visible().
+	 */
+#ifdef USE_ASSERT_CHECKING
+	if (prstate.set_all_visible)
+	{
+		TransactionId debug_cutoff;
+		bool		debug_all_frozen;
+
+		Assert(prstate.lpdead_items == 0);
+
+		Assert(heap_page_is_all_visible(prstate.relation, prstate.buffer,
+										prstate.vistest,
+										&debug_all_frozen,
+										&debug_cutoff, off_loc));
+
+		/*
+		 * It's possible the page is composed entirely of frozen tuples but is
+		 * not set all-frozen in the VM and the caller did not pass
+		 * HEAP_PAGE_PRUNE_FREEZE. In this case, heap_page_is_all_visible()
+		 * may find the page completely frozen even though
+		 * prstate.set_all_frozen is false.
+		 */
+		Assert(!prstate.set_all_frozen || debug_all_frozen);
+	}
+#endif
+
 	/* Copy information back for caller */
 	presult->ndeleted = prstate.ndeleted;
 	presult->nnewlpdead = prstate.ndead;
 	presult->nfrozen = prstate.nfrozen;
 	presult->live_tuples = prstate.live_tuples;
 	presult->recently_dead_tuples = prstate.recently_dead_tuples;
-	presult->all_visible = prstate.set_all_visible;
-	presult->all_frozen = prstate.set_all_frozen;
 	presult->hastup = prstate.hastup;
-	presult->vmbits = prstate.vmbits;
-
-	/*
-	 * For callers planning to update the visibility map, the conflict horizon
-	 * for that record must be the newest xmin on the page.  However, if the
-	 * page is completely frozen, there can be no conflict and the
-	 * vm_conflict_horizon should remain InvalidTransactionId.  This includes
-	 * the case that we just froze all the tuples; the prune-freeze record
-	 * included the conflict XID already so the caller doesn't need it.
-	 */
-	if (presult->all_frozen)
-		presult->vm_conflict_horizon = InvalidTransactionId;
-	else
-		presult->vm_conflict_horizon = prstate.newest_live_xid;
 
 	presult->lpdead_items = prstate.lpdead_items;
 	/* the presult->deadoffsets array was already filled in */
 
+	if (do_set_vm)
+	{
+		if ((prstate.old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		{
+			presult->new_all_visible_pages = 1;
+			if (prstate.set_all_frozen)
+				presult->new_all_visible_frozen_pages = 1;
+		}
+		else if ((prstate.old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
+				 prstate.set_all_frozen)
+			presult->new_all_frozen_pages = 1;
+	}
+
 	if (prstate.attempt_freeze)
 	{
 		if (presult->nfrozen > 0)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b5370ec26da..4678e0b9c26 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -458,13 +458,6 @@ static void dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *
 static void dead_items_reset(LVRelState *vacrel);
 static void dead_items_cleanup(LVRelState *vacrel);
 
-#ifdef USE_ASSERT_CHECKING
-static bool heap_page_is_all_visible(Relation rel, Buffer buf,
-									 GlobalVisState *vistest,
-									 bool *all_frozen,
-									 TransactionId *newest_live_xid,
-									 OffsetNumber *logging_offnum);
-#endif
 static bool heap_page_would_be_all_visible(Relation rel, Buffer buf,
 										   GlobalVisState *vistest,
 										   OffsetNumber *deadoffsets,
@@ -1995,8 +1988,6 @@ lazy_scan_prune(LVRelState *vacrel,
 		.vistest = vacrel->vistest,
 		.cutoffs = &vacrel->cutoffs,
 	};
-	uint8		old_vmbits = 0;
-	uint8		new_vmbits = 0;
 
 	Assert(BufferGetBlockNumber(buf) == blkno);
 
@@ -2037,29 +2028,6 @@ lazy_scan_prune(LVRelState *vacrel,
 		vacrel->new_frozen_tuple_pages++;
 	}
 
-	/*
-	 * VACUUM will call heap_page_is_all_visible() during the second pass over
-	 * the heap to determine all_visible and all_frozen for the page -- this
-	 * is a specialized version of the logic from this function.  Now that
-	 * we've finished pruning and freezing, make sure that we're in total
-	 * agreement with heap_page_is_all_visible() using an assertion.
-	 */
-#ifdef USE_ASSERT_CHECKING
-	if (presult.all_visible)
-	{
-		TransactionId debug_cutoff;
-		bool		debug_all_frozen;
-
-		Assert(presult.lpdead_items == 0);
-
-		Assert(heap_page_is_all_visible(vacrel->rel, buf,
-										vacrel->vistest, &debug_all_frozen,
-										&debug_cutoff, &vacrel->offnum));
-
-		Assert(presult.all_frozen == debug_all_frozen);
-	}
-#endif
-
 	/*
 	 * Now save details of the LP_DEAD items from the page in vacrel
 	 */
@@ -2080,6 +2048,14 @@ lazy_scan_prune(LVRelState *vacrel,
 	}
 
 	/* Finally, add page-local counts to whole-VACUUM counts */
+	vacrel->new_all_visible_pages += presult.new_all_visible_pages;
+	vacrel->new_all_visible_all_frozen_pages += presult.new_all_visible_frozen_pages;
+	vacrel->new_all_frozen_pages += presult.new_all_frozen_pages;
+
+	/* Record whether the page was newly set all-frozen in the VM */
+	*vm_page_frozen = presult.new_all_visible_frozen_pages > 0 ||
+		presult.new_all_frozen_pages > 0;
+
 	vacrel->tuples_deleted += presult.ndeleted;
 	vacrel->tuples_frozen += presult.nfrozen;
 	vacrel->lpdead_items += presult.lpdead_items;
@@ -2093,71 +2069,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	/* Did we find LP_DEAD items? */
 	*has_lpdead_items = (presult.lpdead_items > 0);
 
-	Assert(!presult.all_visible || !(*has_lpdead_items));
-	Assert(!presult.all_frozen || presult.all_visible);
-
-	if (!presult.all_visible)
-		return presult.ndeleted;
-
-	/* Set the visibility map and page visibility hint */
-	old_vmbits = presult.vmbits;
-	new_vmbits = VISIBILITYMAP_ALL_VISIBLE;
-	if (presult.all_frozen)
-		new_vmbits |= VISIBILITYMAP_ALL_FROZEN;
-
-	/* Nothing to do */
-	if (old_vmbits == new_vmbits)
-		return presult.ndeleted;
-
-	/*
-	 * It should never be the case that the visibility map page is set while
-	 * the page-level bit is clear (and if so, we cleared it above), but the
-	 * reverse is allowed (if checksums are not enabled). Regardless, set both
-	 * bits so that we get back in sync.
-	 *
-	 * The heap buffer must be marked dirty before adding it to the WAL chain
-	 * when setting the VM. We don't worry about unnecessarily dirtying the
-	 * heap buffer if PD_ALL_VISIBLE is already set, though. It is extremely
-	 * rare to have a clean heap buffer with PD_ALL_VISIBLE already set and
-	 * the VM bits clear, so there is no point in optimizing it.
-	 */
-	PageSetAllVisible(page);
-	PageClearPrunable(page);
-	MarkBufferDirty(buf);
-
-	/*
-	 * If the page is being set all-frozen, we pass InvalidTransactionId as
-	 * the cutoff_xid, since a snapshot conflict horizon sufficient to make
-	 * everything safe for REDO was logged when the page's tuples were frozen.
-	 */
-	Assert(!presult.all_frozen ||
-		   !TransactionIdIsValid(presult.vm_conflict_horizon));
-
-	visibilitymap_set(vacrel->rel, blkno, buf,
-					  InvalidXLogRecPtr,
-					  vmbuffer, presult.vm_conflict_horizon,
-					  new_vmbits);
-
-	/*
-	 * If the page wasn't already set all-visible and/or all-frozen in the VM,
-	 * count it as newly set for logging.
-	 */
-	if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
-	{
-		vacrel->new_all_visible_pages++;
-		if (presult.all_frozen)
-		{
-			vacrel->new_all_visible_all_frozen_pages++;
-			*vm_page_frozen = true;
-		}
-	}
-	else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
-			 presult.all_frozen)
-	{
-		vacrel->new_all_frozen_pages++;
-		*vm_page_frozen = true;
-	}
-
 	return presult.ndeleted;
 }
 
@@ -3571,7 +3482,7 @@ dead_items_cleanup(LVRelState *vacrel)
  * that expect no LP_DEAD on the page. Currently assert-only, but there is no
  * reason not to use it outside of asserts.
  */
-static bool
+bool
 heap_page_is_all_visible(Relation rel, Buffer buf,
 						 GlobalVisState *vistest,
 						 bool *all_frozen,
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index e401dd52e25..7ef4cbbfb1e 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -260,7 +260,8 @@ typedef struct PruneFreezeParams
 
 	/*
 	 * Callers should provide a pinned vmbuffer corresponding to the heap
-	 * block in buffer. We will check for and repair any corruption in the VM.
+	 * block in buffer. We will check for and repair any corruption in the VM
+	 * and set the VM after pruning if the page is all-visible/all-frozen.
 	 */
 	Buffer		vmbuffer;
 
@@ -276,8 +277,7 @@ typedef struct PruneFreezeParams
 	 * HEAP_PAGE_PRUNE_MARK_UNUSED_NOW indicates that dead items can be set
 	 * LP_UNUSED during pruning.
 	 *
-	 * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples, and
-	 * will return 'all_visible', 'all_frozen' flags to the caller.
+	 * HEAP_PAGE_PRUNE_FREEZE indicates that we will also freeze tuples.
 	 */
 	int			options;
 
@@ -311,25 +311,12 @@ typedef struct PruneFreezeResult
 	int			recently_dead_tuples;
 
 	/*
-	 * all_visible and all_frozen indicate if the all-visible and all-frozen
-	 * bits in the visibility map can be set for this page, after pruning.
-	 *
-	 * vm_conflict_horizon is the newest xmin of live tuples on the page.  The
-	 * caller can use it as the conflict horizon when setting the VM bits.  It
-	 * is only valid if we froze some tuples (nfrozen > 0), and all_frozen is
-	 * true.
-	 *
-	 * These are only set if the HEAP_PAGE_PRUNE_FREEZE option is set.
-	 */
-	bool		all_visible;
-	bool		all_frozen;
-	TransactionId vm_conflict_horizon;
-
-	/*
-	 * vmbits is the value of the vmbuffer's vmbits at the beginning of
-	 * pruning. It is cleared if VM corruption is found and corrected.
+	 * Whether or not the page was newly set all-visible and all-frozen during
+	 * phase I of vacuuming.
 	 */
-	uint8		vmbits;
+	BlockNumber new_all_visible_pages;
+	BlockNumber new_all_visible_frozen_pages;
+	BlockNumber new_all_frozen_pages;
 
 	/*
 	 * Whether or not the page makes rel truncation unsafe.  This is set to
@@ -466,7 +453,13 @@ extern void log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 /* in heap/vacuumlazy.c */
 extern void heap_vacuum_rel(Relation rel,
 							const VacuumParams params, BufferAccessStrategy bstrategy);
-
+#ifdef USE_ASSERT_CHECKING
+extern bool heap_page_is_all_visible(Relation rel, Buffer buf,
+									 GlobalVisState *vistest,
+									 bool *all_frozen,
+									 TransactionId *visibility_cutoff_xid,
+									 OffsetNumber *logging_offnum);
+#endif
 /* in heap/heapam_visibility.c */
 extern bool HeapTupleSatisfiesVisibility(HeapTuple htup, Snapshot snapshot,
 										 Buffer buffer);
-- 
2.43.0
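
For readers following the control flow in 0010, here is a minimal standalone sketch of how the new VM bits are chosen and how the per-page counters are classified afterwards. It is not code from the patch: the SKETCH_ constants, the SketchCounts struct, and both function names are hypothetical stand-ins, and the flag values are assumed to mirror VISIBILITYMAP_ALL_VISIBLE and VISIBILITYMAP_ALL_FROZEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the visibility map flag bits (assumed values). */
#define SKETCH_ALL_VISIBLE 0x01
#define SKETCH_ALL_FROZEN  0x02

/* Hypothetical stand-in for the three new PruneFreezeResult counters. */
typedef struct SketchCounts
{
	int			new_all_visible_pages;
	int			new_all_visible_frozen_pages;
	int			new_all_frozen_pages;
} SketchCounts;

/*
 * Decide whether the VM needs updating.  Returns true and fills *new_vmbits
 * only when this is not on-access pruning, the page is all-visible, and the
 * desired bits differ from the bits already set.  Otherwise *new_vmbits is 0.
 */
static bool
sketch_will_set_vm(bool on_access, bool set_all_visible, bool set_all_frozen,
				   uint8_t old_vmbits, uint8_t *new_vmbits)
{
	uint8_t		wanted = SKETCH_ALL_VISIBLE;

	*new_vmbits = 0;

	if (on_access || !set_all_visible)
		return false;

	if (set_all_frozen)
		wanted |= SKETCH_ALL_FROZEN;

	/* Nothing to do if the VM already has exactly these bits */
	if (wanted == old_vmbits)
		return false;

	*new_vmbits = wanted;
	return true;
}

/*
 * Classify the page for VACUUM's logging counters, based on the VM bits that
 * were set before pruning.
 */
static SketchCounts
sketch_count_page(bool do_set_vm, uint8_t old_vmbits, bool set_all_frozen)
{
	SketchCounts c = {0, 0, 0};

	if (!do_set_vm)
		return c;

	if ((old_vmbits & SKETCH_ALL_VISIBLE) == 0)
	{
		c.new_all_visible_pages = 1;
		if (set_all_frozen)
			c.new_all_visible_frozen_pages = 1;
	}
	else if ((old_vmbits & SKETCH_ALL_FROZEN) == 0 && set_all_frozen)
		c.new_all_frozen_pages = 1;

	return c;
}
```

Under these assumptions, a page that was already all-visible and becomes all-frozen counts only toward new_all_frozen_pages, which is what lets lazy_scan_prune() derive vm_page_frozen from the two frozen counters.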



  [text/x-patch] v35-0011-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch (2.6K, 12-v35-0011-Eliminate-XLOG_HEAP2_VISIBLE-from-empty-page-vac.patch)
From 410c0e06c85c4d686f114635b0044549dc22eceb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Sat, 27 Sep 2025 11:55:21 -0400
Subject: [PATCH v35 11/18] Eliminate XLOG_HEAP2_VISIBLE from empty-page vacuum

As part of removing XLOG_HEAP2_VISIBLE records, phase I of VACUUM now
marks empty pages all-visible in an XLOG_HEAP2_PRUNE_VACUUM_SCAN record.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Chao Li <[email protected]>
---
 src/backend/access/heap/vacuumlazy.c | 35 +++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 4678e0b9c26..68fa77b5318 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1902,9 +1902,12 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 		 */
 		if (!PageIsAllVisible(page))
 		{
+			/* Lock vmbuffer before entering critical section */
+			LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
+
 			START_CRIT_SECTION();
 
-			/* mark buffer dirty before writing a WAL record */
+			/* Mark buffer dirty before writing any WAL records */
 			MarkBufferDirty(buf);
 
 			/*
@@ -1922,13 +1925,33 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			PageClearPrunable(page);
-			visibilitymap_set(vacrel->rel, blkno, buf,
-							  InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
-							  VISIBILITYMAP_ALL_VISIBLE |
-							  VISIBILITYMAP_ALL_FROZEN);
+			visibilitymap_set_vmbits(blkno,
+									 vmbuffer,
+									 VISIBILITYMAP_ALL_VISIBLE |
+									 VISIBILITYMAP_ALL_FROZEN,
+									 vacrel->rel->rd_locator);
+
+			/*
+			 * Emit WAL for setting PD_ALL_VISIBLE on the heap page and
+			 * setting the VM.
+			 */
+			if (RelationNeedsWAL(vacrel->rel))
+				log_heap_prune_and_freeze(vacrel->rel, buf,
+										  vmbuffer,
+										  VISIBILITYMAP_ALL_VISIBLE |
+										  VISIBILITYMAP_ALL_FROZEN,
+										  InvalidTransactionId, /* conflict xid */
+										  false,	/* cleanup lock */
+										  PRUNE_VACUUM_SCAN,	/* reason */
+										  NULL, 0,
+										  NULL, 0,
+										  NULL, 0,
+										  NULL, 0);
+
 			END_CRIT_SECTION();
 
+			LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
+
 			/* Count the newly all-frozen pages for logging */
 			vacrel->new_all_visible_pages++;
 			vacrel->new_all_visible_all_frozen_pages++;
-- 
2.43.0
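
The ordering that 0011 establishes in lazy_scan_new_or_empty() can be sketched as a standalone trace. This is only an illustration of the sequence visible in the hunks above, not PostgreSQL code: sketch_step(), sketch_trace, and sketch_mark_empty_page_all_visible() are hypothetical names, and each step merely records which real call it stands in for.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical trace buffer; each step appends its name, separated by '>'. */
static char sketch_trace[256];

static void
sketch_step(const char *name)
{
	if (sketch_trace[0] != '\0')
		strcat(sketch_trace, ">");
	strcat(sketch_trace, name);
}

/*
 * Record the order of operations used when marking an empty heap page
 * all-visible and all-frozen.  needs_wal stands in for RelationNeedsWAL().
 */
static const char *
sketch_mark_empty_page_all_visible(bool needs_wal)
{
	sketch_trace[0] = '\0';

	sketch_step("lock_vm");				/* LockBuffer(vmbuffer, EXCLUSIVE) */
	sketch_step("start_crit");			/* START_CRIT_SECTION() */
	sketch_step("dirty_heap");			/* MarkBufferDirty(buf) */
	sketch_step("set_pd_all_visible");	/* PageSetAllVisible(page) */
	sketch_step("set_vm_bits");			/* set the bits in the VM page */
	if (needs_wal)
		sketch_step("emit_wal");		/* one combined prune/freeze record */
	sketch_step("end_crit");			/* END_CRIT_SECTION() */
	sketch_step("unlock_vm");			/* LockBuffer(vmbuffer, UNLOCK) */

	return sketch_trace;
}
```

The point of the sketch is that the VM buffer lock brackets the critical section, and that the heap page hint and the VM bits are covered by a single WAL record instead of a separate XLOG_HEAP2_VISIBLE record.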



  [text/x-patch] v35-0012-Remove-XLOG_HEAP2_VISIBLE-entirely.patch (25.0K, 13-v35-0012-Remove-XLOG_HEAP2_VISIBLE-entirely.patch)
From 44626ffe27eddbd1dea7851b10079c150069faf7 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Sat, 27 Sep 2025 11:55:36 -0400
Subject: [PATCH v35 12/18] Remove XLOG_HEAP2_VISIBLE entirely

No remaining users emit XLOG_HEAP2_VISIBLE records, so remove the record
type. This includes deleting the xl_heap_visible struct and all functions
responsible for emitting or replaying XLOG_HEAP2_VISIBLE records.

This changes the visibility map API, so any external users/consumers of
the VM-only WAL record will need to change.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Chao Li <[email protected]>
---
 src/backend/access/common/bufmask.c      |   4 +-
 src/backend/access/heap/heapam.c         |  54 +-------
 src/backend/access/heap/heapam_xlog.c    | 156 ++---------------------
 src/backend/access/heap/pruneheap.c      |   4 +-
 src/backend/access/heap/vacuumlazy.c     |  16 +--
 src/backend/access/heap/visibilitymap.c  | 110 +---------------
 src/backend/access/rmgrdesc/heapdesc.c   |  10 --
 src/backend/replication/logical/decode.c |   1 -
 src/backend/storage/ipc/standby.c        |  12 +-
 src/include/access/heapam_xlog.h         |  20 ---
 src/include/access/visibilitymap.h       |  13 +-
 src/include/access/visibilitymapdefs.h   |   9 --
 src/tools/pgindent/typedefs.list         |   1 -
 13 files changed, 38 insertions(+), 372 deletions(-)

diff --git a/src/backend/access/common/bufmask.c b/src/backend/access/common/bufmask.c
index 1a9e7bea5d2..bce767d7b71 100644
--- a/src/backend/access/common/bufmask.c
+++ b/src/backend/access/common/bufmask.c
@@ -56,8 +56,8 @@ mask_page_hint_bits(Page page)
 
 	/*
 	 * During replay, if the page LSN has advanced past our XLOG record's LSN,
-	 * we don't mark the page all-visible. See heap_xlog_visible() for
-	 * details.
+	 * we don't mark the page all-visible. See heap_xlog_prune_freeze()
+	 * for more details.
 	 */
 	PageClearAllVisible(page);
 }
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e19209f180d..2f9ef87463e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2589,11 +2589,11 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 		{
 			PageSetAllVisible(page);
 			PageClearPrunable(page);
-			visibilitymap_set_vmbits(BufferGetBlockNumber(buffer),
-									 vmbuffer,
-									 VISIBILITYMAP_ALL_VISIBLE |
-									 VISIBILITYMAP_ALL_FROZEN,
-									 relation->rd_locator);
+			visibilitymap_set(BufferGetBlockNumber(buffer),
+							  vmbuffer,
+							  VISIBILITYMAP_ALL_VISIBLE |
+							  VISIBILITYMAP_ALL_FROZEN,
+							  relation->rd_locator);
 		}
 
 		/*
@@ -8894,50 +8894,6 @@ bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate)
 	return nblocksfavorable;
 }
 
-/*
- * Perform XLogInsert for a heap-visible operation.  'block' is the block
- * being marked all-visible, and vm_buffer is the buffer containing the
- * corresponding visibility map block.  Both should have already been modified
- * and dirtied.
- *
- * snapshotConflictHorizon comes from the largest xmin on the page being
- * marked all-visible.  REDO routine uses it to generate recovery conflicts.
- *
- * If checksums or wal_log_hints are enabled, we may also generate a full-page
- * image of heap_buffer. Otherwise, we optimize away the FPI (by specifying
- * REGBUF_NO_IMAGE for the heap buffer), in which case the caller should *not*
- * update the heap page's LSN.
- */
-XLogRecPtr
-log_heap_visible(Relation rel, Buffer heap_buffer, Buffer vm_buffer,
-				 TransactionId snapshotConflictHorizon, uint8 vmflags)
-{
-	xl_heap_visible xlrec;
-	XLogRecPtr	recptr;
-	uint8		flags;
-
-	Assert(BufferIsValid(heap_buffer));
-	Assert(BufferIsValid(vm_buffer));
-
-	xlrec.snapshotConflictHorizon = snapshotConflictHorizon;
-	xlrec.flags = vmflags;
-	if (RelationIsAccessibleInLogicalDecoding(rel))
-		xlrec.flags |= VISIBILITYMAP_XLOG_CATALOG_REL;
-	XLogBeginInsert();
-	XLogRegisterData(&xlrec, SizeOfHeapVisible);
-
-	XLogRegisterBuffer(0, vm_buffer, 0);
-
-	flags = REGBUF_STANDARD;
-	if (!XLogHintBitIsNeeded())
-		flags |= REGBUF_NO_IMAGE;
-	XLogRegisterBuffer(1, heap_buffer, flags);
-
-	recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_VISIBLE);
-
-	return recptr;
-}
-
 /*
  * Perform XLogInsert for a heap-update operation.  Caller must already
  * have modified the buffer(s) and marked them dirty.
diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index 6d39a5fff7c..df89f93edb4 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -239,7 +239,7 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		if (PageIsNew(vmpage))
 			PageInit(vmpage, BLCKSZ, 0);
 
-		visibilitymap_set_vmbits(blkno, vmbuffer, vmflags, rlocator);
+		visibilitymap_set(blkno, vmbuffer, vmflags, rlocator);
 
 		Assert(BufferIsDirty(vmbuffer));
 		PageSetLSN(vmpage, lsn);
@@ -252,143 +252,6 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		XLogRecordPageWithFreeSpace(rlocator, blkno, freespace);
 }
 
-/*
- * Replay XLOG_HEAP2_VISIBLE records.
- *
- * The critical integrity requirement here is that we must never end up with
- * a situation where the visibility map bit is set, and the page-level
- * PD_ALL_VISIBLE bit is clear.  If that were to occur, then a subsequent
- * page modification would fail to clear the visibility map bit.
- */
-static void
-heap_xlog_visible(XLogReaderState *record)
-{
-	XLogRecPtr	lsn = record->EndRecPtr;
-	xl_heap_visible *xlrec = (xl_heap_visible *) XLogRecGetData(record);
-	Buffer		vmbuffer = InvalidBuffer;
-	Buffer		buffer;
-	Page		page;
-	RelFileLocator rlocator;
-	BlockNumber blkno;
-	XLogRedoAction action;
-
-	Assert((xlrec->flags & VISIBILITYMAP_XLOG_VALID_BITS) == xlrec->flags);
-
-	XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
-
-	/*
-	 * If there are any Hot Standby transactions running that have an xmin
-	 * horizon old enough that this page isn't all-visible for them, they
-	 * might incorrectly decide that an index-only scan can skip a heap fetch.
-	 *
-	 * NB: It might be better to throw some kind of "soft" conflict here that
-	 * forces any index-only scan that is in flight to perform heap fetches,
-	 * rather than killing the transaction outright.
-	 */
-	if (InHotStandby)
-		ResolveRecoveryConflictWithSnapshot(xlrec->snapshotConflictHorizon,
-											xlrec->flags & VISIBILITYMAP_XLOG_CATALOG_REL,
-											rlocator);
-
-	/*
-	 * Read the heap page, if it still exists. If the heap file has dropped or
-	 * truncated later in recovery, we don't need to update the page, but we'd
-	 * better still update the visibility map.
-	 */
-	action = XLogReadBufferForRedo(record, 1, &buffer);
-	if (action == BLK_NEEDS_REDO)
-	{
-		/*
-		 * We don't bump the LSN of the heap page when setting the visibility
-		 * map bit (unless checksums or wal_hint_bits is enabled, in which
-		 * case we must). This exposes us to torn page hazards, but since
-		 * we're not inspecting the existing page contents in any way, we
-		 * don't care.
-		 */
-		page = BufferGetPage(buffer);
-
-		PageSetAllVisible(page);
-		PageClearPrunable(page);
-
-		if (XLogHintBitIsNeeded())
-			PageSetLSN(page, lsn);
-
-		MarkBufferDirty(buffer);
-	}
-	else if (action == BLK_RESTORED)
-	{
-		/*
-		 * If heap block was backed up, we already restored it and there's
-		 * nothing more to do. (This can only happen with checksums or
-		 * wal_log_hints enabled.)
-		 */
-	}
-
-	if (BufferIsValid(buffer))
-	{
-		Size		space = PageGetFreeSpace(BufferGetPage(buffer));
-
-		UnlockReleaseBuffer(buffer);
-
-		/*
-		 * Since FSM is not WAL-logged and only updated heuristically, it
-		 * easily becomes stale in standbys.  If the standby is later promoted
-		 * and runs VACUUM, it will skip updating individual free space
-		 * figures for pages that became all-visible (or all-frozen, depending
-		 * on the vacuum mode,) which is troublesome when FreeSpaceMapVacuum
-		 * propagates too optimistic free space values to upper FSM layers;
-		 * later inserters try to use such pages only to find out that they
-		 * are unusable.  This can cause long stalls when there are many such
-		 * pages.
-		 *
-		 * Forestall those problems by updating FSM's idea about a page that
-		 * is becoming all-visible or all-frozen.
-		 *
-		 * Do this regardless of a full-page image being applied, since the
-		 * FSM data is not in the page anyway.
-		 */
-		if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
-			XLogRecordPageWithFreeSpace(rlocator, blkno, space);
-	}
-
-	/*
-	 * Even if we skipped the heap page update due to the LSN interlock, it's
-	 * still safe to update the visibility map.  Any WAL record that clears
-	 * the visibility map bit does so before checking the page LSN, so any
-	 * bits that need to be cleared will still be cleared.
-	 */
-	if (XLogReadBufferForRedoExtended(record, 0, RBM_ZERO_ON_ERROR, false,
-									  &vmbuffer) == BLK_NEEDS_REDO)
-	{
-		Page		vmpage = BufferGetPage(vmbuffer);
-		Relation	reln;
-		uint8		vmbits;
-
-		/* initialize the page if it was read as zeros */
-		if (PageIsNew(vmpage))
-			PageInit(vmpage, BLCKSZ, 0);
-
-		/* remove VISIBILITYMAP_XLOG_* */
-		vmbits = xlrec->flags & VISIBILITYMAP_VALID_BITS;
-
-		/*
-		 * XLogReadBufferForRedoExtended locked the buffer. But
-		 * visibilitymap_set will handle locking itself.
-		 */
-		LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
-
-		reln = CreateFakeRelcacheEntry(rlocator);
-
-		visibilitymap_set(reln, blkno, InvalidBuffer, lsn, vmbuffer,
-						  xlrec->snapshotConflictHorizon, vmbits);
-
-		ReleaseBuffer(vmbuffer);
-		FreeFakeRelcacheEntry(reln);
-	}
-	else if (BufferIsValid(vmbuffer))
-		UnlockReleaseBuffer(vmbuffer);
-}
-
 /*
  * Given an "infobits" field from an XLog record, set the correct bits in the
  * given infomask and infomask2 for the tuple touched by the record.
@@ -769,8 +632,8 @@ heap_xlog_multi_insert(XLogReaderState *record)
 	 *
 	 * During recovery, however, no concurrent writers exist. Therefore,
 	 * updating the VM without holding the heap page lock is safe enough. This
-	 * same approach is taken when replaying xl_heap_visible records (see
-	 * heap_xlog_visible()).
+	 * same approach is taken when replaying XLOG_HEAP2_PRUNE* records (see
+	 * heap_xlog_prune_freeze()).
 	 */
 	if ((xlrec->flags & XLH_INSERT_ALL_FROZEN_SET) &&
 		XLogReadBufferForRedoExtended(record, 1, RBM_ZERO_ON_ERROR, false,
@@ -782,11 +645,11 @@ heap_xlog_multi_insert(XLogReaderState *record)
 		if (PageIsNew(vmpage))
 			PageInit(vmpage, BLCKSZ, 0);
 
-		visibilitymap_set_vmbits(blkno,
-								 vmbuffer,
-								 VISIBILITYMAP_ALL_VISIBLE |
-								 VISIBILITYMAP_ALL_FROZEN,
-								 rlocator);
+		visibilitymap_set(blkno,
+						  vmbuffer,
+						  VISIBILITYMAP_ALL_VISIBLE |
+						  VISIBILITYMAP_ALL_FROZEN,
+						  rlocator);
 
 		Assert(BufferIsDirty(vmbuffer));
 		PageSetLSN(vmpage, lsn);
@@ -1367,9 +1230,6 @@ heap2_redo(XLogReaderState *record)
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
 			heap_xlog_prune_freeze(record);
 			break;
-		case XLOG_HEAP2_VISIBLE:
-			heap_xlog_visible(record);
-			break;
 		case XLOG_HEAP2_MULTI_INSERT:
 			heap_xlog_multi_insert(record);
 			break;
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index d41e1c6fce4..b66d49f4d60 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -1245,8 +1245,8 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 			 */
 			PageSetAllVisible(prstate.page);
 			PageClearPrunable(prstate.page);
-			visibilitymap_set_vmbits(prstate.block, prstate.vmbuffer, prstate.new_vmbits,
-									 prstate.relation->rd_locator);
+			visibilitymap_set(prstate.block, prstate.vmbuffer, prstate.new_vmbits,
+							  prstate.relation->rd_locator);
 		}
 
 		MarkBufferDirty(prstate.buffer);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 68fa77b5318..ef607945a93 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1925,11 +1925,11 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			PageClearPrunable(page);
-			visibilitymap_set_vmbits(blkno,
-									 vmbuffer,
-									 VISIBILITYMAP_ALL_VISIBLE |
-									 VISIBILITYMAP_ALL_FROZEN,
-									 vacrel->rel->rd_locator);
+			visibilitymap_set(blkno,
+							  vmbuffer,
+							  VISIBILITYMAP_ALL_VISIBLE |
+							  VISIBILITYMAP_ALL_FROZEN,
+							  vacrel->rel->rd_locator);
 
 			/*
 			 * Emit WAL for setting PD_ALL_VISIBLE on the heap page and
@@ -2793,9 +2793,9 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 		 */
 		PageSetAllVisible(page);
 		PageClearPrunable(page);
-		visibilitymap_set_vmbits(blkno,
-								 vmbuffer, vmflags,
-								 vacrel->rel->rd_locator);
+		visibilitymap_set(blkno,
+						  vmbuffer, vmflags,
+						  vacrel->rel->rd_locator);
 		conflict_xid = newest_live_xid;
 	}
 
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 3047bd46def..fc74e39e069 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -14,8 +14,7 @@
  *		visibilitymap_clear  - clear bits for one page in the visibility map
  *		visibilitymap_pin	 - pin a map page for setting a bit
  *		visibilitymap_pin_ok - check whether correct map page is already pinned
- *		visibilitymap_set	 - set bit(s) in a previously pinned page and log
- *		visibilitymap_set_vmbits - set bit(s) in a pinned page
+ *		visibilitymap_set	 - set bit(s) in a previously pinned page
  *		visibilitymap_get_status - get status of bits
  *		visibilitymap_count  - count number of bits set in visibility map
  *		visibilitymap_prepare_truncate -
@@ -220,112 +219,11 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf)
 	return BufferIsValid(vmbuf) && BufferGetBlockNumber(vmbuf) == mapBlock;
 }
 
-/*
- *	visibilitymap_set - set bit(s) on a previously pinned page
- *
- * recptr is the LSN of the XLOG record we're replaying, if we're in recovery,
- * or InvalidXLogRecPtr in normal running.  The VM page LSN is advanced to the
- * one provided; in normal running, we generate a new XLOG record and set the
- * page LSN to that value (though the heap page's LSN may *not* be updated;
- * see below).  cutoff_xid is the largest xmin on the page being marked
- * all-visible; it is needed for Hot Standby, and can be InvalidTransactionId
- * if the page contains no tuples.  It can also be set to InvalidTransactionId
- * when a page that is already all-visible is being marked all-frozen.
- *
- * Caller is expected to set the heap page's PD_ALL_VISIBLE bit before calling
- * this function. Except in recovery, caller should also pass the heap
- * buffer. When checksums are enabled and we're not in recovery, we must add
- * the heap buffer to the WAL chain to protect it from being torn.
- *
- * You must pass a buffer containing the correct map page to this function.
- * Call visibilitymap_pin first to pin the right one. This function doesn't do
- * any I/O.
- */
-void
-visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
-				  XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
-				  uint8 flags)
-{
-	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
-	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
-	uint8		mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
-	Page		page;
-	uint8	   *map;
-	uint8		status;
-
-#ifdef TRACE_VISIBILITYMAP
-	elog(DEBUG1, "vm_set flags 0x%02X for %s %d",
-		 flags, RelationGetRelationName(rel), heapBlk);
-#endif
-
-	Assert(InRecovery || !XLogRecPtrIsValid(recptr));
-	Assert(InRecovery || PageIsAllVisible(BufferGetPage(heapBuf)));
-	Assert((flags & VISIBILITYMAP_VALID_BITS) == flags);
-
-	/* Must never set all_frozen bit without also setting all_visible bit */
-	Assert(flags != VISIBILITYMAP_ALL_FROZEN);
-
-	/* Check that we have the right heap page pinned, if present */
-	if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
-		elog(ERROR, "wrong heap buffer passed to visibilitymap_set");
-
-	Assert(!BufferIsValid(heapBuf) ||
-		   BufferIsLockedByMeInMode(heapBuf, BUFFER_LOCK_EXCLUSIVE));
-
-	/* Check that we have the right VM page pinned */
-	if (!BufferIsValid(vmBuf) || BufferGetBlockNumber(vmBuf) != mapBlock)
-		elog(ERROR, "wrong VM buffer passed to visibilitymap_set");
-
-	page = BufferGetPage(vmBuf);
-	map = (uint8 *) PageGetContents(page);
-	LockBuffer(vmBuf, BUFFER_LOCK_EXCLUSIVE);
-
-	status = (map[mapByte] >> mapOffset) & VISIBILITYMAP_VALID_BITS;
-	if (flags != status)
-	{
-		START_CRIT_SECTION();
-
-		map[mapByte] |= (flags << mapOffset);
-		MarkBufferDirty(vmBuf);
-
-		if (RelationNeedsWAL(rel))
-		{
-			if (!XLogRecPtrIsValid(recptr))
-			{
-				Assert(!InRecovery);
-				recptr = log_heap_visible(rel, heapBuf, vmBuf, cutoff_xid, flags);
-
-				/*
-				 * If data checksums are enabled (or wal_log_hints=on), we
-				 * need to protect the heap page from being torn.
-				 *
-				 * If not, then we must *not* update the heap page's LSN. In
-				 * this case, the FPI for the heap page was omitted from the
-				 * WAL record inserted above, so it would be incorrect to
-				 * update the heap page's LSN.
-				 */
-				if (XLogHintBitIsNeeded())
-				{
-					Page		heapPage = BufferGetPage(heapBuf);
-
-					PageSetLSN(heapPage, recptr);
-				}
-			}
-			PageSetLSN(page, recptr);
-		}
-
-		END_CRIT_SECTION();
-	}
-
-	LockBuffer(vmBuf, BUFFER_LOCK_UNLOCK);
-}
-
 /*
  * Set VM (visibility map) flags in the VM block in vmBuf.
  *
  * This function is intended for callers that log VM changes together
  * with the heap page modifications that rendered the page all-visible.
- * Callers that log VM changes separately should use visibilitymap_set().
  *
  * vmBuf must be pinned and exclusively locked, and it must cover the VM bits
  * corresponding to heapBlk.
@@ -341,9 +239,9 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
  * rlocator is used only for debugging messages.
  */
 void
-visibilitymap_set_vmbits(BlockNumber heapBlk,
-						 Buffer vmBuf, uint8 flags,
-						 const RelFileLocator rlocator)
+visibilitymap_set(BlockNumber heapBlk,
+				  Buffer vmBuf, uint8 flags,
+				  const RelFileLocator rlocator)
 {
 	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
 	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 02ae91653c1..75ae6f9d375 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -349,13 +349,6 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 			}
 		}
 	}
-	else if (info == XLOG_HEAP2_VISIBLE)
-	{
-		xl_heap_visible *xlrec = (xl_heap_visible *) rec;
-
-		appendStringInfo(buf, "snapshotConflictHorizon: %u, flags: 0x%02X",
-						 xlrec->snapshotConflictHorizon, xlrec->flags);
-	}
 	else if (info == XLOG_HEAP2_MULTI_INSERT)
 	{
 		xl_heap_multi_insert *xlrec = (xl_heap_multi_insert *) rec;
@@ -461,9 +454,6 @@ heap2_identify(uint8 info)
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
 			id = "PRUNE_VACUUM_CLEANUP";
 			break;
-		case XLOG_HEAP2_VISIBLE:
-			id = "VISIBLE";
-			break;
 		case XLOG_HEAP2_MULTI_INSERT:
 			id = "MULTI_INSERT";
 			break;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 21f03864a66..3c027bcb2f7 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -448,7 +448,6 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_HEAP2_PRUNE_ON_ACCESS:
 		case XLOG_HEAP2_PRUNE_VACUUM_SCAN:
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
-		case XLOG_HEAP2_VISIBLE:
 		case XLOG_HEAP2_LOCK_UPDATED:
 			break;
 		default:
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index d83afbfb9d6..afacc1b8e0d 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -476,12 +476,12 @@ ResolveRecoveryConflictWithSnapshot(TransactionId snapshotConflictHorizon,
 	 * If we get passed InvalidTransactionId then we do nothing (no conflict).
 	 *
 	 * This can happen when replaying already-applied WAL records after a
-	 * standby crash or restart, or when replaying an XLOG_HEAP2_VISIBLE
-	 * record that marks as frozen a page which was already all-visible.  It's
-	 * also quite common with records generated during index deletion
-	 * (original execution of the deletion can reason that a recovery conflict
-	 * which is sufficient for the deletion operation must take place before
-	 * replay of the deletion record itself).
+	 * standby crash or restart, or when replaying a record that marks as
+	 * frozen a page which was already marked all-visible in the visibility
+	 * map.  It's also quite common with records generated during index
+	 * deletion (original execution of the deletion can reason that a recovery
+	 * conflict which is sufficient for the deletion operation must take place
+	 * before replay of the deletion record itself).
 	 */
 	if (!TransactionIdIsValid(snapshotConflictHorizon))
 		return;
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index ce3566ba949..5eed567a8e5 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -60,7 +60,6 @@
 #define XLOG_HEAP2_PRUNE_ON_ACCESS		0x10
 #define XLOG_HEAP2_PRUNE_VACUUM_SCAN	0x20
 #define XLOG_HEAP2_PRUNE_VACUUM_CLEANUP	0x30
-#define XLOG_HEAP2_VISIBLE		0x40
 #define XLOG_HEAP2_MULTI_INSERT 0x50
 #define XLOG_HEAP2_LOCK_UPDATED 0x60
 #define XLOG_HEAP2_NEW_CID		0x70
@@ -443,20 +442,6 @@ typedef struct xl_heap_inplace
 
 #define MinSizeOfHeapInplace	(offsetof(xl_heap_inplace, nmsgs) + sizeof(int))
 
-/*
- * This is what we need to know about setting a visibility map bit
- *
- * Backup blk 0: visibility map buffer
- * Backup blk 1: heap buffer
- */
-typedef struct xl_heap_visible
-{
-	TransactionId snapshotConflictHorizon;
-	uint8		flags;
-} xl_heap_visible;
-
-#define SizeOfHeapVisible (offsetof(xl_heap_visible, flags) + sizeof(uint8))
-
 typedef struct xl_heap_new_cid
 {
 	/*
@@ -500,11 +485,6 @@ extern void heap2_desc(StringInfo buf, XLogReaderState *record);
 extern const char *heap2_identify(uint8 info);
 extern void heap_xlog_logical_rewrite(XLogReaderState *r);
 
-extern XLogRecPtr log_heap_visible(Relation rel, Buffer heap_buffer,
-								   Buffer vm_buffer,
-								   TransactionId snapshotConflictHorizon,
-								   uint8 vmflags);
-
 /* in heapdesc.c, so it can be shared between frontend/backend code */
 extern void heap_xlog_deserialize_prune_and_freeze(char *cursor, uint16 flags,
 												   int *nplans, xlhp_freeze_plan **plans,
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index a0166c5b410..001afb037f3 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -15,7 +15,6 @@
 #define VISIBILITYMAP_H
 
 #include "access/visibilitymapdefs.h"
-#include "access/xlogdefs.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/relfilelocator.h"
@@ -32,15 +31,9 @@ extern bool visibilitymap_clear(Relation rel, BlockNumber heapBlk,
 extern void visibilitymap_pin(Relation rel, BlockNumber heapBlk,
 							  Buffer *vmbuf);
 extern bool visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf);
-extern void visibilitymap_set(Relation rel,
-							  BlockNumber heapBlk, Buffer heapBuf,
-							  XLogRecPtr recptr,
-							  Buffer vmBuf,
-							  TransactionId cutoff_xid,
-							  uint8 flags);
-extern void visibilitymap_set_vmbits(BlockNumber heapBlk,
-									 Buffer vmBuf, uint8 flags,
-									 const RelFileLocator rlocator);
+extern void visibilitymap_set(BlockNumber heapBlk,
+							  Buffer vmBuf, uint8 flags,
+							  const RelFileLocator rlocator);
 extern uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
 extern void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen);
 extern BlockNumber visibilitymap_prepare_truncate(Relation rel,
diff --git a/src/include/access/visibilitymapdefs.h b/src/include/access/visibilitymapdefs.h
index 89153b3cd9a..e5794c8559e 100644
--- a/src/include/access/visibilitymapdefs.h
+++ b/src/include/access/visibilitymapdefs.h
@@ -21,14 +21,5 @@
 #define VISIBILITYMAP_ALL_FROZEN	0x02
 #define VISIBILITYMAP_VALID_BITS	0x03	/* OR of all valid visibilitymap
 											 * flags bits */
-/*
- * To detect recovery conflicts during logical decoding on a standby, we need
- * to know if a table is a user catalog table. For that we add an additional
- * bit into xl_heap_visible.flags, in addition to the above.
- *
- * NB: VISIBILITYMAP_XLOG_* may not be passed to visibilitymap_set().
- */
-#define VISIBILITYMAP_XLOG_CATALOG_REL	0x04
-#define VISIBILITYMAP_XLOG_VALID_BITS	(VISIBILITYMAP_VALID_BITS | VISIBILITYMAP_XLOG_CATALOG_REL)
 
 #endif							/* VISIBILITYMAPDEFS_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 77e3c04144e..f5cbcf084a4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -4356,7 +4356,6 @@ xl_heap_prune
 xl_heap_rewrite_mapping
 xl_heap_truncate
 xl_heap_update
-xl_heap_visible
 xl_invalid_page
 xl_invalid_page_key
 xl_invalidations
-- 
2.43.0



  [text/x-patch] v35-0013-Initialize-missing-fields-in-CreateExecutorState.patch (924B, 14-v35-0013-Initialize-missing-fields-in-CreateExecutorState.patch)
From f24da3eaa6c3587bb0621817b78c148af0393349 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Sun, 1 Mar 2026 16:48:19 -0500
Subject: [PATCH v35 13/18] Initialize missing fields in CreateExecutorState()

d47cbf474ecbd449a4 forgot to initialize a few fields it introduced in
the EState, so do that now.
---
 src/backend/executor/execUtils.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index a7955e476f9..cd4d5452cfb 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -119,6 +119,9 @@ CreateExecutorState(void)
 	estate->es_rteperminfos = NIL;
 	estate->es_plannedstmt = NULL;
 	estate->es_part_prune_infos = NIL;
+	estate->es_part_prune_states = NIL;
+	estate->es_part_prune_results = NIL;
+	estate->es_unpruned_relids = NULL;
 
 	estate->es_junkFilter = NULL;
 
-- 
2.43.0



  [text/x-patch] v35-0014-Track-which-relations-are-modified-by-a-query.patch (5.4K, 15-v35-0014-Track-which-relations-are-modified-by-a-query.patch)
From 382cdd7f98291e00e0fe11c53a32e2b64396fd8e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 3 Dec 2025 15:07:24 -0500
Subject: [PATCH v35 14/18] Track which relations are modified by a query

Save the relids in a bitmap in the estate. A later commit will pass this
information down to scan nodes to control whether or not the scan allows
setting the visibility map during on-access pruning. We don't want to set
the visibility map if the query is just going to modify the page
immediately after.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Chao Li <[email protected]>
---
 src/backend/executor/execMain.c  | 13 +++++++++++++
 src/backend/executor/execUtils.c | 32 ++++++++++++++++++++++++++++++++
 src/include/executor/executor.h  |  3 +++
 src/include/nodes/execnodes.h    |  6 ++++++
 4 files changed, 54 insertions(+)
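
Illustrative note, not part of the patch: the commit message above describes
saving the modified relations in a bitmap so that a later commit can decide
whether a scan may set the visibility map. A minimal standalone sketch of that
idea follows. It assumes a toy 64-bit set in place of PostgreSQL's Bitmapset,
and every toy_* name is hypothetical. The subset test mirrors the kind of
check CrossCheckModifiedRelids() asserts; the final helper is only a guess at
how the later commit might consult the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: one bit per RT index, at most 64. */
typedef uint64_t ToyRelidSet;

/* Return the set with RT index rti added. */
static ToyRelidSet
toy_add_member(ToyRelidSet set, unsigned rti)
{
	assert(rti < 64);			/* toy limit, not a PostgreSQL limit */
	return set | (UINT64_C(1) << rti);
}

/* True if rti is a member of the set. */
static bool
toy_is_member(ToyRelidSet set, unsigned rti)
{
	assert(rti < 64);
	return ((set >> rti) & 1) != 0;
}

/* True if every member of a is also a member of b. */
static bool
toy_is_subset(ToyRelidSet a, ToyRelidSet b)
{
	return (a & ~b) == 0;
}

/*
 * Assumed policy for this sketch: a scan may set the visibility map only
 * when its relation is not in the set of relations the query modifies.
 */
static bool
toy_scan_may_set_vm(ToyRelidSet modified, unsigned scan_rti)
{
	return !toy_is_member(modified, scan_rti);
}
```

With RT indexes 1 and 3 recorded as modified, a scan of RT index 2 would be
allowed to set the VM in this sketch, while scans of 1 or 3 would not.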

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index bfd3ebc601e..6f51b82a364 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -920,6 +920,10 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 					break;
 			}
 
+			/* If it has a rowmark, the relation may be modified */
+			estate->es_modified_relids = bms_add_member(estate->es_modified_relids,
+														rc->rti);
+
 			/* Check that relation is a legal target for marking */
 			if (relation)
 				CheckValidRowMarkRel(relation, rc->markType);
@@ -990,6 +994,10 @@ InitPlan(QueryDesc *queryDesc, int eflags)
 	 */
 	planstate = ExecInitNode(plan, estate, eflags);
 
+#ifdef USE_ASSERT_CHECKING
+	CrossCheckModifiedRelids(estate);
+#endif
+
 	/*
 	 * Get the tuple descriptor describing the type of tuples to return.
 	 */
@@ -3027,6 +3035,7 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
 	rcestate->es_range_table_size = parentestate->es_range_table_size;
 	rcestate->es_relations = parentestate->es_relations;
 	rcestate->es_rowmarks = parentestate->es_rowmarks;
+	rcestate->es_modified_relids = parentestate->es_modified_relids;
 	rcestate->es_rteperminfos = parentestate->es_rteperminfos;
 	rcestate->es_plannedstmt = parentestate->es_plannedstmt;
 	rcestate->es_junkFilter = parentestate->es_junkFilter;
@@ -3165,6 +3174,10 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
 	 */
 	epqstate->recheckplanstate = ExecInitNode(planTree, rcestate, 0);
 
+#ifdef USE_ASSERT_CHECKING
+	CrossCheckModifiedRelids(rcestate);
+#endif
+
 	MemoryContextSwitchTo(oldcontext);
 }
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index cd4d5452cfb..b4e95644404 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -123,6 +123,8 @@ CreateExecutorState(void)
 	estate->es_part_prune_results = NIL;
 	estate->es_unpruned_relids = NULL;
 
+	estate->es_modified_relids = NULL;
+
 	estate->es_junkFilter = NULL;
 
 	estate->es_output_cid = (CommandId) 0;
@@ -871,6 +873,34 @@ ExecGetRangeTableRelation(EState *estate, Index rti, bool isResultRel)
 	return rel;
 }
 
+#ifdef USE_ASSERT_CHECKING
+/*
+ * Assert that es_modified_relids includes all potentially modified RT
+ * indexes.
+ */
+void
+CrossCheckModifiedRelids(EState *estate)
+{
+	Bitmapset  *expected = NULL;
+	ListCell   *lc;
+	Index		rti;
+
+	foreach(lc, estate->es_opened_result_relations)
+	{
+		ResultRelInfo *rri = lfirst_node(ResultRelInfo, lc);
+
+		expected = bms_add_member(expected, rri->ri_RangeTableIndex);
+	}
+	if (estate->es_rowmarks)
+	{
+		for (rti = 1; rti <= estate->es_range_table_size; rti++)
+			if (estate->es_rowmarks[rti - 1] != NULL)
+				expected = bms_add_member(expected, rti);
+	}
+	Assert(bms_is_subset(expected, estate->es_modified_relids));
+}
+#endif
+
 /*
  * ExecInitResultRelation
  *		Open relation given by the passed-in RT index and fill its
@@ -896,6 +926,8 @@ ExecInitResultRelation(EState *estate, ResultRelInfo *resultRelInfo,
 		estate->es_result_relations = (ResultRelInfo **)
 			palloc0(estate->es_range_table_size * sizeof(ResultRelInfo *));
 	estate->es_result_relations[rti - 1] = resultRelInfo;
+	estate->es_modified_relids = bms_add_member(estate->es_modified_relids,
+												rti);
 
 	/*
 	 * Saving in the list allows to avoid needlessly traversing the whole
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index d46ba59895d..05f032baeaa 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -703,6 +703,9 @@ extern Relation ExecGetRangeTableRelation(EState *estate, Index rti,
 										  bool isResultRel);
 extern void ExecInitResultRelation(EState *estate, ResultRelInfo *resultRelInfo,
 								   Index rti);
+#ifdef USE_ASSERT_CHECKING
+extern void CrossCheckModifiedRelids(EState *estate);
+#endif
 
 extern int	executor_errposition(EState *estate, int location);
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 63c067d5aae..610385df12b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -679,6 +679,12 @@ typedef struct EState
 									 * ExecDoInitialPruning() */
 	const char *es_sourceText;	/* Source text from QueryDesc */
 
+	/*
+	 * RT indexes of relations modified by the query through an
+	 * UPDATE/DELETE/INSERT/MERGE or targeted by a SELECT FOR UPDATE.
+	 */
+	Bitmapset  *es_modified_relids;
+
 	JunkFilter *es_junkFilter;	/* top-level junk filter, if any */
 
 	/* If query can insert/delete tuples, the command ID to mark them with */
-- 
2.43.0



  [text/x-patch] v35-0015-Make-begin_scan-functions-take-a-flags-argument.patch (21.2K, 16-v35-0015-Make-begin_scan-functions-take-a-flags-argument.patch)
From 61ce0d481c14b6203efdb7fa77949e777505d613 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Mon, 2 Mar 2026 16:31:17 -0500
Subject: [PATCH v35 15/18] Make begin_scan() functions take a flags argument

This lets us pass more information from the executor to use when
building the scan descriptor. A future commit will use this to tell the
scan descriptor whether or not its relation is read-only in the current
query.
---
 contrib/pgrowlocks/pgrowlocks.c           |  2 +-
 src/backend/access/brin/brin.c            |  3 ++-
 src/backend/access/gin/gininsert.c        |  3 ++-
 src/backend/access/heap/heapam_handler.c  |  6 +++---
 src/backend/access/index/genam.c          |  4 ++--
 src/backend/access/index/indexam.c        |  6 +++---
 src/backend/access/nbtree/nbtsort.c       |  2 +-
 src/backend/access/table/tableam.c        |  7 ++++---
 src/backend/commands/constraint.c         |  2 +-
 src/backend/commands/copyto.c             |  2 +-
 src/backend/commands/tablecmds.c          |  8 ++++----
 src/backend/commands/typecmds.c           |  4 ++--
 src/backend/executor/execIndexing.c       |  2 +-
 src/backend/executor/execReplication.c    |  8 ++++----
 src/backend/executor/nodeBitmapHeapscan.c |  2 +-
 src/backend/executor/nodeIndexonlyscan.c  |  2 +-
 src/backend/executor/nodeIndexscan.c      |  4 ++--
 src/backend/executor/nodeSeqscan.c        |  6 +++---
 src/backend/partitioning/partbounds.c     |  2 +-
 src/backend/utils/adt/selfuncs.c          |  2 +-
 src/include/access/genam.h                |  2 +-
 src/include/access/tableam.h              | 17 +++++++++--------
 22 files changed, 50 insertions(+), 46 deletions(-)
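
Illustrative note, not part of the patch: with this change,
table_beginscan_parallel() takes a caller-supplied flags word and ORs in its
own scan-option bits instead of starting from a fresh local variable. A
minimal standalone sketch of that merging pattern follows. The TOY_* bit
values and TOY_SO_CALLER_HINT are hypothetical (the real SO_* values are
defined in tableam.h), and the collision assertion is an assumption of this
sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real SO_* values differ. */
#define TOY_SO_TYPE_SEQSCAN		(UINT32_C(1) << 0)
#define TOY_SO_ALLOW_STRAT		(UINT32_C(1) << 1)
#define TOY_SO_ALLOW_SYNC		(UINT32_C(1) << 2)
#define TOY_SO_ALLOW_PAGEMODE	(UINT32_C(1) << 3)

/* Bits the begin-scan routine sets on its own. */
#define TOY_SO_INTERNAL_MASK \
	(TOY_SO_TYPE_SEQSCAN | TOY_SO_ALLOW_STRAT | \
	 TOY_SO_ALLOW_SYNC | TOY_SO_ALLOW_PAGEMODE)

/* Hypothetical caller-supplied bit, e.g. a hint passed down by the executor. */
#define TOY_SO_CALLER_HINT		(UINT32_C(1) << 8)

/*
 * Merge the caller's flags with the bits a parallel sequential scan always
 * sets. Callers that have nothing to say pass 0, as most call sites in the
 * patch do.
 */
static uint32_t
toy_beginscan_parallel_flags(uint32_t flags)
{
	/* Sketch-only sanity check: callers should not pass internal bits. */
	assert((flags & TOY_SO_INTERNAL_MASK) == 0);

	flags |= TOY_SO_INTERNAL_MASK;
	return flags;
}
```

Passing 0 yields exactly the internal bits, and any caller-supplied bit
survives the merge unchanged.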

diff --git a/contrib/pgrowlocks/pgrowlocks.c b/contrib/pgrowlocks/pgrowlocks.c
index f88269332b6..27f01d8055f 100644
--- a/contrib/pgrowlocks/pgrowlocks.c
+++ b/contrib/pgrowlocks/pgrowlocks.c
@@ -114,7 +114,7 @@ pgrowlocks(PG_FUNCTION_ARGS)
 					   RelationGetRelationName(rel));
 
 	/* Scan the relation */
-	scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
+	scan = table_beginscan(rel, GetActiveSnapshot(), 0, NULL, 0);
 	hscan = (HeapScanDesc) scan;
 
 	attinmeta = TupleDescGetAttInMetadata(rsinfo->setDesc);
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 9cd563fd0c3..eea24eb7116 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2844,7 +2844,8 @@ _brin_parallel_scan_and_build(BrinBuildState *state,
 	indexInfo->ii_Concurrent = brinshared->isconcurrent;
 
 	scan = table_beginscan_parallel(heap,
-									ParallelTableScanFromBrinShared(brinshared));
+									ParallelTableScanFromBrinShared(brinshared),
+									0);
 
 	reltuples = table_index_build_scan(heap, index, indexInfo, true, true,
 									   brinbuildCallbackParallel, state, scan);
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index ee9b6106922..977308f7282 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -2060,7 +2060,8 @@ _gin_parallel_scan_and_build(GinBuildState *state,
 	indexInfo->ii_Concurrent = ginshared->isconcurrent;
 
 	scan = table_beginscan_parallel(heap,
-									ParallelTableScanFromGinBuildShared(ginshared));
+									ParallelTableScanFromGinBuildShared(ginshared),
+									0);
 
 	reltuples = table_index_build_scan(heap, index, indexInfo, true, progress,
 									   ginBuildCallbackParallel, state, scan);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 47624194f93..ebe2e87a28b 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -79,7 +79,7 @@ heapam_slot_callbacks(Relation relation)
  */
 
 static IndexFetchTableData *
-heapam_index_fetch_begin(Relation rel)
+heapam_index_fetch_begin(Relation rel, uint32 flags)
 {
 	IndexFetchHeapData *hscan = palloc0_object(IndexFetchHeapData);
 
@@ -761,7 +761,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 
 		tableScan = NULL;
 		heapScan = NULL;
-		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, NULL, 0, 0);
+		indexScan = index_beginscan(OldHeap, OldIndex, SnapshotAny, NULL, 0, 0, 0);
 		index_rescan(indexScan, NULL, 0, NULL, 0);
 	}
 	else
@@ -770,7 +770,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
 									 PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP);
 
-		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL);
+		tableScan = table_beginscan(OldHeap, SnapshotAny, 0, (ScanKey) NULL, 0);
 		heapScan = (HeapScanDesc) tableScan;
 		indexScan = NULL;
 
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 5e89b86a62c..1fe7ffb2487 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -455,7 +455,7 @@ systable_beginscan(Relation heapRelation,
 		}
 
 		sysscan->iscan = index_beginscan(heapRelation, irel,
-										 snapshot, NULL, nkeys, 0);
+										 snapshot, NULL, nkeys, 0, 0);
 		index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 		sysscan->scan = NULL;
 
@@ -716,7 +716,7 @@ systable_beginscan_ordered(Relation heapRelation,
 		bsysscan = true;
 
 	sysscan->iscan = index_beginscan(heapRelation, indexRelation,
-									 snapshot, NULL, nkeys, 0);
+									 snapshot, NULL, nkeys, 0, 0);
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 43f64a0e721..1827208396c 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -257,7 +257,7 @@ index_beginscan(Relation heapRelation,
 				Relation indexRelation,
 				Snapshot snapshot,
 				IndexScanInstrumentation *instrument,
-				int nkeys, int norderbys)
+				int nkeys, int norderbys, uint32 flags)
 {
 	IndexScanDesc scan;
 
@@ -284,7 +284,7 @@ index_beginscan(Relation heapRelation,
 	scan->instrument = instrument;
 
 	/* prepare to fetch index matches from table */
-	scan->xs_heapfetch = table_index_fetch_begin(heapRelation);
+	scan->xs_heapfetch = table_index_fetch_begin(heapRelation, flags);
 
 	return scan;
 }
@@ -615,7 +615,7 @@ index_beginscan_parallel(Relation heaprel, Relation indexrel,
 	scan->instrument = instrument;
 
 	/* prepare to fetch index matches from table */
-	scan->xs_heapfetch = table_index_fetch_begin(heaprel);
+	scan->xs_heapfetch = table_index_fetch_begin(heaprel, 0);
 
 	return scan;
 }
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index fd9d4087b5a..cc486e66793 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1926,7 +1926,7 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
 	indexInfo = BuildIndexInfo(btspool->index);
 	indexInfo->ii_Concurrent = btshared->isconcurrent;
 	scan = table_beginscan_parallel(btspool->heap,
-									ParallelTableScanFromBTShared(btshared));
+									ParallelTableScanFromBTShared(btshared), 0);
 	reltuples = table_index_build_scan(btspool->heap, btspool->index, indexInfo,
 									   true, progress, _bt_build_callback,
 									   &buildstate, scan);
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index dfda1af412e..b3aeee36ce6 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -163,10 +163,11 @@ table_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan,
 }
 
 TableScanDesc
-table_beginscan_parallel(Relation relation, ParallelTableScanDesc pscan)
+table_beginscan_parallel(Relation relation, ParallelTableScanDesc pscan, uint32 flags)
 {
 	Snapshot	snapshot;
-	uint32		flags = SO_TYPE_SEQSCAN |
+
+	flags |= SO_TYPE_SEQSCAN |
 		SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;
 
 	Assert(RelFileLocatorEquals(relation->rd_locator, pscan->phs_locator));
@@ -248,7 +249,7 @@ table_index_fetch_tuple_check(Relation rel,
 	bool		found;
 
 	slot = table_slot_create(rel, NULL);
-	scan = table_index_fetch_begin(rel);
+	scan = table_index_fetch_begin(rel, 0);
 	found = table_index_fetch_tuple(scan, tid, snapshot, slot, &call_again,
 									all_dead);
 	table_index_fetch_end(scan);
diff --git a/src/backend/commands/constraint.c b/src/backend/commands/constraint.c
index cc11c47b6f2..37cfbd63938 100644
--- a/src/backend/commands/constraint.c
+++ b/src/backend/commands/constraint.c
@@ -106,7 +106,7 @@ unique_key_recheck(PG_FUNCTION_ARGS)
 	 */
 	tmptid = checktid;
 	{
-		IndexFetchTableData *scan = table_index_fetch_begin(trigdata->tg_relation);
+		IndexFetchTableData *scan = table_index_fetch_begin(trigdata->tg_relation, 0);
 		bool		call_again = false;
 
 		if (!table_index_fetch_tuple(scan, &tmptid, SnapshotSelf, slot,
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 9ceeff6d99e..c5cbc5b4e1f 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -1158,7 +1158,7 @@ CopyRelationTo(CopyToState cstate, Relation rel, Relation root_rel, uint64 *proc
 	AttrMap    *map = NULL;
 	TupleTableSlot *root_slot = NULL;
 
-	scandesc = table_beginscan(rel, GetActiveSnapshot(), 0, NULL);
+	scandesc = table_beginscan(rel, GetActiveSnapshot(), 0, NULL, 0);
 	slot = table_slot_create(rel, NULL);
 
 	/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b04b0dbd2a0..654cc7db175 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -6388,7 +6388,7 @@ ATRewriteTable(AlteredTableInfo *tab, Oid OIDNewHeap)
 		 * checking all the constraints.
 		 */
 		snapshot = RegisterSnapshot(GetLatestSnapshot());
-		scan = table_beginscan(oldrel, snapshot, 0, NULL);
+		scan = table_beginscan(oldrel, snapshot, 0, NULL, 0);
 
 		/*
 		 * Switch to per-tuple memory context and reset it for each tuple
@@ -13765,7 +13765,7 @@ validateForeignKeyConstraint(char *conname,
 	 */
 	snapshot = RegisterSnapshot(GetLatestSnapshot());
 	slot = table_slot_create(rel, NULL);
-	scan = table_beginscan(rel, snapshot, 0, NULL);
+	scan = table_beginscan(rel, snapshot, 0, NULL, 0);
 
 	perTupCxt = AllocSetContextCreate(CurrentMemoryContext,
 									  "validateForeignKeyConstraint",
@@ -22623,7 +22623,7 @@ MergePartitionsMoveRows(List **wqueue, List *mergingPartitions, Relation newPart
 
 		/* Scan through the rows. */
 		snapshot = RegisterSnapshot(GetLatestSnapshot());
-		scan = table_beginscan(mergingPartition, snapshot, 0, NULL);
+		scan = table_beginscan(mergingPartition, snapshot, 0, NULL, 0);
 
 		/*
 		 * Switch to per-tuple memory context and reset it for each tuple
@@ -23087,7 +23087,7 @@ SplitPartitionMoveRows(List **wqueue, Relation rel, Relation splitRel,
 
 	/* Scan through the rows. */
 	snapshot = RegisterSnapshot(GetLatestSnapshot());
-	scan = table_beginscan(splitRel, snapshot, 0, NULL);
+	scan = table_beginscan(splitRel, snapshot, 0, NULL, 0);
 
 	/*
 	 * Switch to per-tuple memory context and reset it for each tuple
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 3dab6bb5a79..5316cea7cec 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -3185,7 +3185,7 @@ validateDomainNotNullConstraint(Oid domainoid)
 
 		/* Scan all tuples in this relation */
 		snapshot = RegisterSnapshot(GetLatestSnapshot());
-		scan = table_beginscan(testrel, snapshot, 0, NULL);
+		scan = table_beginscan(testrel, snapshot, 0, NULL, 0);
 		slot = table_slot_create(testrel, NULL);
 		while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
 		{
@@ -3266,7 +3266,7 @@ validateDomainCheckConstraint(Oid domainoid, const char *ccbin, LOCKMODE lockmod
 
 		/* Scan all tuples in this relation */
 		snapshot = RegisterSnapshot(GetLatestSnapshot());
-		scan = table_beginscan(testrel, snapshot, 0, NULL);
+		scan = table_beginscan(testrel, snapshot, 0, NULL, 0);
 		slot = table_slot_create(testrel, NULL);
 		while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
 		{
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 9d071e495c6..cb3e4f67ea1 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -815,7 +815,7 @@ check_exclusion_or_unique_constraint(Relation heap, Relation index,
 retry:
 	conflict = false;
 	found_self = false;
-	index_scan = index_beginscan(heap, index, &DirtySnapshot, NULL, indnkeyatts, 0);
+	index_scan = index_beginscan(heap, index, &DirtySnapshot, NULL, indnkeyatts, 0, 0);
 	index_rescan(index_scan, scankeys, indnkeyatts, NULL, 0);
 
 	while (index_getnext_slot(index_scan, ForwardScanDirection, existing_slot))
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 2497ee7edc5..5b8ca1abf62 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -205,7 +205,7 @@ RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
 	skey_attoff = build_replindex_scan_key(skey, rel, idxrel, searchslot);
 
 	/* Start an index scan. */
-	scan = index_beginscan(rel, idxrel, &snap, NULL, skey_attoff, 0);
+	scan = index_beginscan(rel, idxrel, &snap, NULL, skey_attoff, 0, 0);
 
 retry:
 	found = false;
@@ -383,7 +383,7 @@ RelationFindReplTupleSeq(Relation rel, LockTupleMode lockmode,
 
 	/* Start a heap scan. */
 	InitDirtySnapshot(snap);
-	scan = table_beginscan(rel, &snap, 0, NULL);
+	scan = table_beginscan(rel, &snap, 0, NULL, 0);
 	scanslot = table_slot_create(rel, NULL);
 
 retry:
@@ -602,7 +602,7 @@ RelationFindDeletedTupleInfoSeq(Relation rel, TupleTableSlot *searchslot,
 	 * not yet committed or those just committed prior to the scan are
 	 * excluded in update_most_recent_deletion_info().
 	 */
-	scan = table_beginscan(rel, SnapshotAny, 0, NULL);
+	scan = table_beginscan(rel, SnapshotAny, 0, NULL, 0);
 	scanslot = table_slot_create(rel, NULL);
 
 	table_rescan(scan, NULL);
@@ -666,7 +666,7 @@ RelationFindDeletedTupleInfoByIndex(Relation rel, Oid idxoid,
 	 * not yet committed or those just committed prior to the scan are
 	 * excluded in update_most_recent_deletion_info().
 	 */
-	scan = index_beginscan(rel, idxrel, SnapshotAny, NULL, skey_attoff, 0);
+	scan = index_beginscan(rel, idxrel, SnapshotAny, NULL, skey_attoff, 0, 0);
 
 	index_rescan(scan, skey, skey_attoff, NULL, 0);
 
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index c68c26cbf38..106bcd3301c 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -107,7 +107,7 @@ BitmapTableScanSetup(BitmapHeapScanState *node)
 			table_beginscan_bm(node->ss.ss_currentRelation,
 							   node->ss.ps.state->es_snapshot,
 							   0,
-							   NULL);
+							   NULL, 0);
 	}
 
 	node->ss.ss_currentScanDesc->st.rs_tbmiterator = tbmiterator;
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index c2d09374517..cf4d9a4f832 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -94,7 +94,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
 								   estate->es_snapshot,
 								   &node->ioss_Instrument,
 								   node->ioss_NumScanKeys,
-								   node->ioss_NumOrderByKeys);
+								   node->ioss_NumOrderByKeys, 0);
 
 		node->ioss_ScanDesc = scandesc;
 
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a616abff04c..a7af2f6628a 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -111,7 +111,7 @@ IndexNext(IndexScanState *node)
 								   estate->es_snapshot,
 								   &node->iss_Instrument,
 								   node->iss_NumScanKeys,
-								   node->iss_NumOrderByKeys);
+								   node->iss_NumOrderByKeys, 0);
 
 		node->iss_ScanDesc = scandesc;
 
@@ -207,7 +207,7 @@ IndexNextWithReorder(IndexScanState *node)
 								   estate->es_snapshot,
 								   &node->iss_Instrument,
 								   node->iss_NumScanKeys,
-								   node->iss_NumOrderByKeys);
+								   node->iss_NumOrderByKeys, 0);
 
 		node->iss_ScanDesc = scandesc;
 
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index af3c788ce8b..d9d7ec0516a 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -71,7 +71,7 @@ SeqNext(SeqScanState *node)
 		 */
 		scandesc = table_beginscan(node->ss.ss_currentRelation,
 								   estate->es_snapshot,
-								   0, NULL);
+								   0, NULL, 0);
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
@@ -374,7 +374,7 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 								  estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
 	node->ss.ss_currentScanDesc =
-		table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+		table_beginscan_parallel(node->ss.ss_currentRelation, pscan, 0);
 }
 
 /* ----------------------------------------------------------------
@@ -407,5 +407,5 @@ ExecSeqScanInitializeWorker(SeqScanState *node,
 
 	pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->ss.ss_currentScanDesc =
-		table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+		table_beginscan_parallel(node->ss.ss_currentRelation, pscan, 0);
 }
diff --git a/src/backend/partitioning/partbounds.c b/src/backend/partitioning/partbounds.c
index 0ca312ac27d..b7c4e6d1071 100644
--- a/src/backend/partitioning/partbounds.c
+++ b/src/backend/partitioning/partbounds.c
@@ -3362,7 +3362,7 @@ check_default_partition_contents(Relation parent, Relation default_rel,
 		econtext = GetPerTupleExprContext(estate);
 		snapshot = RegisterSnapshot(GetLatestSnapshot());
 		tupslot = table_slot_create(part_rel, &estate->es_tupleTable);
-		scan = table_beginscan(part_rel, snapshot, 0, NULL);
+		scan = table_beginscan(part_rel, snapshot, 0, NULL, 0);
 
 		/*
 		 * Switch to per-tuple memory context and reset it for each tuple
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index dd7e11c0ca5..3da2db74e88 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -7186,7 +7186,7 @@ get_actual_variable_endpoint(Relation heapRel,
 
 	index_scan = index_beginscan(heapRel, indexRel,
 								 &SnapshotNonVacuumable, NULL,
-								 1, 0);
+								 1, 0, 0);
 	/* Set it up for index-only scan */
 	index_scan->xs_want_itup = true;
 	index_rescan(index_scan, scankeys, 1, NULL, 0);
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index 4c0429cc613..3934fa44793 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -156,7 +156,7 @@ extern IndexScanDesc index_beginscan(Relation heapRelation,
 									 Relation indexRelation,
 									 Snapshot snapshot,
 									 IndexScanInstrumentation *instrument,
-									 int nkeys, int norderbys);
+									 int nkeys, int norderbys, uint32 flags);
 extern IndexScanDesc index_beginscan_bitmap(Relation indexRelation,
 											Snapshot snapshot,
 											IndexScanInstrumentation *instrument,
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 06084752245..e881e4f82a0 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -420,7 +420,7 @@ typedef struct TableAmRoutine
 	 *
 	 * Tuples for an index scan can then be fetched via index_fetch_tuple.
 	 */
-	struct IndexFetchTableData *(*index_fetch_begin) (Relation rel);
+	struct IndexFetchTableData *(*index_fetch_begin) (Relation rel, uint32 flags);
 
 	/*
 	 * Reset index fetch. Typically this will release cross index fetch
@@ -894,9 +894,9 @@ table_beginscan_common(Relation rel, Snapshot snapshot, int nkeys,
  */
 static inline TableScanDesc
 table_beginscan(Relation rel, Snapshot snapshot,
-				int nkeys, ScanKeyData *key)
+				int nkeys, ScanKeyData *key, uint32 flags)
 {
-	uint32		flags = SO_TYPE_SEQSCAN |
+	flags |= SO_TYPE_SEQSCAN |
 		SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;
 
 	return table_beginscan_common(rel, snapshot, nkeys, key, NULL, flags);
@@ -939,9 +939,9 @@ table_beginscan_strat(Relation rel, Snapshot snapshot,
  */
 static inline TableScanDesc
 table_beginscan_bm(Relation rel, Snapshot snapshot,
-				   int nkeys, ScanKeyData *key)
+				   int nkeys, ScanKeyData *key, uint32 flags)
 {
-	uint32		flags = SO_TYPE_BITMAPSCAN | SO_ALLOW_PAGEMODE;
+	flags |= SO_TYPE_BITMAPSCAN | SO_ALLOW_PAGEMODE;
 
 	return table_beginscan_common(rel, snapshot, nkeys, key, NULL, flags);
 }
@@ -1139,7 +1139,8 @@ extern void table_parallelscan_initialize(Relation rel,
  * Caller must hold a suitable lock on the relation.
  */
 extern TableScanDesc table_beginscan_parallel(Relation relation,
-											  ParallelTableScanDesc pscan);
+											  ParallelTableScanDesc pscan,
+											  uint32 flags);
 
 /*
  * Begin a parallel tid range scan. `pscan` needs to have been initialized
@@ -1175,7 +1176,7 @@ table_parallelscan_reinitialize(Relation rel, ParallelTableScanDesc pscan)
  * Tuples for an index scan can then be fetched via table_index_fetch_tuple().
  */
 static inline IndexFetchTableData *
-table_index_fetch_begin(Relation rel)
+table_index_fetch_begin(Relation rel, uint32 flags)
 {
 	/*
 	 * We don't allow scans to be started while CheckXidAlive is set, except
@@ -1185,7 +1186,7 @@ table_index_fetch_begin(Relation rel)
 	if (unlikely(TransactionIdIsValid(CheckXidAlive) && !bsysscan))
 		elog(ERROR, "scan started during logical decoding");
 
-	return rel->rd_tableam->index_fetch_begin(rel);
+	return rel->rd_tableam->index_fetch_begin(rel, flags);
 }
 
 /*
-- 
2.43.0
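
For readers following along, here is a minimal standalone sketch of the
pattern the patch above applies to table_beginscan() and
table_beginscan_bm(): caller-supplied flags are OR'ed with the wrapper's
defaults instead of being replaced by them. All demo_* and DEMO_* names and
bit values are hypothetical stand-ins rather than the definitions in
tableam.h, and the assert is an illustrative sanity check that the patch
does not contain.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; the real ScanOptions live in tableam.h. */
enum demo_scan_options
{
	DEMO_SO_TYPE_SEQSCAN = 1 << 0,
	DEMO_SO_TYPE_BITMAPSCAN = 1 << 1,
	DEMO_SO_ALLOW_STRAT = 1 << 6,
	DEMO_SO_ALLOW_SYNC = 1 << 7,
	DEMO_SO_ALLOW_PAGEMODE = 1 << 8,
	DEMO_SO_CALLER_HINT = 1 << 10	/* hypothetical caller-supplied hint */
};

/* Same shape as table_beginscan(): defaults are added to the caller's flags. */
static uint32_t
demo_seqscan_flags(uint32_t flags)
{
	/* Illustrative check: callers pass hints, not a conflicting scan type. */
	assert((flags & DEMO_SO_TYPE_BITMAPSCAN) == 0);

	flags |= DEMO_SO_TYPE_SEQSCAN |
		DEMO_SO_ALLOW_STRAT | DEMO_SO_ALLOW_SYNC | DEMO_SO_ALLOW_PAGEMODE;

	return flags;
}

/* Same shape as table_beginscan_bm(). */
static uint32_t
demo_bitmapscan_flags(uint32_t flags)
{
	flags |= DEMO_SO_TYPE_BITMAPSCAN | DEMO_SO_ALLOW_PAGEMODE;

	return flags;
}
```

Passing 0, as every call site in the patch above does, yields exactly the
old default flag set, which is why that patch on its own should not change
behavior.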



  [text/x-patch] v35-0016-Pass-down-information-on-table-modification-to-s.patch (8.0K, 17-v35-0016-Pass-down-information-on-table-modification-to-s.patch)
  download | inline diff:
From 1b41b0a89323c45652965d2e11afd729bdb2c1c7 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Mon, 2 Mar 2026 16:31:33 -0500
Subject: [PATCH v35 16/18] Pass down information on table modification to scan
 node

Pass down information to sequential scan, index [only] scan, and bitmap
table scan nodes on whether or not the query modifies the relation being
scanned. A later commit will use this information to update the VM
during on-access pruning only if the relation is not modified by the
query.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Andrey Borodin <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Discussion: https://postgr.es/m/4379FDA3-9446-4E2C-9C15-32EFE8D4F31B%40yandex-team.ru
---
 src/backend/access/heap/heapam_handler.c  |  1 +
 src/backend/executor/nodeBitmapHeapscan.c |  9 +++++++-
 src/backend/executor/nodeIndexonlyscan.c  |  9 +++++++-
 src/backend/executor/nodeIndexscan.c      | 18 ++++++++++++++--
 src/backend/executor/nodeSeqscan.c        | 26 ++++++++++++++++++++---
 src/include/access/heapam.h               |  6 ++++++
 src/include/access/tableam.h              |  2 ++
 7 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index ebe2e87a28b..3a8eb9d8b61 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -86,6 +86,7 @@ heapam_index_fetch_begin(Relation rel, uint32 flags)
 	hscan->xs_base.rel = rel;
 	hscan->xs_cbuf = InvalidBuffer;
 	hscan->xs_vmbuffer = InvalidBuffer;
+	hscan->modifies_base_rel = !(flags & SO_HINT_REL_READ_ONLY);
 
 	return &hscan->xs_base;
 }
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 106bcd3301c..1017676fce0 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -103,11 +103,18 @@ BitmapTableScanSetup(BitmapHeapScanState *node)
 	 */
 	if (!node->ss.ss_currentScanDesc)
 	{
+		uint32		flags = 0;
+
+		if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+						   node->ss.ps.state->es_modified_relids))
+			flags = SO_HINT_REL_READ_ONLY;
+
 		node->ss.ss_currentScanDesc =
 			table_beginscan_bm(node->ss.ss_currentRelation,
 							   node->ss.ps.state->es_snapshot,
 							   0,
-							   NULL, 0);
+							   NULL,
+							   flags);
 	}
 
 	node->ss.ss_currentScanDesc->st.rs_tbmiterator = tbmiterator;
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index cf4d9a4f832..2fe724a323f 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -84,6 +84,12 @@ IndexOnlyNext(IndexOnlyScanState *node)
 
 	if (scandesc == NULL)
 	{
+		uint32		flags = 0;
+
+		if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+						   estate->es_modified_relids))
+			flags = SO_HINT_REL_READ_ONLY;
+
 		/*
 		 * We reach here if the index only scan is not parallel, or if we're
 		 * serially executing an index only scan that was planned to be
@@ -94,7 +100,8 @@ IndexOnlyNext(IndexOnlyScanState *node)
 								   estate->es_snapshot,
 								   &node->ioss_Instrument,
 								   node->ioss_NumScanKeys,
-								   node->ioss_NumOrderByKeys, 0);
+								   node->ioss_NumOrderByKeys,
+								   flags);
 
 		node->ioss_ScanDesc = scandesc;
 
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a7af2f6628a..8730dab7469 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -102,6 +102,12 @@ IndexNext(IndexScanState *node)
 
 	if (scandesc == NULL)
 	{
+		uint32		flags = 0;
+
+		if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+						   estate->es_modified_relids))
+			flags = SO_HINT_REL_READ_ONLY;
+
 		/*
 		 * We reach here if the index scan is not parallel, or if we're
 		 * serially executing an index scan that was planned to be parallel.
@@ -111,7 +117,8 @@ IndexNext(IndexScanState *node)
 								   estate->es_snapshot,
 								   &node->iss_Instrument,
 								   node->iss_NumScanKeys,
-								   node->iss_NumOrderByKeys, 0);
+								   node->iss_NumOrderByKeys,
+								   flags);
 
 		node->iss_ScanDesc = scandesc;
 
@@ -198,6 +205,12 @@ IndexNextWithReorder(IndexScanState *node)
 
 	if (scandesc == NULL)
 	{
+		uint32		flags = 0;
+
+		if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+						   estate->es_modified_relids))
+			flags = SO_HINT_REL_READ_ONLY;
+
 		/*
 		 * We reach here if the index scan is not parallel, or if we're
 		 * serially executing an index scan that was planned to be parallel.
@@ -207,7 +220,8 @@ IndexNextWithReorder(IndexScanState *node)
 								   estate->es_snapshot,
 								   &node->iss_Instrument,
 								   node->iss_NumScanKeys,
-								   node->iss_NumOrderByKeys, 0);
+								   node->iss_NumOrderByKeys,
+								   flags);
 
 		node->iss_ScanDesc = scandesc;
 
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index d9d7ec0516a..336354922a2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -65,13 +65,20 @@ SeqNext(SeqScanState *node)
 
 	if (scandesc == NULL)
 	{
+		uint32		flags = 0;
+
+		if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+						   estate->es_modified_relids))
+			flags = SO_HINT_REL_READ_ONLY;
+
 		/*
 		 * We reach here if the scan is not parallel, or if we're serially
 		 * executing a scan that was planned to be parallel.
 		 */
 		scandesc = table_beginscan(node->ss.ss_currentRelation,
 								   estate->es_snapshot,
-								   0, NULL, 0);
+								   0, NULL, flags);
+
 		node->ss.ss_currentScanDesc = scandesc;
 	}
 
@@ -367,14 +374,20 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 {
 	EState	   *estate = node->ss.ps.state;
 	ParallelTableScanDesc pscan;
+	uint32		flags = 0;
 
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	table_parallelscan_initialize(node->ss.ss_currentRelation,
 								  pscan,
 								  estate->es_snapshot);
 	shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
+	if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+					   estate->es_modified_relids))
+		flags = SO_HINT_REL_READ_ONLY;
+
 	node->ss.ss_currentScanDesc =
-		table_beginscan_parallel(node->ss.ss_currentRelation, pscan, 0);
+		table_beginscan_parallel(node->ss.ss_currentRelation, pscan,
+								 flags);
 }
 
 /* ----------------------------------------------------------------
@@ -404,8 +417,15 @@ ExecSeqScanInitializeWorker(SeqScanState *node,
 							ParallelWorkerContext *pwcxt)
 {
 	ParallelTableScanDesc pscan;
+	uint32		flags = 0;
+
+	if (!bms_is_member(((Scan *) node->ss.ps.plan)->scanrelid,
+					   node->ss.ps.state->es_modified_relids))
+		flags = SO_HINT_REL_READ_ONLY;
 
 	pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->ss.ss_currentScanDesc =
-		table_beginscan_parallel(node->ss.ss_currentRelation, pscan, 0);
+		table_beginscan_parallel(node->ss.ss_currentRelation,
+								 pscan,
+								 flags);
 }
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 7ef4cbbfb1e..c20218f8190 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -130,6 +130,12 @@ typedef struct IndexFetchHeapData
 
 	/* Current heap block's corresponding page in the visibility map */
 	Buffer		xs_vmbuffer;
+
+	/*
+	 * Some optimizations can only be performed if the query does not modify
+	 * the underlying relation. Track that here.
+	 */
+	bool		modifies_base_rel;
 } IndexFetchHeapData;
 
 /* Result codes for HeapTupleSatisfiesVacuum */
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index e881e4f82a0..599011ba567 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -63,6 +63,8 @@ typedef enum ScanOptions
 
 	/* unregister snapshot at scan end? */
 	SO_TEMP_SNAPSHOT = 1 << 9,
+	/* set if the query doesn't modify the relation */
+	SO_HINT_REL_READ_ONLY = 1 << 10,
 }			ScanOptions;
 
 /*
-- 
2.43.0
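
A rough standalone sketch of how the hint in the patch above is derived
and then consumed, assuming a 64-bit mask as a stand-in for the Bitmapset
in es_modified_relids. The demo_* names are hypothetical; only the logic
(set the hint when scanrelid is not a member, store the inverse in
modifies_base_rel) is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SO_HINT_REL_READ_ONLY (1u << 10)	/* stand-in for the new flag */

/* Stand-in for a Bitmapset of range-table indexes, limited to 1..63. */
typedef uint64_t demo_relid_set;

static bool
demo_is_member(int relid, demo_relid_set set)
{
	assert(relid > 0 && relid < 64);
	return (set & (UINT64_C(1) << relid)) != 0;
}

/* Scan-node side: hint read-only unless the query modifies this relation. */
static uint32_t
demo_scan_hint_flags(int scanrelid, demo_relid_set modified_relids)
{
	uint32_t	flags = 0;

	if (!demo_is_member(scanrelid, modified_relids))
		flags = DEMO_SO_HINT_REL_READ_ONLY;

	return flags;
}

/* Index-fetch side: the patch stores the inverse of the hint. */
static bool
demo_modifies_base_rel(uint32_t flags)
{
	return !(flags & DEMO_SO_HINT_REL_READ_ONLY);
}
```

Because the stored field is the inverse of the hint, any consumer that
wants a "relation is read-only" boolean has to negate modifies_base_rel
again.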



  [text/x-patch] v35-0017-Allow-on-access-pruning-to-set-pages-all-visible.patch (9.9K, 18-v35-0017-Allow-on-access-pruning-to-set-pages-all-visible.patch)
  download | inline diff:
From 88a86dbfc54db38c890718d74419d94f15dade18 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Fri, 27 Feb 2026 16:33:40 -0500
Subject: [PATCH v35 17/18] Allow on-access pruning to set pages all-visible

Many queries do not modify the underlying relation. For such queries, if
on-access pruning occurs during the scan, we can check whether the page
has become all-visible and update the visibility map accordingly.
Previously, only vacuum and COPY FREEZE marked pages as all-visible or
all-frozen.

This commit implements on-access VM setting for sequential scans as well
as for the underlying heap relation in index scans and bitmap heap
scans.

Author: Melanie Plageman <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Kirill Reshke <[email protected]>
Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
---
 src/backend/access/heap/heapam.c              |  3 +-
 src/backend/access/heap/heapam_handler.c      |  6 ++-
 src/backend/access/heap/pruneheap.c           | 41 +++++++++++++++----
 src/backend/access/heap/vacuumlazy.c          |  2 +-
 src/include/access/heapam.h                   | 12 ++++--
 .../t/035_standby_logical_decoding.pl         |  3 +-
 6 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2f9ef87463e..5539bb8c10b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -633,7 +633,8 @@ heap_prepare_pagescan(TableScanDesc sscan)
 	/*
 	 * Prune and repair fragmentation for the whole page, if possible.
 	 */
-	heap_page_prune_opt(scan->rs_base.rs_rd, buffer, &scan->rs_vmbuffer);
+	heap_page_prune_opt(scan->rs_base.rs_rd, buffer, &scan->rs_vmbuffer,
+						(sscan->rs_flags & SO_HINT_REL_READ_ONLY));
 
 	/*
 	 * We must hold share lock on the buffer content while examining tuple
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 3a8eb9d8b61..eb5a1b7bd21 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -147,7 +147,8 @@ heapam_index_fetch_tuple(struct IndexFetchTableData *scan,
 		 */
 		if (prev_buf != hscan->xs_cbuf)
 			heap_page_prune_opt(hscan->xs_base.rel, hscan->xs_cbuf,
-								&hscan->xs_vmbuffer);
+								&hscan->xs_vmbuffer,
+								!hscan->modifies_base_rel);
 	}
 
 	/* Obtain share-lock on the buffer so we can examine visibility */
@@ -2542,7 +2543,8 @@ BitmapHeapScanNextBlock(TableScanDesc scan,
 	/*
 	 * Prune and repair fragmentation for the whole page, if possible.
 	 */
-	heap_page_prune_opt(scan->rs_rd, buffer, &hscan->rs_vmbuffer);
+	heap_page_prune_opt(scan->rs_rd, buffer, &hscan->rs_vmbuffer,
+						scan->rs_flags & SO_HINT_REL_READ_ONLY);
 
 	/*
 	 * We must hold share lock on the buffer content while examining tuple
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index b66d49f4d60..fc2ddcb5ab4 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -44,6 +44,8 @@ typedef struct
 	bool		mark_unused_now;
 	/* whether to attempt freezing tuples */
 	bool		attempt_freeze;
+	/* whether to attempt setting the VM */
+	bool		attempt_set_vm;
 	struct VacuumCutoffs *cutoffs;
 	Relation	relation;
 
@@ -213,7 +215,8 @@ static void page_verify_redirects(Page page);
 
 static bool heap_page_will_freeze(bool did_tuple_hint_fpi, bool do_prune, bool do_hint_prune,
 								  PruneState *prstate);
-static bool heap_page_will_set_vm(PruneState *prstate, PruneReason reason);
+static bool heap_page_will_set_vm(PruneState *prstate, PruneReason reason,
+								  bool do_prune, bool do_freeze);
 static TransactionId get_conflict_xid(bool do_prune, bool do_freeze, bool do_set_vm,
 									  uint8 old_vmbits, uint8 new_vmbits,
 									  TransactionId latest_xid_removed,
@@ -237,7 +240,8 @@ static TransactionId get_conflict_xid(bool do_prune, bool do_freeze, bool do_set
  * pinned. If we find VM corruption during pruning, we will fix it.
  */
 void
-heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
+heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer,
+					bool rel_read_only)
 {
 	Page		page = BufferGetPage(buffer);
 	TransactionId prune_xid;
@@ -319,6 +323,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer, Buffer *vmbuffer)
 			 * current implementation.
 			 */
 			params.options = 0;
+			if (rel_read_only)
+				params.options = HEAP_PAGE_PRUNE_SET_VM;
 
 			heap_page_prune_and_freeze(&params, &presult, &dummy_off_loc,
 									   NULL, NULL);
@@ -375,6 +381,7 @@ prune_freeze_setup(PruneFreezeParams *params,
 	/* cutoffs must be provided if we will attempt freezing */
 	Assert(!(params->options & HEAP_PAGE_PRUNE_FREEZE) || params->cutoffs);
 	prstate->attempt_freeze = (params->options & HEAP_PAGE_PRUNE_FREEZE) != 0;
+	prstate->attempt_set_vm = (params->options & HEAP_PAGE_PRUNE_SET_VM) != 0;
 	prstate->cutoffs = params->cutoffs;
 	prstate->relation = params->relation;
 	prstate->block = BufferGetBlockNumber(params->buffer);
@@ -937,21 +944,37 @@ heap_fix_vm_corruption(PruneState *prstate, OffsetNumber offnum)
  * This function does not actually set the VM bits or page-level visibility
  * hint, PD_ALL_VISIBLE.
  *
+ * This should be called only after do_freeze has been decided (and do_prune
+ * has been set), as these factor into our heuristic-based decision.
+ *
  * Returns true if one or both VM bits should be set and false otherwise.
  */
 static bool
-heap_page_will_set_vm(PruneState *prstate, PruneReason reason)
+heap_page_will_set_vm(PruneState *prstate, PruneReason reason,
+					  bool do_prune, bool do_freeze)
 {
-	/*
-	 * Though on-access pruning maintains prstate->set_all_visible, we don't
-	 * consider setting the VM.
-	 */
-	if (reason == PRUNE_ON_ACCESS)
+	if (!prstate->attempt_set_vm)
 		return false;
 
 	if (!prstate->set_all_visible)
 		return false;
 
+	/*
+	 * If this is an on-access call and we're not actually pruning or
+	 * freezing, avoid setting the visibility map if it would newly dirty the
+	 * heap page or, if the page is already dirty, if doing so would require
+	 * including a full-page image (FPI) of the heap page in the WAL. This
+	 * situation should be rare, as on-access pruning is only attempted when
+	 * pd_prune_xid is valid.
+	 */
+	if (reason == PRUNE_ON_ACCESS && !do_prune && !do_freeze &&
+		(!BufferIsDirty(prstate->buffer) || XLogCheckBufferNeedsBackup(prstate->buffer)))
+	{
+		prstate->set_all_visible = false;
+		prstate->set_all_frozen = false;
+		return false;
+	}
+
 	prstate->new_vmbits = VISIBILITYMAP_ALL_VISIBLE;
 
 	if (prstate->set_all_frozen)
@@ -1166,7 +1189,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
 	Assert(!prstate.set_all_frozen || prstate.set_all_visible);
 	Assert(!prstate.set_all_visible || (prstate.lpdead_items == 0));
 
-	do_set_vm = heap_page_will_set_vm(&prstate, params->reason);
+	do_set_vm = heap_page_will_set_vm(&prstate, params->reason, do_prune, do_freeze);
 
 	/*
 	 * new_vmbits should be 0 regardless of whether or not the page is
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ef607945a93..ab76800b4df 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2007,7 +2007,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		.buffer = buf,
 		.vmbuffer = vmbuffer,
 		.reason = PRUNE_VACUUM_SCAN,
-		.options = HEAP_PAGE_PRUNE_FREEZE,
+		.options = HEAP_PAGE_PRUNE_FREEZE | HEAP_PAGE_PRUNE_SET_VM,
 		.vistest = vacrel->vistest,
 		.cutoffs = &vacrel->cutoffs,
 	};
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index c20218f8190..0a3e3df9b2d 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -42,6 +42,7 @@
 /* "options" flag bits for heap_page_prune_and_freeze */
 #define HEAP_PAGE_PRUNE_MARK_UNUSED_NOW		(1 << 0)
 #define HEAP_PAGE_PRUNE_FREEZE				(1 << 1)
+#define HEAP_PAGE_PRUNE_SET_VM				(1 << 2)
 
 typedef struct BulkInsertStateData *BulkInsertState;
 typedef struct GlobalVisState GlobalVisState;
@@ -96,7 +97,8 @@ typedef struct HeapScanDescData
 
 	/*
 	 * For sequential scans and bitmap heap scans. The current heap block's
-	 * corresponding page in the visibility map.
+	 * corresponding page in the visibility map. If the relation is not
+	 * modified by the query, on-access pruning may set the VM.
 	 */
 	Buffer		rs_vmbuffer;
 
@@ -128,7 +130,11 @@ typedef struct IndexFetchHeapData
 	 */
 	Buffer		xs_cbuf;
 
-	/* Current heap block's corresponding page in the visibility map */
+	/*
+	 * Current heap block's corresponding page in the visibility map. For
+	 * index scans in queries that do not modify the underlying heap table,
+	 * on-access pruning may set the VM.
+	 */
 	Buffer		xs_vmbuffer;
 
 	/*
@@ -435,7 +441,7 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
 
 /* in heap/pruneheap.c */
 extern void heap_page_prune_opt(Relation relation, Buffer buffer,
-								Buffer *vmbuffer);
+								Buffer *vmbuffer, bool rel_read_only);
 extern void heap_page_prune_and_freeze(PruneFreezeParams *params,
 									   PruneFreezeResult *presult,
 									   OffsetNumber *off_loc,
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index d264a698ff6..a5536ba4ff6 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -296,6 +296,7 @@ wal_level = 'logical'
 max_replication_slots = 4
 max_wal_senders = 4
 autovacuum = off
+hot_standby_feedback = on
 });
 $node_primary->dump_info;
 $node_primary->start;
@@ -748,7 +749,7 @@ check_pg_recvlogical_stderr($handle,
 $logstart = -s $node_standby->logfile;
 
 reactive_slots_change_hfs_and_wait_for_xmins('shared_row_removal_',
-	'no_conflict_', 0, 1);
+	'no_conflict_', 1, 0);
 
 # This should not trigger a conflict
 wait_until_vacuum_can_remove(
-- 
2.43.0
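
The decision order in heap_page_will_set_vm() after the patch above can be
sketched standalone as follows. This is a simplified model under stated
assumptions: the demo_* types are hypothetical, the two buffer checks
(BufferIsDirty() and XLogCheckBufferNeedsBackup() in the diff) are reduced
to plain booleans, and the side effects on set_all_visible, set_all_frozen
and new_vmbits are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
	DEMO_PRUNE_ON_ACCESS,
	DEMO_PRUNE_VACUUM_SCAN
} demo_prune_reason;

typedef struct
{
	bool		attempt_set_vm;		/* caller passed the SET_VM option */
	bool		set_all_visible;	/* page was found to be all-visible */
	bool		buffer_is_dirty;	/* stand-in for BufferIsDirty() */
	bool		needs_fpi;			/* stand-in for XLogCheckBufferNeedsBackup() */
} demo_prune_state;

/* Returns true if the VM should be set, following the order in the diff. */
static bool
demo_will_set_vm(const demo_prune_state *st, demo_prune_reason reason,
				 bool do_prune, bool do_freeze)
{
	assert(st != NULL);

	if (!st->attempt_set_vm)
		return false;

	if (!st->set_all_visible)
		return false;

	/*
	 * On-access with nothing else to do: skip if setting the VM would newly
	 * dirty the page, or would force a full-page image for a dirty one.
	 */
	if (reason == DEMO_PRUNE_ON_ACCESS && !do_prune && !do_freeze &&
		(!st->buffer_is_dirty || st->needs_fpi))
		return false;

	return true;
}
```

In this model the extra cost check only ever applies to on-access calls;
a vacuum scan with the option set and an all-visible page always proceeds.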



  [text/x-patch] v35-0018-Set-pd_prune_xid-on-insert.patch (9.3K, 19-v35-0018-Set-pd_prune_xid-on-insert.patch)
  download | inline diff:
From 815a2d10ebc6f672be5508a0c4a98ff866d0d71b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Tue, 29 Jul 2025 16:12:56 -0400
Subject: [PATCH v35 18/18] Set pd_prune_xid on insert
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Now that visibility map (VM) updates can occur during read-only queries,
it makes sense to also set the page’s pd_prune_xid hint during inserts.
This lets heap_page_prune_and_freeze() run the first time a page filled
with newly inserted tuples is read, so that read can set the page
all-visible in the VM.

This change also addresses a long-standing note in heap_insert() and
heap_multi_insert(), which observed that setting pd_prune_xid would
help clean up aborted insertions sooner. Without it, such tuples might
linger until VACUUM, whereas now they can be pruned earlier.

The index killtuples test had to be updated to reflect a larger number
of buffer hits in some access steps. Since pd_prune_xid is now set by the
fill/insert step, on-access pruning can happen during the first access step (before
the DELETE). This is when the VM is extended. After the DELETE, the next
access hits the VM block instead of extending it. Thus, an additional
buffer hit is counted for the table.

Reviewed-by: Chao Li <[email protected]>
---
 src/backend/access/heap/heapam.c              | 31 +++++++++++++------
 src/backend/access/heap/heapam_xlog.c         | 17 +++++++++-
 src/backend/access/heap/pruneheap.c           | 14 ++++-----
 .../modules/index/expected/killtuples.out     |  8 ++---
 4 files changed, 47 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 5539bb8c10b..bb124bc767b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2156,6 +2156,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	TransactionId xid = GetCurrentTransactionId();
 	HeapTuple	heaptup;
 	Buffer		buffer;
+	Page		page;
 	Buffer		vmbuffer = InvalidBuffer;
 	bool		all_visible_cleared = false;
 
@@ -2182,6 +2183,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 									   &vmbuffer, NULL,
 									   0);
 
+	page = BufferGetPage(buffer);
+
 	/*
 	 * We're about to do the actual insert -- but check for conflict first, to
 	 * avoid possibly having to roll back work we've just done.
@@ -2205,25 +2208,29 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	RelationPutHeapTuple(relation, buffer, heaptup,
 						 (options & HEAP_INSERT_SPECULATIVE) != 0);
 
-	if (PageIsAllVisible(BufferGetPage(buffer)))
+	if (PageIsAllVisible(page))
 	{
 		all_visible_cleared = true;
-		PageClearAllVisible(BufferGetPage(buffer));
+		PageClearAllVisible(page);
 		visibilitymap_clear(relation,
 							ItemPointerGetBlockNumber(&(heaptup->t_self)),
 							vmbuffer, VISIBILITYMAP_VALID_BITS);
 	}
 
 	/*
-	 * XXX Should we set PageSetPrunable on this page ?
+	 * Set pd_prune_xid to trigger heap_page_prune_and_freeze() once the page
+	 * is full so that we can set the page all-visible in the VM.
 	 *
-	 * The inserting transaction may eventually abort thus making this tuple
-	 * DEAD and hence available for pruning. Though we don't want to optimize
-	 * for aborts, if no other tuple in this page is UPDATEd/DELETEd, the
-	 * aborted tuple will never be pruned until next vacuum is triggered.
+	 * Setting pd_prune_xid is also handy if the inserting transaction
+	 * eventually aborts making this tuple DEAD and hence available for
+	 * pruning. If no other tuple in this page is UPDATEd/DELETEd, the aborted
+	 * tuple would never otherwise be pruned until next vacuum is triggered.
 	 *
-	 * If you do add PageSetPrunable here, add it in heap_xlog_insert too.
+	 * Don't set it if we are in bootstrap mode or we are inserting a frozen
+	 * tuple.
 	 */
+	if (TransactionIdIsNormal(xid) && !(options & HEAP_INSERT_FROZEN))
+		PageSetPrunable(page, xid);
 
 	MarkBufferDirty(buffer);
 
@@ -2233,7 +2240,6 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 		xl_heap_insert xlrec;
 		xl_heap_header xlhdr;
 		XLogRecPtr	recptr;
-		Page		page = BufferGetPage(buffer);
 		uint8		info = XLOG_HEAP_INSERT;
 		int			bufflags = 0;
 
@@ -2598,8 +2604,13 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 		}
 
 		/*
-		 * XXX Should we set PageSetPrunable on this page ? See heap_insert()
+		 * Set pd_prune_xid. See heap_insert() for more on why we do this when
+		 * inserting tuples. This only makes sense if we aren't already
+		 * setting the page frozen in the VM. We also don't set it in
+		 * bootstrap mode.
 		 */
+		if (!all_frozen_set && TransactionIdIsNormal(xid))
+			PageSetPrunable(page, xid);
 
 		MarkBufferDirty(buffer);
 
diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index df89f93edb4..edd5c946c6a 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -450,6 +450,14 @@ heap_xlog_insert(XLogReaderState *record)
 
 		freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */
 
+		/*
+		 * Set the page prunable to trigger on-access pruning later, which may
+		 * set the page all-visible in the VM. See comments in heap_insert().
+		 */
+		if (TransactionIdIsNormal(XLogRecGetXid(record)) &&
+			!HeapTupleHeaderXminFrozen(htup))
+			PageSetPrunable(page, XLogRecGetXid(record));
+
 		PageSetLSN(page, lsn);
 
 		if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
@@ -599,12 +607,19 @@ heap_xlog_multi_insert(XLogReaderState *record)
 		if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
 			PageClearAllVisible(page);
 
-		/* XLH_INSERT_ALL_FROZEN_SET implies that all tuples are visible */
+		/*
+		 * XLH_INSERT_ALL_FROZEN_SET implies that all tuples are visible. If
+		 * we are not setting the page frozen, then set the page's prunable
+		 * hint so that we trigger on-access pruning later which may set the
+		 * page all-visible in the VM.
+		 */
 		if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
 		{
 			PageSetAllVisible(page);
 			PageClearPrunable(page);
 		}
+		else
+			PageSetPrunable(page, XLogRecGetXid(record));
 
 		MarkBufferDirty(buffer);
 	}
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index fc2ddcb5ab4..72a1c311bd0 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -1904,16 +1904,14 @@ heap_prune_record_unchanged_lp_normal(PruneState *prstate, OffsetNumber offnum)
 			prstate->set_all_visible = false;
 			prstate->set_all_frozen = false;
 
-			/* The page should not be marked all-visible */
-			if (PageIsAllVisible(page))
-				heap_fix_vm_corruption(prstate, offnum);
-
 			/*
-			 * If we wanted to optimize for aborts, we might consider marking
-			 * the page prunable when we see INSERT_IN_PROGRESS.  But we
-			 * don't.  See related decisions about when to mark the page
-			 * prunable in heapam.c.
+			 * Though there is nothing "prunable" on the page, we maintain
+			 * pd_prune_xid for inserts so that we have the opportunity to
+			 * mark them all-visible during the next round of pruning.
 			 */
+			heap_prune_record_prunable(prstate,
+									   HeapTupleHeaderGetXmin(htup),
+									   offnum);
 			break;
 
 		case HEAPTUPLE_DELETE_IN_PROGRESS:
diff --git a/src/test/modules/index/expected/killtuples.out b/src/test/modules/index/expected/killtuples.out
index be7ddd756ef..700144d6783 100644
--- a/src/test/modules/index/expected/killtuples.out
+++ b/src/test/modules/index/expected/killtuples.out
@@ -54,7 +54,7 @@ step flush: SELECT FROM pg_stat_force_next_flush();
 step result: SELECT heap_blks_read + heap_blks_hit - counter.heap_accesses AS new_heap_accesses FROM counter, pg_statio_all_tables WHERE relname = 'kill_prior_tuple';
 new_heap_accesses
 -----------------
-                1
+                2
 (1 row)
 
 step measure: UPDATE counter SET heap_accesses = (SELECT heap_blks_read + heap_blks_hit FROM pg_statio_all_tables WHERE relname = 'kill_prior_tuple');
@@ -130,7 +130,7 @@ step flush: SELECT FROM pg_stat_force_next_flush();
 step result: SELECT heap_blks_read + heap_blks_hit - counter.heap_accesses AS new_heap_accesses FROM counter, pg_statio_all_tables WHERE relname = 'kill_prior_tuple';
 new_heap_accesses
 -----------------
-                1
+                2
 (1 row)
 
 step measure: UPDATE counter SET heap_accesses = (SELECT heap_blks_read + heap_blks_hit FROM pg_statio_all_tables WHERE relname = 'kill_prior_tuple');
@@ -283,7 +283,7 @@ step flush: SELECT FROM pg_stat_force_next_flush();
 step result: SELECT heap_blks_read + heap_blks_hit - counter.heap_accesses AS new_heap_accesses FROM counter, pg_statio_all_tables WHERE relname = 'kill_prior_tuple';
 new_heap_accesses
 -----------------
-                1
+                2
 (1 row)
 
 step measure: UPDATE counter SET heap_accesses = (SELECT heap_blks_read + heap_blks_hit FROM pg_statio_all_tables WHERE relname = 'kill_prior_tuple');
@@ -329,7 +329,7 @@ step flush: SELECT FROM pg_stat_force_next_flush();
 step result: SELECT heap_blks_read + heap_blks_hit - counter.heap_accesses AS new_heap_accesses FROM counter, pg_statio_all_tables WHERE relname = 'kill_prior_tuple';
 new_heap_accesses
 -----------------
-                1
+                2
 (1 row)
 
 step measure: UPDATE counter SET heap_accesses = (SELECT heap_blks_read + heap_blks_hit FROM pg_statio_all_tables WHERE relname = 'kill_prior_tuple');
-- 
2.43.0


