public inbox for [email protected]  
help / color / mirror / Atom feed
From: Melanie Plageman <[email protected]>
To: PostgreSQL Hackers <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Robert Haas <[email protected]>
Subject: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Date: Mon, 23 Jun 2025 16:25:16 -0400
Message-ID: <CAAKRu_ZMw6Npd_qm2KM+FwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g@mail.gmail.com> (raw)

Hi,

The attached patch set eliminates xl_heap_visible, the WAL record
emitted when a block of the heap is set all-visible/frozen in the
visibility map. Instead, it includes the information needed to update
the VM in the WAL record already emitted by the operation modifying
the heap page.

Currently COPY FREEZE and vacuum are the only operations that set the
VM. So, this patch modifies the xl_heap_multi_insert and xl_heap_prune
records.

The result is a dramatic reduction in WAL volume for these operations.
I've included numbers below.

I also think that it makes more sense to include changes to the VM in
the same WAL record as the changes that rendered the page all-visible.
In some cases, we will only set the page all-visible, but that is in
the context of the operation on the heap page which discovered that it
was all-visible. Therefore, I find this to be a clarity as well as a
performance improvement.

This project is also the first step toward setting the VM on-access
for queries which do not modify the page. There are a few design
issues that must be sorted out for that project which I will detail
separately. Note that this patch set currently does not implement
setting the VM on-access.

The attached patch set isn't 100% polished. I think some of the
variable names and comments could use work, but I'd like to validate
the idea of doing this before doing a full polish. This is a summary
of what is in the set:

Patches:
0001 - 0002: cleanup
0003 - 0004: refactoring
0005: COPY FREEZE changes
0006: refactoring
0007: vacuum phase III changes
0008: vacuum phase I empty page changes
0009 - 0012: refactoring
0013: vacuum phase I normal page changes
0014: cleanup

Performance benefits of eliminating xl_heap_visible:

vacuum of table with index (DDL at bottom of email)
--
master -> patch
WAL bytes: 405346 -> 303088 = 25% reduction
WAL records: 6682 -> 4459 = 33% reduction

vacuum of table without index
--
master -> patch
WAL records: 4452 -> 2231 = 50% reduction
WAL bytes: 289016 -> 177978 = 38% reduction

COPY FREEZE of table without index
--
master -> patch
WAL records: 3672777 -> 1854589 = 50% reduction
WAL bytes: 841340339 -> 748545732  = 11% reduction (new pages need a
copy of the whole page)

table for vacuum example:
--
create table foo(a int, b numeric, c numeric) with (autovacuum_enabled= false);
insert into foo select i % 18, repeat('1', 400)::numeric, repeat('2',
400)::numeric from generate_series(1,40000)i;
-- don't make index for no-index case
create index on foo(a);
delete from foo where a = 1;
vacuum (verbose, process_toast false) foo;


copy freeze example:
--
-- create a data file
create table large(a int, b int) with (autovacuum_enabled = false,
fillfactor = 10);
insert into large SELECT generate_series(1,40000000)i, 1;
copy large to 'large.data';

-- example
BEGIN;
create table large(a int, b int) with (autovacuum_enabled = false,
fillfactor = 10);
COPY large FROM 'large.data' WITH (FREEZE);
COMMIT;

- Melanie


Attachments:

  [text/x-patch] v1-0002-Simplify-vacuum-VM-update-logging-counters.patch (2.9K, 2-v1-0002-Simplify-vacuum-VM-update-logging-counters.patch)
  download | inline diff:
From 6cbbdd359ae4de835bbd77369b598885e8a279b2 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 16:12:15 -0400
Subject: [PATCH v1 02/14] Simplify vacuum VM update logging counters

We can simplify the VM counters added in dc6acfd910b8 to
lazy_vacuum_heap_page() and lazy_scan_new_or_empty().

We won't invoke lazy_vacuum_heap_page() unless there are dead line
pointers, so we know the page can't be all-visible.

In lazy_scan_new_or_empty(), we only update the VM if the page-level
hint PD_ALL_VISIBLE is clear, and the VM bit cannot be set if the page
level bit is clear because a subsequent page update would fail to clear
the visibility map bit.

Simplify the logic for determining which log counters to increment based
on this knowledge.
---
 src/backend/access/heap/vacuumlazy.c | 32 +++++++++++-----------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 09416450af9..c8da2f835c4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1900,17 +1900,12 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 										   VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 
-			/*
-			 * If the page wasn't already set all-visible and/or all-frozen in
-			 * the VM, count it as newly set for logging.
-			 */
-			if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
-			{
-				vacrel->vm_new_visible_pages++;
-				vacrel->vm_new_visible_frozen_pages++;
-			}
-			else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-				vacrel->vm_new_frozen_pages++;
+			/* VM bits cannot have been set if PD_ALL_VISIBLE was clear */
+			Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
+			(void) old_vmbits; /* Silence compiler */
+			/* Count the newly all-frozen pages for logging. */
+			vacrel->vm_new_visible_pages++;
+			vacrel->vm_new_visible_frozen_pages++;
 		}
 
 		freespace = PageGetHeapFreeSpace(page);
@@ -2930,20 +2925,17 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 									   vmbuffer, visibility_cutoff_xid,
 									   flags);
 
-		/*
-		 * If the page wasn't already set all-visible and/or all-frozen in the
-		 * VM, count it as newly set for logging.
-		 */
-		if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		/* We know the page should not have been all-visible */
+		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
+		(void) old_vmbits; /* Silence compiler */
+
+		/* Count the newly set VM page for logging */
+		if ((flags & VISIBILITYMAP_ALL_VISIBLE) != 0)
 		{
 			vacrel->vm_new_visible_pages++;
 			if (all_frozen)
 				vacrel->vm_new_visible_frozen_pages++;
 		}
-
-		else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
-				 all_frozen)
-			vacrel->vm_new_frozen_pages++;
 	}
 
 	/* Revert to the previous phase information for error traceback */
-- 
2.34.1



  [text/x-patch] v1-0004-Introduce-unlogged-versions-of-VM-functions.patch (6.0K, 3-v1-0004-Introduce-unlogged-versions-of-VM-functions.patch)
  download | inline diff:
From 9750354f2b7d7bd3afd38fca5e0ca2dd814a19a2 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Tue, 17 Jun 2025 17:22:10 -0400
Subject: [PATCH v1 04/14] Introduce unlogged versions of VM functions

Future commits will eliminate usages of xl_heap_visible and incorporate
setting the VM into the WAL records making other changes to the heap
page. As a step toward this make versions of the functions which update
the VM and its heap-specific wrapper which do not emit their own WAL.

These will be used in follow-on commits.
---
 src/backend/access/heap/heapam.c        | 44 ++++++++++++++++++++++++
 src/backend/access/heap/visibilitymap.c | 45 +++++++++++++++++++++++++
 src/include/access/heapam.h             |  3 ++
 src/include/access/visibilitymap.h      |  2 ++
 4 files changed, 94 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index dc409fd3a60..15dc3d88843 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -7868,6 +7868,50 @@ heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
 							 InvalidXLogRecPtr, vmbuf, cutoff_xid, vmflags, set_heap_lsn);
 }
 
+/*
+ * Ensure the provided heap page is marked PD_ALL_VISIBLE and then set the
+ * provided vmflags in the provided vmbuf.
+ *
+ * Both the heap page and VM page should be pinned and exclusive locked.
+ * You must pass a VM buffer containing the correct page of the map
+ * corresponding to the passed in heap block.
+ *
+ * This should only be called in a critical section that also emits WAL (as
+ * needed) for both heap page changes and VM page changes.
+ */
+uint8
+heap_page_set_vm(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
+				 Buffer vmbuf, uint8 vmflags)
+{
+	Page		heap_page = BufferGetPage(heap_buf);
+
+	Assert(BufferIsValid(heap_buf));
+	Assert(CritSectionCount > 0);
+
+	/* Check that we have the right heap page pinned */
+	if (BufferGetBlockNumber(heap_buf) != heap_blk)
+		elog(ERROR, "wrong heap buffer passed to heap_page_set_vm");
+
+	/*
+	 * We must never end up with the VM bit set and the page-level
+	 * PD_ALL_VISIBLE bit clear. If that were to occur, a subsequent page
+	 * modification would fail to clear the VM bit.
+	 *
+	 * Prior to Postgres 19, it was possible for the page-level bit to be set
+	 * and the VM bit to be clear. This could happen if we crashed after
+	 * setting PD_ALL_VISIBLE but before setting bits in the VM. Since
+	 * Postgres 19, since heap page modifications are done in the same
+	 * critical section as setting the VM bits, that should not longer happen.
+	 */
+	if (!PageIsAllVisible(heap_page))
+	{
+		PageSetAllVisible(heap_page);
+		MarkBufferDirty(heap_buf);
+	}
+
+	return visibilitymap_set_vmbyte(rel, heap_blk, vmbuf, vmflags);
+}
+
 /*
  * heap_tuple_should_freeze
  *
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 45721399122..9f27ace0e1c 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -317,6 +317,51 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 	return status;
 }
 
+/*
+ * Set flags in the VM block contained in the passed in vmBuf.
+ * Caller must have pinned and exclusive locked the correct block of the VM in
+ * vmBuf.
+ * Caller is responsible for WAL logging the changes to the VM buffer and for
+ * making any changes needed to the associated heap page.
+ */
+uint8
+visibilitymap_set_vmbyte(Relation rel, BlockNumber heapBlk,
+						 Buffer vmBuf, uint8 flags)
+{
+	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
+	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
+	uint8		mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
+	Page		page;
+	uint8	   *map;
+	uint8		status;
+
+#ifdef TRACE_VISIBILITYMAP
+	elog(DEBUG1, "vm_set %s %d", RelationGetRelationName(rel), heapBlk);
+#endif
+
+	/* Flags should be valid. Also never clear bits with this function */
+	Assert((flags & VISIBILITYMAP_VALID_BITS) == flags);
+
+	/* Must never set all_frozen bit without also setting all_visible bit */
+	Assert(flags != VISIBILITYMAP_ALL_FROZEN);
+
+	/* Check that we have the right VM page pinned */
+	if (!BufferIsValid(vmBuf) || BufferGetBlockNumber(vmBuf) != mapBlock)
+		elog(ERROR, "wrong VM buffer passed to visibilitymap_set");
+
+	page = BufferGetPage(vmBuf);
+	map = (uint8 *) PageGetContents(page);
+
+	status = (map[mapByte] >> mapOffset) & VISIBILITYMAP_VALID_BITS;
+	if (flags != status)
+	{
+		map[mapByte] |= (flags << mapOffset);
+		MarkBufferDirty(vmBuf);
+	}
+
+	return status;
+}
+
 /*
  *	visibilitymap_get_status - get status of bits
  *
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 9375296062f..5127fdb9c77 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -360,6 +360,9 @@ extern bool heap_tuple_should_freeze(HeapTupleHeader tuple,
 									 TransactionId *NoFreezePageRelfrozenXid,
 									 MultiXactId *NoFreezePageRelminMxid);
 extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
+
+extern uint8 heap_page_set_vm(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
+							  Buffer vmbuf, uint8 vmflags);
 extern uint8 heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
 									  Buffer vmbuf, TransactionId cutoff_xid,
 									  uint8 vmflags);
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index 4fa4f837535..5d0a9417c25 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -37,6 +37,8 @@ extern uint8 visibilitymap_set(Relation rel,
 							   Buffer vmBuf,
 							   TransactionId cutoff_xid,
 							   uint8 flags, bool set_heap_lsn);
+extern uint8 visibilitymap_set_vmbyte(Relation rel, BlockNumber heapBlk,
+									  Buffer vmBuf, uint8 flags);
 extern uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
 extern void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen);
 extern BlockNumber visibilitymap_prepare_truncate(Relation rel,
-- 
2.34.1



  [text/x-patch] v1-0001-Remove-unused-check-in-heap_xlog_insert.patch (1.3K, 4-v1-0001-Remove-unused-check-in-heap_xlog_insert.patch)
  download | inline diff:
From 593d33896dcb618f806b911e80fd448fdacbba0a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Thu, 12 Jun 2025 15:38:37 -0400
Subject: [PATCH v1 01/14] Remove unused check in heap_xlog_insert()

8e03eb92e9ad54e2 reverted the commit 39b66a91bd which allowed freezing
in the heap_insert() code path but did not remove the corresponding
check in heap_xlog_insert(). This code is extraneous but not harmful.
However, cleaning it up makes it very clear that, as of now, we do not
support any freezing of pages in the heap_insert() path.
---
 src/backend/access/heap/heapam_xlog.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index 30f4c2d3c67..fa94e104f1c 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -508,9 +508,8 @@ heap_xlog_insert(XLogReaderState *record)
 		if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
 			PageClearAllVisible(page);
 
-		/* XLH_INSERT_ALL_FROZEN_SET implies that all tuples are visible */
-		if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
-			PageSetAllVisible(page);
+		/* This should not happen in the heap_insert() code path */
+		Assert(!(xlrec->flags & XLH_INSERT_ALL_FROZEN_SET));
 
 		MarkBufferDirty(buffer);
 	}
-- 
2.34.1



  [text/x-patch] v1-0003-Introduce-heap-specific-wrapper-for-visibilitymap.patch (12.8K, 5-v1-0003-Introduce-heap-specific-wrapper-for-visibilitymap.patch)
  download | inline diff:
From 904a31bb8f519f5a9e4b30d9010edf506cddad1f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Tue, 17 Jun 2025 17:06:45 -0400
Subject: [PATCH v1 03/14] Introduce heap-specific wrapper for
 visibilitymap_set()

visibilitymap_set(), which sets bits in the visibility map corresponding
to the heap block of the table passed in, arguably contains some
layering violations.

For example, it sets the heap page's LSN when checksums/wal_log_hints
are enabled. However, the caller may not need to set PD_ALL_VISIBLE
(when it is already set) and thus may not have marked the buffer dirty.
visibilitymap_set() will still set the page LSN in this case, even when
it would have been correct for the caller to *not* mark the buffer
dirty.

Also, every caller that needs to has to remember to set PD_ALL_VISIBLE
and mark the buffer dirty. This commit introduces a wrapper that does
this and a flag to visibilitymap_set() indicating whether or not the
heap page LSN should be set.
---
 src/backend/access/heap/heapam.c        | 62 ++++++++++++++++++-------
 src/backend/access/heap/heapam_xlog.c   |  2 +-
 src/backend/access/heap/vacuumlazy.c    | 60 ++++++------------------
 src/backend/access/heap/visibilitymap.c | 19 ++++----
 src/include/access/heapam.h             |  3 ++
 src/include/access/visibilitymap.h      |  2 +-
 6 files changed, 73 insertions(+), 75 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0dcd6ee817e..dc409fd3a60 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2505,8 +2505,6 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 								BufferGetBlockNumber(buffer),
 								vmbuffer, VISIBILITYMAP_VALID_BITS);
 		}
-		else if (all_frozen_set)
-			PageSetAllVisible(page);
 
 		/*
 		 * XXX Should we set PageSetPrunable on this page ? See heap_insert()
@@ -2632,23 +2630,16 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 
 		/*
 		 * If we've frozen everything on the page, update the visibilitymap.
-		 * We're already holding pin on the vmbuffer.
+		 * We're already holding pin on the vmbuffer. It's fine to use
+		 * InvalidTransactionId here - this is only used when
+		 * HEAP_INSERT_FROZEN is specified, which intentionally violates
+		 * visibility rules.
 		 */
 		if (all_frozen_set)
-		{
-			Assert(PageIsAllVisible(page));
-			Assert(visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer));
-
-			/*
-			 * It's fine to use InvalidTransactionId here - this is only used
-			 * when HEAP_INSERT_FROZEN is specified, which intentionally
-			 * violates visibility rules.
-			 */
-			visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
-							  InvalidXLogRecPtr, vmbuffer,
-							  InvalidTransactionId,
-							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
-		}
+			heap_page_set_vm_and_log(relation, BufferGetBlockNumber(buffer), buffer,
+									 vmbuffer,
+									 InvalidTransactionId,
+									 VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 
 		UnlockReleaseBuffer(buffer);
 		ndone += nthispage;
@@ -7840,6 +7831,43 @@ heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)
 	return false;
 }
 
+/*
+ * Make the heap and VM page changes needed to set a page all-visible.
+ * Do not call in recovery.
+ */
+uint8
+heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
+						 Buffer vmbuf, TransactionId cutoff_xid,
+						 uint8 vmflags)
+{
+	Page		heap_page = BufferGetPage(heap_buf);
+	bool		set_heap_lsn = false;
+
+	Assert(BufferIsValid(heap_buf));
+
+	/* Check that we have the right heap page pinned, if present */
+	if (BufferGetBlockNumber(heap_buf) != heap_blk)
+		elog(ERROR, "wrong heap buffer passed to heap_page_set_vm_and_log");
+
+	/*
+	 * We must never end up with the VM bit set and the page-level
+	 * PD_ALL_VISIBLE bit clear. If that were to occur, a subsequent page
+	 * modification would fail to clear the VM bit. Though it is possible for
+	 * the page-level bit to be set and the VM bit to be clear if checksums
+	 * and wal_log_hints are not enabled.
+	 */
+	if (!PageIsAllVisible(heap_page))
+	{
+		PageSetAllVisible(heap_page);
+		MarkBufferDirty(heap_buf);
+		if (XLogHintBitIsNeeded())
+			set_heap_lsn = true;
+	}
+
+	return visibilitymap_set(rel, heap_blk, heap_buf,
+							 InvalidXLogRecPtr, vmbuf, cutoff_xid, vmflags, set_heap_lsn);
+}
+
 /*
  * heap_tuple_should_freeze
  *
diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index fa94e104f1c..cfd4fc3327d 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -298,7 +298,7 @@ heap_xlog_visible(XLogReaderState *record)
 		visibilitymap_pin(reln, blkno, &vmbuffer);
 
 		visibilitymap_set(reln, blkno, InvalidBuffer, lsn, vmbuffer,
-						  xlrec->snapshotConflictHorizon, vmbits);
+						  xlrec->snapshotConflictHorizon, vmbits, false);
 
 		ReleaseBuffer(vmbuffer);
 		FreeFakeRelcacheEntry(reln);
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c8da2f835c4..5e662936dd7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1892,12 +1892,10 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 				PageGetLSN(page) == InvalidXLogRecPtr)
 				log_newpage_buffer(buf, true);
 
-			PageSetAllVisible(page);
-			old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
-										   InvalidXLogRecPtr,
-										   vmbuffer, InvalidTransactionId,
-										   VISIBILITYMAP_ALL_VISIBLE |
-										   VISIBILITYMAP_ALL_FROZEN);
+			old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
+												  vmbuffer, InvalidTransactionId,
+												  VISIBILITYMAP_ALL_VISIBLE |
+												  VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 
 			/* VM bits cannot have been set if PD_ALL_VISIBLE was clear */
@@ -2074,25 +2072,9 @@ lazy_scan_prune(LVRelState *vacrel,
 			flags |= VISIBILITYMAP_ALL_FROZEN;
 		}
 
-		/*
-		 * It should never be the case that the visibility map page is set
-		 * while the page-level bit is clear, but the reverse is allowed (if
-		 * checksums are not enabled).  Regardless, set both bits so that we
-		 * get back in sync.
-		 *
-		 * NB: If the heap page is all-visible but the VM bit is not set, we
-		 * don't need to dirty the heap page.  However, if checksums are
-		 * enabled, we do need to make sure that the heap page is dirtied
-		 * before passing it to visibilitymap_set(), because it may be logged.
-		 * Given that this situation should only happen in rare cases after a
-		 * crash, it is not worth optimizing.
-		 */
-		PageSetAllVisible(page);
-		MarkBufferDirty(buf);
-		old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
-									   InvalidXLogRecPtr,
-									   vmbuffer, presult.vm_conflict_horizon,
-									   flags);
+		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
+											  vmbuffer, presult.vm_conflict_horizon,
+											  flags);
 
 		/*
 		 * If the page wasn't already set all-visible and/or all-frozen in the
@@ -2164,17 +2146,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	{
 		uint8		old_vmbits;
 
-		/*
-		 * Avoid relying on all_visible_according_to_vm as a proxy for the
-		 * page-level PD_ALL_VISIBLE bit being set, since it might have become
-		 * stale -- even when all_visible is set
-		 */
-		if (!PageIsAllVisible(page))
-		{
-			PageSetAllVisible(page);
-			MarkBufferDirty(buf);
-		}
-
 		/*
 		 * Set the page all-frozen (and all-visible) in the VM.
 		 *
@@ -2183,11 +2154,10 @@ lazy_scan_prune(LVRelState *vacrel,
 		 * was logged when the page's tuples were frozen.
 		 */
 		Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
-		old_vmbits = visibilitymap_set(vacrel->rel, blkno, buf,
-									   InvalidXLogRecPtr,
-									   vmbuffer, InvalidTransactionId,
-									   VISIBILITYMAP_ALL_VISIBLE |
-									   VISIBILITYMAP_ALL_FROZEN);
+		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
+											  vmbuffer, InvalidTransactionId,
+											  VISIBILITYMAP_ALL_VISIBLE |
+											  VISIBILITYMAP_ALL_FROZEN);
 
 		/*
 		 * The page was likely already set all-visible in the VM. However,
@@ -2919,11 +2889,9 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 			flags |= VISIBILITYMAP_ALL_FROZEN;
 		}
 
-		PageSetAllVisible(page);
-		old_vmbits = visibilitymap_set(vacrel->rel, blkno, buffer,
-									   InvalidXLogRecPtr,
-									   vmbuffer, visibility_cutoff_xid,
-									   flags);
+		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buffer,
+											  vmbuffer, visibility_cutoff_xid,
+											  flags);
 
 		/* We know the page should not have been all-visible */
 		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 745a04ef26e..45721399122 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -232,9 +232,10 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf)
  * when a page that is already all-visible is being marked all-frozen.
  *
  * Caller is expected to set the heap page's PD_ALL_VISIBLE bit before calling
- * this function. Except in recovery, caller should also pass the heap
- * buffer. When checksums are enabled and we're not in recovery, we must add
- * the heap buffer to the WAL chain to protect it from being torn.
+ * this function. Except in recovery, caller should also pass the heap buffer.
+ * When checksums are enabled and we're not in recovery, if the heap page was
+ * modified, we must add the heap buffer to the WAL chain to protect it from
+ * being torn.
  *
  * You must pass a buffer containing the correct map page to this function.
  * Call visibilitymap_pin first to pin the right one. This function doesn't do
@@ -245,7 +246,7 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf)
 uint8
 visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 				  XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
-				  uint8 flags)
+				  uint8 flags, bool set_heap_lsn)
 {
 	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
 	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
@@ -259,16 +260,12 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 #endif
 
 	Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
-	Assert(InRecovery || PageIsAllVisible((Page) BufferGetPage(heapBuf)));
+	Assert(!(InRecovery && set_heap_lsn));
 	Assert((flags & VISIBILITYMAP_VALID_BITS) == flags);
 
 	/* Must never set all_frozen bit without also setting all_visible bit */
 	Assert(flags != VISIBILITYMAP_ALL_FROZEN);
 
-	/* Check that we have the right heap page pinned, if present */
-	if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
-		elog(ERROR, "wrong heap buffer passed to visibilitymap_set");
-
 	/* Check that we have the right VM page pinned */
 	if (!BufferIsValid(vmBuf) || BufferGetBlockNumber(vmBuf) != mapBlock)
 		elog(ERROR, "wrong VM buffer passed to visibilitymap_set");
@@ -301,10 +298,12 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 				 * WAL record inserted above, so it would be incorrect to
 				 * update the heap page's LSN.
 				 */
-				if (XLogHintBitIsNeeded())
+				if (set_heap_lsn)
 				{
 					Page		heapPage = BufferGetPage(heapBuf);
 
+					Assert(XLogHintBitIsNeeded());
+
 					PageSetLSN(heapPage, recptr);
 				}
 			}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 3a9424c19c9..9375296062f 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -360,6 +360,9 @@ extern bool heap_tuple_should_freeze(HeapTupleHeader tuple,
 									 TransactionId *NoFreezePageRelfrozenXid,
 									 MultiXactId *NoFreezePageRelminMxid);
 extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
+extern uint8 heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
+									  Buffer vmbuf, TransactionId cutoff_xid,
+									  uint8 vmflags);
 
 extern void simple_heap_insert(Relation relation, HeapTuple tup);
 extern void simple_heap_delete(Relation relation, ItemPointer tid);
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index be21c6dd1a3..4fa4f837535 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -36,7 +36,7 @@ extern uint8 visibilitymap_set(Relation rel,
 							   XLogRecPtr recptr,
 							   Buffer vmBuf,
 							   TransactionId cutoff_xid,
-							   uint8 flags);
+							   uint8 flags, bool set_heap_lsn);
 extern uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
 extern void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen);
 extern BlockNumber visibilitymap_prepare_truncate(Relation rel,
-- 
2.34.1



  [text/x-patch] v1-0005-Eliminate-xl_heap_visible-in-COPY-FREEZE.patch (6.8K, 6-v1-0005-Eliminate-xl_heap_visible-in-COPY-FREEZE.patch)
  download | inline diff:
From b8dacf8fed00b3d1fcf59e61adb1541ba68746a0 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Tue, 17 Jun 2025 17:40:28 -0400
Subject: [PATCH v1 05/14] Eliminate xl_heap_visible in COPY FREEZE

Instead of emitting a separate WAL record for setting the VM bits in
xl_heap_visible, include the required update in the xl_heap_multi_insert
record instead.
---
 src/backend/access/heap/heapam.c       | 42 +++++++++++++++++---------
 src/backend/access/heap/heapam_xlog.c  | 37 ++++++++++++++++++++++-
 src/backend/access/rmgrdesc/heapdesc.c |  5 +++
 3 files changed, 69 insertions(+), 15 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 15dc3d88843..3d9b114b4e8 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2493,9 +2493,6 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 		/*
 		 * If the page is all visible, need to clear that, unless we're only
 		 * going to add further frozen rows to it.
-		 *
-		 * If we're only adding already frozen rows to a previously empty
-		 * page, mark it as all-visible.
 		 */
 		if (PageIsAllVisible(page) && !(options & HEAP_INSERT_FROZEN))
 		{
@@ -2506,6 +2503,22 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 								vmbuffer, VISIBILITYMAP_VALID_BITS);
 		}
 
+		/*
+		 * If we're only adding already frozen rows to a previously empty
+		 * page, mark it as all-visible. And if we've frozen everything on the
+		 * page, update the visibility map. We're already holding a pin on the
+		 * vmbuffer.
+		 */
+		else if (all_frozen_set)
+		{
+			LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
+			heap_page_set_vm(relation,
+							 BufferGetBlockNumber(buffer), buffer,
+							 vmbuffer,
+							 VISIBILITYMAP_ALL_VISIBLE |
+							 VISIBILITYMAP_ALL_FROZEN);
+		}
+
 		/*
 		 * XXX Should we set PageSetPrunable on this page ? See heap_insert()
 		 */
@@ -2552,6 +2565,12 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 			xlrec->flags = 0;
 			if (all_visible_cleared)
 				xlrec->flags = XLH_INSERT_ALL_VISIBLE_CLEARED;
+
+			/*
+			 * We don't have to worry about including a conflict xid in the
+			 * WAL record as HEAP_INSERT_FROZEN intentionally violates
+			 * visibility rules.
+			 */
 			if (all_frozen_set)
 				xlrec->flags = XLH_INSERT_ALL_FROZEN_SET;
 
@@ -2614,7 +2633,10 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 
 			XLogBeginInsert();
 			XLogRegisterData(xlrec, tupledata - scratch.data);
+
 			XLogRegisterBuffer(0, buffer, REGBUF_STANDARD | bufflags);
+			if (all_frozen_set)
+				XLogRegisterBuffer(1, vmbuffer, 0);
 
 			XLogRegisterBufData(0, tupledata, totaldatalen);
 
@@ -2624,22 +2646,14 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 			recptr = XLogInsert(RM_HEAP2_ID, info);
 
 			PageSetLSN(page, recptr);
+			if (all_frozen_set)
+				PageSetLSN(BufferGetPage(vmbuffer), recptr);
 		}
 
 		END_CRIT_SECTION();
 
-		/*
-		 * If we've frozen everything on the page, update the visibilitymap.
-		 * We're already holding pin on the vmbuffer. It's fine to use
-		 * InvalidTransactionId here - this is only used when
-		 * HEAP_INSERT_FROZEN is specified, which intentionally violates
-		 * visibility rules.
-		 */
 		if (all_frozen_set)
-			heap_page_set_vm_and_log(relation, BufferGetBlockNumber(buffer), buffer,
-									 vmbuffer,
-									 InvalidTransactionId,
-									 VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
+			LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
 
 		UnlockReleaseBuffer(buffer);
 		ndone += nthispage;
diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index cfd4fc3327d..a0f3673621a 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -552,6 +552,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
 	int			i;
 	bool		isinit = (XLogRecGetInfo(record) & XLOG_HEAP_INIT_PAGE) != 0;
 	XLogRedoAction action;
+	Buffer		vmbuffer = InvalidBuffer;
 
 	/*
 	 * Insertion doesn't overwrite MVCC data, so no conflict processing is
@@ -572,11 +573,11 @@ heap_xlog_multi_insert(XLogReaderState *record)
 	if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
 	{
 		Relation	reln = CreateFakeRelcacheEntry(rlocator);
-		Buffer		vmbuffer = InvalidBuffer;
 
 		visibilitymap_pin(reln, blkno, &vmbuffer);
 		visibilitymap_clear(reln, blkno, vmbuffer, VISIBILITYMAP_VALID_BITS);
 		ReleaseBuffer(vmbuffer);
+		vmbuffer = InvalidBuffer;
 		FreeFakeRelcacheEntry(reln);
 	}
 
@@ -663,6 +664,40 @@ heap_xlog_multi_insert(XLogReaderState *record)
 	if (BufferIsValid(buffer))
 		UnlockReleaseBuffer(buffer);
 
+	buffer = InvalidBuffer;
+
+	/*
+	 * Now read and update the VM block. Even if we skipped updating the heap
+	 * page due to the file being dropped or truncated later in recovery, it's
+	 * still safe to update the visibility map.  Any WAL record that clears
+	 * the visibility map bit does so before checking the page LSN, so any
+	 * bits that need to be cleared will still be cleared.
+	 *
+	 * It is only okay to set the VM bits without holding the heap page lock
+	 * because we can expect no other writers of this page.
+	 */
+	if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET &&
+		XLogReadBufferForRedoExtended(record, 1, RBM_ZERO_ON_ERROR, false,
+									  &vmbuffer) == BLK_NEEDS_REDO)
+	{
+
+		uint8		old_vmbits = 0;
+		Relation	reln = CreateFakeRelcacheEntry(rlocator);
+
+		visibilitymap_pin(reln, blkno, &vmbuffer);
+		old_vmbits = visibilitymap_set_vmbyte(reln, blkno,
+											  vmbuffer,
+											  VISIBILITYMAP_ALL_VISIBLE |
+											  VISIBILITYMAP_ALL_FROZEN);
+		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
+		(void) old_vmbits; /* Silence compiler */
+		PageSetLSN(BufferGetPage(vmbuffer), lsn);
+		FreeFakeRelcacheEntry(reln);
+	}
+
+	if (BufferIsValid(vmbuffer))
+		UnlockReleaseBuffer(vmbuffer);
+
 	/*
 	 * If the page is running low on free space, update the FSM as well.
 	 * Arbitrarily, our definition of "low" is less than 20%. We can't do much
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index 82b62c95de5..b48d7dc1d24 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -16,6 +16,7 @@
 
 #include "access/heapam_xlog.h"
 #include "access/rmgrdesc_utils.h"
+#include "access/visibilitymapdefs.h"
 #include "storage/standbydefs.h"
 
 /*
@@ -354,6 +355,10 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 		appendStringInfo(buf, "ntuples: %d, flags: 0x%02X", xlrec->ntuples,
 						 xlrec->flags);
 
+		if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
+			appendStringInfo(buf, ", vm_flags: 0x%02X",
+							 VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
+
 		if (XLogRecHasBlockData(record, 0) && !isinit)
 		{
 			appendStringInfoString(buf, ", offsets:");
-- 
2.34.1



  [text/x-patch] v1-0009-Combine-lazy_scan_prune-VM-corruption-cases.patch (7.1K, 7-v1-0009-Combine-lazy_scan_prune-VM-corruption-cases.patch)
  download | inline diff:
From 40308800989edf1821639cd18e0c2630f4417c22 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 28 May 2025 16:04:03 -0400
Subject: [PATCH v1 09/14] Combine lazy_scan_prune VM corruption cases

lazy_scan_prune() updates the visibility map after phase I of heap
vacuuming. It also checks and fixes corruption in the VM. The corruption
cases where mixed in with the normal visibility map update cases.

Careful study of the ordering of the current logic reveals that the
corruption cases can be reordered and extracted into a separate
function. This should result in no additional overhead when compared to
previous execution.

This reordering makes it clear which cases are about corruption and
which cases are normal VM updates. Separating them also makes it
possible to combine the normal cases in a future commit. This will make
the logic easier to understand and allow for further separation of the
logic to allow updating the VM in the same record as pruning and
freezing in phase I.
---
 src/backend/access/heap/vacuumlazy.c | 115 +++++++++++++++++----------
 1 file changed, 74 insertions(+), 41 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b1ff49bee6b..8328cab0955 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -431,6 +431,13 @@ static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
+
+static bool identify_and_fix_vm_corruption(Relation relation,
+										   BlockNumber heap_blk,
+										   Buffer heap_buffer, Page heap_page,
+										   bool heap_blk_known_av,
+										   int64 nlpdead_items,
+										   Buffer vmbuffer);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
 							Buffer vmbuffer, bool all_visible_according_to_vm,
@@ -1940,6 +1947,66 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 	return false;
 }
 
+/*
+ * When updating the visibility map after phase I heap vacuuming, we take the
+ * opportunity to identify and fix any VM corruption.
+ *
+ * heap_blk_known_av is the visibility status of the heap page collected
+ * while finding the next unskippable block in heap_vac_scan_next_block().
+ */
+static bool
+identify_and_fix_vm_corruption(Relation relation,
+							   BlockNumber heap_blk,
+							   Buffer heap_buffer, Page heap_page,
+							   bool heap_blk_known_av,
+							   int64 nlpdead_items,
+							   Buffer vmbuffer)
+{
+	/*
+	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
+	 * page-level bit is clear.  However, it's possible that the bit got
+	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
+	 * with buffer lock before concluding that the VM is corrupt.
+	 */
+	if (heap_blk_known_av && !PageIsAllVisible(heap_page) &&
+		visibilitymap_get_status(relation, heap_blk, &vmbuffer) != 0)
+	{
+		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
+			 RelationGetRelationName(relation), heap_blk);
+		visibilitymap_clear(relation, heap_blk, vmbuffer,
+							VISIBILITYMAP_VALID_BITS);
+		return true;
+	}
+
+	/*
+	 * It's possible for the value returned by
+	 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
+	 * wrong for us to see tuples that appear to not be visible to everyone
+	 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
+	 * never moves backwards, but GetOldestNonRemovableTransactionId() is
+	 * conservative and sometimes returns a value that's unnecessarily small,
+	 * so if we see that contradiction it just means that the tuples that we
+	 * think are not visible to everyone yet actually are, and the
+	 * PD_ALL_VISIBLE flag is correct.
+	 *
+	 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
+	 * however.
+	 */
+	if (nlpdead_items > 0 && PageIsAllVisible(heap_page))
+	{
+		elog(WARNING, "page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
+			 RelationGetRelationName(relation), heap_blk);
+		PageClearAllVisible(heap_page);
+		MarkBufferDirty(heap_buffer);
+		visibilitymap_clear(relation, heap_blk, vmbuffer,
+							VISIBILITYMAP_VALID_BITS);
+		return true;
+	}
+
+	return false;
+}
+
+
 /* qsort comparator for sorting OffsetNumbers */
 static int
 cmpOffsetNumbers(const void *a, const void *b)
@@ -2084,9 +2151,14 @@ lazy_scan_prune(LVRelState *vacrel,
 	/*
 	 * Handle setting visibility map bit based on information from the VM (as
 	 * of last heap_vac_scan_next_block() call), and from all_visible and
-	 * all_frozen variables
+	 * all_frozen variables. Start by looking for any VM corruption.
 	 */
-	if (!all_visible_according_to_vm && presult.all_visible)
+	if (identify_and_fix_vm_corruption(vacrel->rel, blkno, buf, page,
+									   all_visible_according_to_vm, presult.lpdead_items, vmbuffer))
+	{
+		/* Don't update the VM if we just cleared corruption in it */
+	}
+	else if (!all_visible_according_to_vm && presult.all_visible)
 	{
 		uint8		old_vmbits;
 		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
@@ -2122,45 +2194,6 @@ lazy_scan_prune(LVRelState *vacrel,
 		}
 	}
 
-	/*
-	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
-	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
-	 * with buffer lock before concluding that the VM is corrupt.
-	 */
-	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
-	{
-		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
-			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-	}
-
-	/*
-	 * It's possible for the value returned by
-	 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
-	 * wrong for us to see tuples that appear to not be visible to everyone
-	 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
-	 * never moves backwards, but GetOldestNonRemovableTransactionId() is
-	 * conservative and sometimes returns a value that's unnecessarily small,
-	 * so if we see that contradiction it just means that the tuples that we
-	 * think are not visible to everyone yet actually are, and the
-	 * PD_ALL_VISIBLE flag is correct.
-	 *
-	 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
-	 * however.
-	 */
-	else if (presult.lpdead_items > 0 && PageIsAllVisible(page))
-	{
-		elog(WARNING, "page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
-			 vacrel->relname, blkno);
-		PageClearAllVisible(page);
-		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-	}
-
 	/*
 	 * If the all-visible page is all-frozen but not marked as such yet, mark
 	 * it as all-frozen.  Note that all_frozen is only valid if all_visible is
-- 
2.34.1



  [text/x-patch] v1-0006-Make-heap_page_is_all_visible-independent-of-LVRe.patch (5.1K, 8-v1-0006-Make-heap_page_is_all_visible-independent-of-LVRe.patch)
  download | inline diff:
From 797f6f09cf7287af4b4e929e903e115a767df145 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 15:48:51 -0400
Subject: [PATCH v1 06/14] Make heap_page_is_all_visible independent of
 LVRelState

Future commits will use this function inside of pruneheap.c where we do
not have access to the LVRelState. We only need two parameters from the
LVRelState, so just pass those in explicitly.
---
 src/backend/access/heap/vacuumlazy.c | 45 ++++++++++++++++++----------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5e662936dd7..0cf4a69c431 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -464,8 +464,11 @@ static void dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *
 						   int num_offsets);
 static void dead_items_reset(LVRelState *vacrel);
 static void dead_items_cleanup(LVRelState *vacrel);
-static bool heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
-									 TransactionId *visibility_cutoff_xid, bool *all_frozen);
+static bool heap_page_is_all_visible(Relation rel, Buffer buf,
+									 TransactionId OldestXmin,
+									 bool *all_frozen,
+									 TransactionId *visibility_cutoff_xid,
+									 OffsetNumber *logging_offnum);
 static void update_relstats_all_indexes(LVRelState *vacrel);
 static void vacuum_error_callback(void *arg);
 static void update_vacuum_error_info(LVRelState *vacrel,
@@ -2010,8 +2013,9 @@ lazy_scan_prune(LVRelState *vacrel,
 
 		Assert(presult.lpdead_items == 0);
 
-		if (!heap_page_is_all_visible(vacrel, buf,
-									  &debug_cutoff, &debug_all_frozen))
+		if (!heap_page_is_all_visible(vacrel->rel, buf,
+									  vacrel->cutoffs.OldestXmin, &debug_all_frozen,
+									  &debug_cutoff, &vacrel->offnum))
 			Assert(false);
 
 		Assert(presult.all_frozen == debug_all_frozen);
@@ -2877,8 +2881,8 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	 * emitted.
 	 */
 	Assert(!PageIsAllVisible(page));
-	if (heap_page_is_all_visible(vacrel, buffer, &visibility_cutoff_xid,
-								 &all_frozen))
+	if (heap_page_is_all_visible(vacrel->rel, buffer, vacrel->cutoffs.OldestXmin,
+								 &all_frozen, &visibility_cutoff_xid, &vacrel->offnum))
 	{
 		uint8		old_vmbits;
 		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
@@ -3568,9 +3572,16 @@ dead_items_cleanup(LVRelState *vacrel)
 
 /*
  * Check if every tuple in the given page is visible to all current and future
- * transactions. Also return the visibility_cutoff_xid which is the highest
- * xmin amongst the visible tuples.  Set *all_frozen to true if every tuple
- * on this page is frozen.
+ * transactions.
+ *
+ * OldestXmin is used to determine visibility.
+ *
+ * *logging_offnum will have the OffsetNumber of the current tuple being
+ * processed for vacuum's error callback system.
+ *
+ * Return the visibility_cutoff_xid which is the highest xmin amongst the
+ * visible tuples. Sets *all_frozen to true if every tuple on this page is
+ * frozen.
  *
  * This is a stripped down version of lazy_scan_prune().  If you change
  * anything here, make sure that everything stays in sync.  Note that an
@@ -3578,9 +3589,11 @@ dead_items_cleanup(LVRelState *vacrel)
  * introducing new side-effects here.
  */
 static bool
-heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
+heap_page_is_all_visible(Relation rel, Buffer buf,
+						 TransactionId OldestXmin,
+						 bool *all_frozen,
 						 TransactionId *visibility_cutoff_xid,
-						 bool *all_frozen)
+						 OffsetNumber *logging_offnum)
 {
 	Page		page = BufferGetPage(buf);
 	BlockNumber blockno = BufferGetBlockNumber(buf);
@@ -3603,7 +3616,7 @@ heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
 		 * Set the offset number so that we can display it along with any
 		 * error that occurred while processing this tuple.
 		 */
-		vacrel->offnum = offnum;
+		*logging_offnum = offnum;
 		itemid = PageGetItemId(page, offnum);
 
 		/* Unused or redirect line pointers are of no interest */
@@ -3627,9 +3640,9 @@ heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
 
 		tuple.t_data = (HeapTupleHeader) PageGetItem(page, itemid);
 		tuple.t_len = ItemIdGetLength(itemid);
-		tuple.t_tableOid = RelationGetRelid(vacrel->rel);
+		tuple.t_tableOid = RelationGetRelid(rel);
 
-		switch (HeapTupleSatisfiesVacuum(&tuple, vacrel->cutoffs.OldestXmin,
+		switch (HeapTupleSatisfiesVacuum(&tuple, OldestXmin,
 										 buf))
 		{
 			case HEAPTUPLE_LIVE:
@@ -3650,7 +3663,7 @@ heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
 					 */
 					xmin = HeapTupleHeaderGetXmin(tuple.t_data);
 					if (!TransactionIdPrecedes(xmin,
-											   vacrel->cutoffs.OldestXmin))
+											   OldestXmin))
 					{
 						all_visible = false;
 						*all_frozen = false;
@@ -3685,7 +3698,7 @@ heap_page_is_all_visible(LVRelState *vacrel, Buffer buf,
 	}							/* scan along page */
 
 	/* Clear the offset information once we have processed the given page. */
-	vacrel->offnum = InvalidOffsetNumber;
+	*logging_offnum = InvalidOffsetNumber;
 
 	return all_visible;
 }
-- 
2.34.1



  [text/x-patch] v1-0007-Eliminate-xl_heap_visible-from-vacuum-phase-III.patch (25.3K, 9-v1-0007-Eliminate-xl_heap_visible-from-vacuum-phase-III.patch)
  download | inline diff:
From 087566e214b67713390f742dfd825d330d3f8360 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 12:42:13 -0400
Subject: [PATCH v1 07/14] Eliminate xl_heap_visible from vacuum phase III

Instead of emitting a separate xl_heap_visible record for each page that
is all-visible after vacuum's third phase, use the VM-related options
when emitting the xl_heap_prune record with the changes vacuum makes in
phase III.
---
 src/backend/access/heap/heapam_xlog.c  | 148 +++++++++++++++++++---
 src/backend/access/heap/pruneheap.c    |  48 +++++++-
 src/backend/access/heap/vacuumlazy.c   | 164 ++++++++++++++++---------
 src/backend/access/rmgrdesc/heapdesc.c |  13 +-
 src/include/access/heapam.h            |   9 ++
 src/include/access/heapam_xlog.h       |   3 +
 6 files changed, 308 insertions(+), 77 deletions(-)

diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index a0f3673621a..bb6680c0467 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -35,7 +35,8 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 	Buffer		buffer;
 	RelFileLocator rlocator;
 	BlockNumber blkno;
-	XLogRedoAction action;
+	Buffer		vmbuffer = InvalidBuffer;
+	uint8		vmflags = 0;
 
 	XLogRecGetBlockTag(record, 0, &rlocator, NULL, &blkno);
 	memcpy(&xlrec, maindataptr, SizeOfHeapPrune);
@@ -51,10 +52,15 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		   (xlrec.flags & (XLHP_HAS_REDIRECTIONS | XLHP_HAS_DEAD_ITEMS)) == 0);
 
 	/*
-	 * We are about to remove and/or freeze tuples.  In Hot Standby mode,
-	 * ensure that there are no queries running for which the removed tuples
-	 * are still visible or which still consider the frozen xids as running.
-	 * The conflict horizon XID comes after xl_heap_prune.
+	 * After xl_heap_prune is the optional snapshot conflict horizon.
+	 *
+	 * In Hot Standby mode, we must ensure that there are no running queries
+	 * which would conflict with the changes in this record. If pruning, that
+	 * means we cannot remove tuples still visible to transactions on the
+	 * standby. If freezing, that means we cannot freeze tuples with xids that
+	 * are still considered running on the standby. And for setting the VM, we
+	 * cannot do so if the page isn't all-visible to all transactions on the
+	 * standby.
 	 */
 	if ((xlrec.flags & XLHP_HAS_CONFLICT_HORIZON) != 0)
 	{
@@ -70,13 +76,28 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 												rlocator);
 	}
 
+	/* Next are the optionally included vmflags. Copy them out for later use. */
+	if ((xlrec.flags & XLHP_HAS_VMFLAGS) != 0)
+	{
+		memcpy(&vmflags, maindataptr, sizeof(uint8));
+		maindataptr += sizeof(uint8);
+
+		/*
+		 * We don't set VISIBILITYMAP_XLOG_CATALOG_REL in the combined record
+		 * because we already have XLHP_IS_CATALOG_REL.
+		 */
+		Assert((vmflags & VISIBILITYMAP_VALID_BITS) == vmflags);
+		/* Must never set all_frozen bit without also setting all_visible bit */
+		Assert(vmflags != VISIBILITYMAP_ALL_FROZEN);
+	}
+
 	/*
-	 * If we have a full-page image, restore it and we're done.
+	 * If we have a full-page image of the heap block, restore it and we're
+	 * done with the heap block.
 	 */
-	action = XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
-										   (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
-										   &buffer);
-	if (action == BLK_NEEDS_REDO)
+	if (XLogReadBufferForRedoExtended(record, 0, RBM_NORMAL,
+									  (xlrec.flags & XLHP_CLEANUP_LOCK) != 0,
+									  &buffer) == BLK_NEEDS_REDO)
 	{
 		Page		page = (Page) BufferGetPage(buffer);
 		OffsetNumber *redirected;
@@ -89,6 +110,9 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		Size		datalen;
 		xlhp_freeze_plan *plans;
 		OffsetNumber *frz_offsets;
+		bool		do_prune;
+		bool		mark_buffer_dirty;
+		bool		set_heap_lsn;
 		char	   *dataptr = XLogRecGetBlockData(record, 0, &datalen);
 
 		heap_xlog_deserialize_prune_and_freeze(dataptr, xlrec.flags,
@@ -97,11 +121,18 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 											   &ndead, &nowdead,
 											   &nunused, &nowunused);
 
+		do_prune = nredirected > 0 || ndead > 0 || nunused > 0;
+		set_heap_lsn = mark_buffer_dirty = do_prune || nplans > 0;
+
+		/* Ensure the record does something */
+		Assert(do_prune || nplans > 0 ||
+			   vmflags & VISIBILITYMAP_VALID_BITS);
+
 		/*
 		 * Update all line pointers per the record, and repair fragmentation
 		 * if needed.
 		 */
-		if (nredirected > 0 || ndead > 0 || nunused > 0)
+		if (do_prune)
 			heap_page_prune_execute(buffer,
 									(xlrec.flags & XLHP_CLEANUP_LOCK) == 0,
 									redirected, nredirected,
@@ -138,26 +169,78 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		/* There should be no more data */
 		Assert((char *) frz_offsets == dataptr + datalen);
 
+		Assert(BufferIsValid(buffer) &&
+			   BufferGetBlockNumber(buffer) == blkno);
+
+		/*
+		 * Now set PD_ALL_VISIBLE, if required. We'll only do this if we are
+		 * also going to set bits in the VM later.
+		 *
+		 * We must never end up with the VM bit set and the page-level
+		 * PD_ALL_VISIBLE bit clear. If that were to occur, a subsequent page
+		 * modification would fail to clear the VM bit.
+		 *
+		 * Prior to Postgres 19, it was possible for the page-level bit to be
+		 * set and the VM bit to be clear. This could happen if we crashed
+		 * after setting PD_ALL_VISIBLE but before setting bits in the VM.
+		 */
+		if ((vmflags & VISIBILITYMAP_VALID_BITS) && !PageIsAllVisible(page))
+		{
+			PageSetAllVisible(page);
+
+			/*
+			 * Setting PD_ALL_VISIBLE only forces us to update the heap page
+			 * LSN if checksums or wal_log_hints are enabled (in which case we
+			 * must). This exposes us to torn page hazards, but since we're
+			 * not inspecting the existing page contents in any way, we don't
+			 * care.
+			 */
+			set_heap_lsn = XLogHintBitIsNeeded() ? true : set_heap_lsn;
+			mark_buffer_dirty = true;
+		}
+
 		/*
 		 * Note: we don't worry about updating the page's prunability hints.
 		 * At worst this will cause an extra prune cycle to occur soon.
 		 */
 
-		PageSetLSN(page, lsn);
-		MarkBufferDirty(buffer);
+		if (mark_buffer_dirty)
+			MarkBufferDirty(buffer);
+		if (set_heap_lsn)
+			PageSetLSN(page, lsn);
 	}
 
 	/*
-	 * If we released any space or line pointers, update the free space map.
+	 * If we released any space or line pointers or will be setting a page in
+	 * the visibility map, update the free space map.
+	 *
+	 * Even if we are just updating the VM (and thus not freeing up any
+	 * space), we'll still update the FSM for this page. Since FSM is not
+	 * WAL-logged and only updated heuristically, it easily becomes stale in
+	 * standbys.  If the standby is later promoted and runs VACUUM, it will
+	 * skip updating individual free space figures for pages that became
+	 * all-visible (or all-frozen, depending on the vacuum mode,) which is
+	 * troublesome when FreeSpaceMapVacuum propagates too optimistic free
+	 * space values to upper FSM layers; later inserters try to use such pages
+	 * only to find out that they are unusable.  This can cause long stalls
+	 * when there are many such pages.
+	 *
+	 * Forestall those problems by updating FSM's idea about a page that is
+	 * becoming all-visible or all-frozen.
 	 *
 	 * Do this regardless of a full-page image being applied, since the FSM
 	 * data is not in the page anyway.
+	 *
+	 * We want to avoid holding an exclusive lock on the heap buffer while
+	 * doing IO (either of the FSM or the VM), so we'll release the lock on
+	 * the heap buffer before doing either.
 	 */
 	if (BufferIsValid(buffer))
 	{
-		if (xlrec.flags & (XLHP_HAS_REDIRECTIONS |
-						   XLHP_HAS_DEAD_ITEMS |
-						   XLHP_HAS_NOW_UNUSED_ITEMS))
+		if ((xlrec.flags & (XLHP_HAS_REDIRECTIONS |
+							XLHP_HAS_DEAD_ITEMS |
+							XLHP_HAS_NOW_UNUSED_ITEMS)) ||
+			vmflags & VISIBILITYMAP_VALID_BITS)
 		{
 			Size		freespace = PageGetHeapFreeSpace(BufferGetPage(buffer));
 
@@ -168,6 +251,37 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		else
 			UnlockReleaseBuffer(buffer);
 	}
+
+	/*
+	 * Read and update the VM block. Even if we skipped updating the heap page
+	 * due to the file being dropped or truncated later in recovery, it's
+	 * still safe to update the visibility map.  Any WAL record that clears
+	 * the visibility map bit does so before checking the page LSN, so any
+	 * bits that need to be cleared will still be cleared.
+	 *
+	 * Note that it is *only* okay that we do not hold a lock on the heap page
+	 * because we are in recovery and can expect no other writers to clear
+	 * PD_ALL_VISIBLE before we are able to update the VM.
+	 */
+	if (vmflags & VISIBILITYMAP_VALID_BITS &&
+		XLogReadBufferForRedoExtended(record, 1,
+									  RBM_ZERO_ON_ERROR,
+									  false,
+									  &vmbuffer) == BLK_NEEDS_REDO)
+	{
+		uint8		old_vmbits = 0;
+		Relation	reln = CreateFakeRelcacheEntry(rlocator);
+
+		visibilitymap_pin(reln, blkno, &vmbuffer);
+		old_vmbits = visibilitymap_set_vmbyte(reln, blkno, vmbuffer, vmflags);
+		/* Only set VM page LSN if we modified the page */
+		if (old_vmbits != vmflags)
+			PageSetLSN(BufferGetPage(vmbuffer), lsn);
+		FreeFakeRelcacheEntry(reln);
+	}
+
+	if (BufferIsValid(vmbuffer))
+		UnlockReleaseBuffer(vmbuffer);
 }
 
 /*
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a8025889be0..d9ba0f96e34 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -21,6 +21,7 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "access/xloginsert.h"
+#include "access/visibilitymapdefs.h"
 #include "commands/vacuum.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
@@ -835,6 +836,7 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 				conflict_xid = prstate.latest_xid_removed;
 
 			log_heap_prune_and_freeze(relation, buffer,
+									  InvalidBuffer, 0, false,
 									  conflict_xid,
 									  true, reason,
 									  prstate.frozen, prstate.nfrozen,
@@ -2045,12 +2047,23 @@ heap_log_freeze_plan(HeapTupleFreeze *tuples, int ntuples,
  * replaying 'unused' items depends on whether they were all previously marked
  * as dead.
  *
+ * If the VM is being updated, vmflags will contain the bits to set. In this
+ * case, vmbuffer should already have been updated and marked dirty and should
+ * still be pinned and locked.
+ *
+ * set_pd_all_vis indicates that we set PD_ALL_VISIBLE and thus should update
+ * the page LSN when checksums/wal_log_hints are enabled even if we did not
+ * prune or freeze tuples on the page.
+ *
  * Note: This function scribbles on the 'frozen' array.
  *
  * Note: This is called in a critical section, so careful what you do here.
  */
 void
 log_heap_prune_and_freeze(Relation relation, Buffer buffer,
+						  Buffer vmbuffer,
+						  uint8 vmflags,
+						  bool set_pd_all_vis,
 						  TransactionId conflict_xid,
 						  bool cleanup_lock,
 						  PruneReason reason,
@@ -2062,6 +2075,7 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	xl_heap_prune xlrec;
 	XLogRecPtr	recptr;
 	uint8		info;
+	uint8		regbuf_flags;
 
 	/* The following local variables hold data registered in the WAL record: */
 	xlhp_freeze_plan plans[MaxHeapTuplesPerPage];
@@ -2070,8 +2084,19 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	xlhp_prune_items dead_items;
 	xlhp_prune_items unused_items;
 	OffsetNumber frz_offsets[MaxHeapTuplesPerPage];
+	bool		do_prune = nredirected > 0 || ndead > 0 || nunused > 0;
 
 	xlrec.flags = 0;
+	regbuf_flags = REGBUF_STANDARD;
+
+	/*
+	 * We can avoid an FPI if the only modification we are making to the heap
+	 * page is to set PD_ALL_VISIBLE and checksums/wal_log_hints are disabled.
+	 */
+	if (!do_prune &&
+		nfrozen == 0 &&
+		(!set_pd_all_vis || !XLogHintBitIsNeeded()))
+		regbuf_flags |= REGBUF_NO_IMAGE;
 
 	/*
 	 * Prepare data for the buffer.  The arrays are not actually in the
@@ -2079,7 +2104,11 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	 * page image, the arrays can be omitted.
 	 */
 	XLogBeginInsert();
-	XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
+	XLogRegisterBuffer(0, buffer, regbuf_flags);
+
+	if (vmflags & VISIBILITYMAP_VALID_BITS)
+		XLogRegisterBuffer(1, vmbuffer, 0);
+
 	if (nfrozen > 0)
 	{
 		int			nplans;
@@ -2136,6 +2165,8 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	 * Prepare the main xl_heap_prune record.  We already set the XLHP_HAS_*
 	 * flag above.
 	 */
+	if (vmflags & VISIBILITYMAP_VALID_BITS)
+		xlrec.flags |= XLHP_HAS_VMFLAGS;
 	if (RelationIsAccessibleInLogicalDecoding(relation))
 		xlrec.flags |= XLHP_IS_CATALOG_REL;
 	if (TransactionIdIsValid(conflict_xid))
@@ -2150,6 +2181,8 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	XLogRegisterData(&xlrec, SizeOfHeapPrune);
 	if (TransactionIdIsValid(conflict_xid))
 		XLogRegisterData(&conflict_xid, sizeof(TransactionId));
+	if (vmflags & VISIBILITYMAP_VALID_BITS)
+		XLogRegisterData(&vmflags, sizeof(uint8));
 
 	switch (reason)
 	{
@@ -2168,5 +2201,16 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	}
 	recptr = XLogInsert(RM_HEAP2_ID, info);
 
-	PageSetLSN(BufferGetPage(buffer), recptr);
+	if (vmflags & VISIBILITYMAP_VALID_BITS)
+		PageSetLSN(BufferGetPage(vmbuffer), recptr);
+
+	/*
+	 * If pruning or freezing tuples or setting the page all-visible when
+	 * checksums or wal_hint_bits are enabled, we must bump the LSN. Torn
+	 * pages are possible if we update PD_ALL_VISIBLE without bumping the LSN,
+	 * but this is deemed okay for page hint updates.
+	 */
+	if (do_prune || nfrozen > 0 ||
+		(set_pd_all_vis && XLogHintBitIsNeeded()))
+		PageSetLSN(BufferGetPage(buffer), recptr);
 }
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0cf4a69c431..32f21d20194 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -464,11 +464,13 @@ static void dead_items_add(LVRelState *vacrel, BlockNumber blkno, OffsetNumber *
 						   int num_offsets);
 static void dead_items_reset(LVRelState *vacrel);
 static void dead_items_cleanup(LVRelState *vacrel);
-static bool heap_page_is_all_visible(Relation rel, Buffer buf,
-									 TransactionId OldestXmin,
-									 bool *all_frozen,
-									 TransactionId *visibility_cutoff_xid,
-									 OffsetNumber *logging_offnum);
+static bool heap_page_is_all_visible_except_lpdead(Relation rel, Buffer buf,
+												   TransactionId OldestXmin,
+												   OffsetNumber *deadoffsets,
+												   int allowed_num_offsets,
+												   bool *all_frozen,
+												   TransactionId *visibility_cutoff_xid,
+												   OffsetNumber *logging_offnum);
 static void update_relstats_all_indexes(LVRelState *vacrel);
 static void vacuum_error_callback(void *arg);
 static void update_vacuum_error_info(LVRelState *vacrel,
@@ -2817,8 +2819,12 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	OffsetNumber unused[MaxHeapTuplesPerPage];
 	int			nunused = 0;
 	TransactionId visibility_cutoff_xid;
+	TransactionId conflict_xid = InvalidTransactionId;
 	bool		all_frozen;
 	LVSavedErrInfo saved_err_info;
+	uint8		old_vmbits = 0;
+	uint8		vmflags = 0;
+	bool		set_pd_all_vis = false;
 
 	Assert(vacrel->do_index_vacuuming);
 
@@ -2829,6 +2835,20 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 							 VACUUM_ERRCB_PHASE_VACUUM_HEAP, blkno,
 							 InvalidOffsetNumber);
 
+	if (heap_page_is_all_visible_except_lpdead(vacrel->rel, buffer,
+											   vacrel->cutoffs.OldestXmin,
+											   deadoffsets, num_offsets,
+											   &all_frozen, &visibility_cutoff_xid,
+											   &vacrel->offnum))
+	{
+		vmflags |= VISIBILITYMAP_ALL_VISIBLE;
+		if (all_frozen)
+		{
+			vmflags |= VISIBILITYMAP_ALL_FROZEN;
+			Assert(!TransactionIdIsValid(visibility_cutoff_xid));
+		}
+	}
+
 	START_CRIT_SECTION();
 
 	for (int i = 0; i < num_offsets; i++)
@@ -2848,6 +2868,21 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	/* Attempt to truncate line pointer array now */
 	PageTruncateLinePointerArray(page);
 
+	if ((vmflags & VISIBILITYMAP_VALID_BITS) != 0)
+	{
+		Assert(!PageIsAllVisible(page));
+		set_pd_all_vis = true;
+		LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
+		old_vmbits = heap_page_set_vm(vacrel->rel,
+									  blkno, buffer,
+									  vmbuffer, vmflags);
+
+		/* We know the page should not have been all-visible */
+		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
+		(void) old_vmbits; /* Silence compiler */
+		conflict_xid = visibility_cutoff_xid;
+	}
+
 	/*
 	 * Mark buffer dirty before we write WAL.
 	 */
@@ -2857,7 +2892,10 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	if (RelationNeedsWAL(vacrel->rel))
 	{
 		log_heap_prune_and_freeze(vacrel->rel, buffer,
-								  InvalidTransactionId,
+								  vmbuffer,
+								  vmflags,
+								  set_pd_all_vis,
+								  conflict_xid,
 								  false,	/* no cleanup lock required */
 								  PRUNE_VACUUM_CLEANUP,
 								  NULL, 0,	/* frozen */
@@ -2866,48 +2904,14 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 								  unused, nunused);
 	}
 
-	/*
-	 * End critical section, so we safely can do visibility tests (which
-	 * possibly need to perform IO and allocate memory!). If we crash now the
-	 * page (including the corresponding vm bit) might not be marked all
-	 * visible, but that's fine. A later vacuum will fix that.
-	 */
 	END_CRIT_SECTION();
 
-	/*
-	 * Now that we have removed the LP_DEAD items from the page, once again
-	 * check if the page has become all-visible.  The page is already marked
-	 * dirty, exclusively locked, and, if needed, a full page image has been
-	 * emitted.
-	 */
-	Assert(!PageIsAllVisible(page));
-	if (heap_page_is_all_visible(vacrel->rel, buffer, vacrel->cutoffs.OldestXmin,
-								 &all_frozen, &visibility_cutoff_xid, &vacrel->offnum))
+	if ((vmflags & VISIBILITYMAP_ALL_VISIBLE) != 0)
 	{
-		uint8		old_vmbits;
-		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
-
+		LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
+		vacrel->vm_new_visible_pages++;
 		if (all_frozen)
-		{
-			Assert(!TransactionIdIsValid(visibility_cutoff_xid));
-			flags |= VISIBILITYMAP_ALL_FROZEN;
-		}
-
-		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buffer,
-											  vmbuffer, visibility_cutoff_xid,
-											  flags);
-
-		/* We know the page should not have been all-visible */
-		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
-		(void) old_vmbits; /* Silence compiler */
-
-		/* Count the newly set VM page for logging */
-		if ((flags & VISIBILITYMAP_ALL_VISIBLE) != 0)
-		{
-			vacrel->vm_new_visible_pages++;
-			if (all_frozen)
-				vacrel->vm_new_visible_frozen_pages++;
-		}
+			vacrel->vm_new_visible_frozen_pages++;
 	}
 
 	/* Revert to the previous phase information for error traceback */
@@ -3570,6 +3574,25 @@ dead_items_cleanup(LVRelState *vacrel)
 	vacrel->pvs = NULL;
 }
 
+/*
+ * Wrapper for heap_page_is_all_visible_except_lpdead() which can be used for
+ * callers that expect no LP_DEAD on the page.
+ */
+bool
+heap_page_is_all_visible(Relation rel, Buffer buf,
+						 TransactionId OldestXmin,
+						 bool *all_frozen,
+						 TransactionId *visibility_cutoff_xid,
+						 OffsetNumber *logging_offnum)
+{
+
+	return heap_page_is_all_visible_except_lpdead(rel, buf, OldestXmin,
+												  NULL, 0,
+												  all_frozen,
+												  visibility_cutoff_xid,
+												  logging_offnum);
+}
+
 /*
  * Check if every tuple in the given page is visible to all current and future
  * transactions.
@@ -3583,23 +3606,35 @@ dead_items_cleanup(LVRelState *vacrel)
  * visible tuples. Sets *all_frozen to true if every tuple on this page is
  * frozen.
  *
- * This is a stripped down version of lazy_scan_prune().  If you change
- * anything here, make sure that everything stays in sync.  Note that an
- * assertion calls us to verify that everybody still agrees.  Be sure to avoid
- * introducing new side-effects here.
+ * deadoffsets are the offsets we know about and are about to set LP_UNUSED.
+ * allowed_num_offsets is the number of those. As long as the LP_DEAD items we
+ * encounter on the page match those exactly, we can set the page all-visible
+ * in the VM.
+ *
+ * Callers looking to verify that the page is all-visible can call
+ * heap_page_is_all_visible().
+ *
+ * This is similar logic to that in heap_prune_record_unchanged_lp_normal() If
+ * you change anything here, make sure that everything stays in sync.  Note
+ * that an assertion calls us to verify that everybody still agrees.  Be sure
+ * to avoid introducing new side-effects here.
  */
 static bool
-heap_page_is_all_visible(Relation rel, Buffer buf,
-						 TransactionId OldestXmin,
-						 bool *all_frozen,
-						 TransactionId *visibility_cutoff_xid,
-						 OffsetNumber *logging_offnum)
+heap_page_is_all_visible_except_lpdead(Relation rel, Buffer buf,
+									   TransactionId OldestXmin,
+									   OffsetNumber *deadoffsets,
+									   int allowed_num_offsets,
+									   bool *all_frozen,
+									   TransactionId *visibility_cutoff_xid,
+									   OffsetNumber *logging_offnum)
 {
 	Page		page = BufferGetPage(buf);
 	BlockNumber blockno = BufferGetBlockNumber(buf);
 	OffsetNumber offnum,
 				maxoff;
 	bool		all_visible = true;
+	OffsetNumber current_dead_offsets[MaxHeapTuplesPerPage];
+	size_t		current_num_offsets = 0;
 
 	*visibility_cutoff_xid = InvalidTransactionId;
 	*all_frozen = true;
@@ -3631,9 +3666,8 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 		 */
 		if (ItemIdIsDead(itemid))
 		{
-			all_visible = false;
-			*all_frozen = false;
-			break;
+			current_dead_offsets[current_num_offsets++] = offnum;
+			continue;
 		}
 
 		Assert(ItemIdIsNormal(itemid));
@@ -3700,7 +3734,23 @@ heap_page_is_all_visible(Relation rel, Buffer buf,
 	/* Clear the offset information once we have processed the given page. */
 	*logging_offnum = InvalidOffsetNumber;
 
-	return all_visible;
+	/* If we already know it's not all-visible, return false */
+	if (!all_visible)
+		return false;
+
+	/* If we weren't allowed any dead offsets, we're done */
+	if (allowed_num_offsets == 0)
+		return current_num_offsets == 0;
+
+	/* If the number of dead offsets has changed, that's wrong */
+	if (current_num_offsets != allowed_num_offsets)
+		return false;
+
+	Assert(deadoffsets);
+
+	/* The dead offsets must be the same dead offsets */
+	return memcmp(current_dead_offsets, deadoffsets,
+				  allowed_num_offsets * sizeof(OffsetNumber)) == 0;
 }
 
 /*
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index b48d7dc1d24..d6c86ccac20 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -266,6 +266,7 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 {
 	char	   *rec = XLogRecGetData(record);
 	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+	char	   *maindataptr = rec + SizeOfHeapPrune;
 
 	info &= XLOG_HEAP_OPMASK;
 	if (info == XLOG_HEAP2_PRUNE_ON_ACCESS ||
@@ -278,7 +279,8 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 		{
 			TransactionId conflict_xid;
 
-			memcpy(&conflict_xid, rec + SizeOfHeapPrune, sizeof(TransactionId));
+			memcpy(&conflict_xid, maindataptr, sizeof(TransactionId));
+			maindataptr += sizeof(TransactionId);
 
 			appendStringInfo(buf, "snapshotConflictHorizon: %u",
 							 conflict_xid);
@@ -287,6 +289,15 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 		appendStringInfo(buf, ", isCatalogRel: %c",
 						 xlrec->flags & XLHP_IS_CATALOG_REL ? 'T' : 'F');
 
+		if (xlrec->flags & XLHP_HAS_VMFLAGS)
+		{
+			uint8		vmflags;
+
+			memcpy(&vmflags, maindataptr, sizeof(uint8));
+			maindataptr += sizeof(uint8);
+			appendStringInfo(buf, ", vm_flags: 0x%02X", vmflags);
+		}
+
 		if (XLogRecHasBlockData(record, 0))
 		{
 			Size		datalen;
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 5127fdb9c77..d2ac380bb64 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -343,6 +343,12 @@ extern void heap_inplace_update_and_unlock(Relation relation,
 										   Buffer buffer);
 extern void heap_inplace_unlock(Relation relation,
 								HeapTuple oldtup, Buffer buffer);
+
+extern bool heap_page_is_all_visible(Relation rel, Buffer buf,
+									 TransactionId OldestXmin,
+									 bool *all_frozen,
+									 TransactionId *visibility_cutoff_xid,
+									 OffsetNumber *logging_offnum);
 extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple,
 									  const struct VacuumCutoffs *cutoffs,
 									  HeapPageFreeze *pagefrz,
@@ -393,6 +399,9 @@ extern void heap_page_prune_execute(Buffer buffer, bool lp_truncate_only,
 									OffsetNumber *nowunused, int nunused);
 extern void heap_get_root_tuples(Page page, OffsetNumber *root_offsets);
 extern void log_heap_prune_and_freeze(Relation relation, Buffer buffer,
+									  Buffer vmbuffer,
+									  uint8 vmflags,
+									  bool vm_modified_heap_page,
 									  TransactionId conflict_xid,
 									  bool cleanup_lock,
 									  PruneReason reason,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 277df6b3cf0..ceae9c083ff 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -295,6 +295,9 @@ typedef struct xl_heap_prune
 
 #define SizeOfHeapPrune (offsetof(xl_heap_prune, flags) + sizeof(uint8))
 
+/* If the record should update the VM, this is the new value */
+#define		XLHP_HAS_VMFLAGS			(1 << 0)
+
 /* to handle recovery conflict during logical decoding on standby */
 #define		XLHP_IS_CATALOG_REL			(1 << 1)
 
-- 
2.34.1



  [text/x-patch] v1-0008-Use-xl_heap_prune-record-for-setting-empty-pages-.patch (6.4K, 10-v1-0008-Use-xl_heap_prune-record-for-setting-empty-pages-.patch)
  download | inline diff:
From 52dada80db2b5f5f6e5810c633d953d03ad10c05 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 12:42:19 -0400
Subject: [PATCH v1 08/14] Use xl_heap_prune record for setting empty pages
 all-visible

As part of a project to eliminate xl_heap_visible records, eliminate
their usage in phase I vacuum of empty pages.
---
 src/backend/access/heap/pruneheap.c  | 14 ++++--
 src/backend/access/heap/vacuumlazy.c | 64 ++++++++++++++++++----------
 src/include/access/heapam.h          |  1 +
 3 files changed, 54 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index d9ba0f96e34..97e51f78854 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -836,6 +836,7 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 				conflict_xid = prstate.latest_xid_removed;
 
 			log_heap_prune_and_freeze(relation, buffer,
+									  false,
 									  InvalidBuffer, 0, false,
 									  conflict_xid,
 									  true, reason,
@@ -2051,6 +2052,9 @@ heap_log_freeze_plan(HeapTupleFreeze *tuples, int ntuples,
  * case, vmbuffer should already have been updated and marked dirty and should
  * still be pinned and locked.
  *
+ * force_heap_fpi indicates that a full page image of the heap block should be
+ * forced.
+ *
  * set_pd_all_vis indicates that we set PD_ALL_VISIBLE and thus should update
  * the page LSN when checksums/wal_log_hints are enabled even if we did not
  * prune or freeze tuples on the page.
@@ -2061,6 +2065,7 @@ heap_log_freeze_plan(HeapTupleFreeze *tuples, int ntuples,
  */
 void
 log_heap_prune_and_freeze(Relation relation, Buffer buffer,
+						  bool force_heap_fpi,
 						  Buffer vmbuffer,
 						  uint8 vmflags,
 						  bool set_pd_all_vis,
@@ -2089,13 +2094,16 @@ log_heap_prune_and_freeze(Relation relation, Buffer buffer,
 	xlrec.flags = 0;
 	regbuf_flags = REGBUF_STANDARD;
 
+	if (force_heap_fpi)
+		regbuf_flags |= REGBUF_FORCE_IMAGE;
+
 	/*
 	 * We can avoid an FPI if the only modification we are making to the heap
 	 * page is to set PD_ALL_VISIBLE and checksums/wal_log_hints are disabled.
 	 */
-	if (!do_prune &&
-		nfrozen == 0 &&
-		(!set_pd_all_vis || !XLogHintBitIsNeeded()))
+	else if (!do_prune &&
+			 nfrozen == 0 &&
+			 (!set_pd_all_vis || !XLogHintBitIsNeeded()))
 		regbuf_flags |= REGBUF_NO_IMAGE;
 
 	/*
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 32f21d20194..b1ff49bee6b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1850,6 +1850,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 	if (PageIsEmpty(page))
 	{
+
 		/*
 		 * It seems likely that caller will always be able to get a cleanup
 		 * lock on an empty page.  But don't take any chances -- escalate to
@@ -1877,35 +1878,53 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 		 */
 		if (!PageIsAllVisible(page))
 		{
-			uint8		old_vmbits;
+			uint8		old_vmbits = 0;
+			uint8		new_vmbits = 0;
 
-			START_CRIT_SECTION();
+			new_vmbits = VISIBILITYMAP_ALL_VISIBLE |
+				VISIBILITYMAP_ALL_FROZEN;
 
-			/* mark buffer dirty before writing a WAL record */
-			MarkBufferDirty(buf);
+			START_CRIT_SECTION();
 
-			/*
-			 * It's possible that another backend has extended the heap,
-			 * initialized the page, and then failed to WAL-log the page due
-			 * to an ERROR.  Since heap extension is not WAL-logged, recovery
-			 * might try to replay our record setting the page all-visible and
-			 * find that the page isn't initialized, which will cause a PANIC.
-			 * To prevent that, check whether the page has been previously
-			 * WAL-logged, and if not, do that now.
-			 */
-			if (RelationNeedsWAL(vacrel->rel) &&
-				PageGetLSN(page) == InvalidXLogRecPtr)
-				log_newpage_buffer(buf, true);
-
-			old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
-												  vmbuffer, InvalidTransactionId,
-												  VISIBILITYMAP_ALL_VISIBLE |
-												  VISIBILITYMAP_ALL_FROZEN);
-			END_CRIT_SECTION();
+			LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
+			old_vmbits = heap_page_set_vm(vacrel->rel, blkno, buf,
+										  vmbuffer, new_vmbits);
 
 			/* VM bits cannot have been set if PD_ALL_VISIBLE was clear */
 			Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
 			(void) old_vmbits; /* Silence compiler */
+
+			/* Should have set PD_ALL_VISIBLE and marked buf dirty */
+			Assert(BufferIsDirty(buf));
+
+			if (RelationNeedsWAL(vacrel->rel))
+			{
+				/*
+				 * It's possible that another backend has extended the heap,
+				 * initialized the page, and then failed to WAL-log the page
+				 * due to an ERROR.  Since heap extension is not WAL-logged,
+				 * recovery might try to replay our record setting the page
+				 * all-visible and find that the page isn't initialized, which
+				 * will cause a PANIC. To prevent that, if the page hasn't
+				 * been previously WAL-logged, force a heap FPI.
+				 */
+				log_heap_prune_and_freeze(vacrel->rel, buf,
+										  PageGetLSN(page) == InvalidXLogRecPtr,
+										  vmbuffer,
+										  new_vmbits,
+										  true,
+										  InvalidTransactionId,
+										  false, PRUNE_VACUUM_SCAN,
+										  NULL, 0,
+										  NULL, 0,
+										  NULL, 0,
+										  NULL, 0);
+			}
+
+			END_CRIT_SECTION();
+
+			LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
+
 			/* Count the newly all-frozen pages for logging. */
 			vacrel->vm_new_visible_pages++;
 			vacrel->vm_new_visible_frozen_pages++;
@@ -2892,6 +2911,7 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	if (RelationNeedsWAL(vacrel->rel))
 	{
 		log_heap_prune_and_freeze(vacrel->rel, buffer,
+								  false,
 								  vmbuffer,
 								  vmflags,
 								  set_pd_all_vis,
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index d2ac380bb64..1fa6eb047fd 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -399,6 +399,7 @@ extern void heap_page_prune_execute(Buffer buffer, bool lp_truncate_only,
 									OffsetNumber *nowunused, int nunused);
 extern void heap_get_root_tuples(Page page, OffsetNumber *root_offsets);
 extern void log_heap_prune_and_freeze(Relation relation, Buffer buffer,
+									  bool force_heap_fpi,
 									  Buffer vmbuffer,
 									  uint8 vmflags,
 									  bool vm_modified_heap_page,
-- 
2.34.1



  [text/x-patch] v1-0010-Combine-vacuum-phase-I-VM-update-cases.patch (4.2K, 11-v1-0010-Combine-vacuum-phase-I-VM-update-cases.patch)
  download | inline diff:
From 00d6a64f3c5fa4d87e59968b636941846e6c542b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 28 May 2025 16:35:36 -0400
Subject: [PATCH v1 10/14] Combine vacuum phase I VM update cases

After phase I of vacuum we update the VM, either setting the VM bits
when all bits are currently unset or setting just the frozen bit when
the all-visible bit is already set. Those cases had a lot of duplicated
code. Combine them. This is simpler to understand and also allows makes
the code compact enough to start using to update the VM while pruning
and freezing.
---
 src/backend/access/heap/vacuumlazy.c | 71 +++++++++-------------------
 1 file changed, 22 insertions(+), 49 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8328cab0955..fdac36f0835 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2158,11 +2158,26 @@ lazy_scan_prune(LVRelState *vacrel,
 	{
 		/* Don't update the VM if we just cleared corruption in it */
 	}
-	else if (!all_visible_according_to_vm && presult.all_visible)
+
+	/*
+	 * If the page isn't yet marked all-visible in the VM or it is and needs
+	 * to me marked all-frozen, update the VM Note that all_frozen is only
+	 * valid if all_visible is true, so we must check both all_visible and
+	 * all_frozen.
+	 */
+	else if (presult.all_visible &&
+			 (!all_visible_according_to_vm ||
+			  (presult.all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))))
 	{
 		uint8		old_vmbits;
 		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
 
+		/*
+		 * If the page is all-frozen, we can pass InvalidTransactionId as our
+		 * cutoff_xid, since a snapshotConflictHorizon sufficient to make
+		 * everything safe for REDO was logged when the page's tuples were
+		 * frozen.
+		 */
 		if (presult.all_frozen)
 		{
 			Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
@@ -2174,6 +2189,12 @@ lazy_scan_prune(LVRelState *vacrel,
 											  flags);
 
 		/*
+		 * Even if we are only setting the all-frozen bit, there is a small
+		 * chance that the VM was modified sometime between setting
+		 * all_visible_according_to_vm and checking the visibility during
+		 * pruning. Check the return value of old_vmbits to ensure the
+		 * visibility map counters used for logging are accurate.
+		 *
 		 * If the page wasn't already set all-visible and/or all-frozen in the
 		 * VM, count it as newly set for logging.
 		 */
@@ -2193,54 +2214,6 @@ lazy_scan_prune(LVRelState *vacrel,
 			*vm_page_frozen = true;
 		}
 	}
-
-	/*
-	 * If the all-visible page is all-frozen but not marked as such yet, mark
-	 * it as all-frozen.  Note that all_frozen is only valid if all_visible is
-	 * true, so we must check both all_visible and all_frozen.
-	 */
-	else if (all_visible_according_to_vm && presult.all_visible &&
-			 presult.all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
-	{
-		uint8		old_vmbits;
-
-		/*
-		 * Set the page all-frozen (and all-visible) in the VM.
-		 *
-		 * We can pass InvalidTransactionId as our cutoff_xid, since a
-		 * snapshotConflictHorizon sufficient to make everything safe for REDO
-		 * was logged when the page's tuples were frozen.
-		 */
-		Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
-		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
-											  vmbuffer, InvalidTransactionId,
-											  VISIBILITYMAP_ALL_VISIBLE |
-											  VISIBILITYMAP_ALL_FROZEN);
-
-		/*
-		 * The page was likely already set all-visible in the VM. However,
-		 * there is a small chance that it was modified sometime between
-		 * setting all_visible_according_to_vm and checking the visibility
-		 * during pruning. Check the return value of old_vmbits anyway to
-		 * ensure the visibility map counters used for logging are accurate.
-		 */
-		if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
-		{
-			vacrel->vm_new_visible_pages++;
-			vacrel->vm_new_visible_frozen_pages++;
-			*vm_page_frozen = true;
-		}
-
-		/*
-		 * We already checked that the page was not set all-frozen in the VM
-		 * above, so we don't need to test the value of old_vmbits.
-		 */
-		else
-		{
-			vacrel->vm_new_frozen_pages++;
-			*vm_page_frozen = true;
-		}
-	}
 }
 
 /*
-- 
2.34.1



  [text/x-patch] v1-0012-Update-VM-in-pruneheap.c.patch (12.0K, 12-v1-0012-Update-VM-in-pruneheap.c.patch)
  download | inline diff:
From 53357a7e0f61e1ec00323c0cd14c8afb3f655b83 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Mon, 2 Jun 2025 11:04:14 -0400
Subject: [PATCH v1 12/14] Update VM in pruneheap.c

As a step toward updating the VM in the same critical section and WAL
record as pruning and freezing (during phase I of vacuuming), first move
the VM update (still in its own critical section and WAL record) into
heap_page_prune_and_freeze(). This makes review easier.
---
 src/backend/access/heap/pruneheap.c  | 99 +++++++++++++++++++++++-----
 src/backend/access/heap/vacuumlazy.c | 81 +++++------------------
 src/include/access/heapam.h          | 15 +++--
 3 files changed, 106 insertions(+), 89 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 496b70e318f..425dcc77534 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -364,7 +364,8 @@ identify_and_fix_vm_corruption(Relation relation,
 
 /*
  * Prune and repair fragmentation and potentially freeze tuples on the
- * specified page.
+ * specified page. If the page's visibility status has changed, update it in
+ * the VM.
  *
  * Caller must have pin and buffer cleanup lock on the page.  Note that we
  * don't update the FSM information for page on caller's behalf.  Caller might
@@ -440,6 +441,8 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	bool		do_freeze;
 	bool		do_prune;
 	bool		do_hint;
+	uint8		vmflags = 0;
+	uint8		old_vmbits = 0;
 	bool		hint_bit_fpi;
 	int64		fpi_before = pgWalUsage.wal_fpi;
 
@@ -939,7 +942,7 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	 *
 	 * Now that freezing has been finalized, unset all_visible if there are
 	 * any LP_DEAD items on the page.  It needs to reflect the present state
-	 * of the page, as expected by our caller.
+	 * of the page, as expected for updating the visibility map.
 	 */
 	if (prstate.all_visible && prstate.lpdead_items == 0)
 	{
@@ -955,31 +958,91 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	presult->hastup = prstate.hastup;
 
 	/*
-	 * For callers planning to update the visibility map, the conflict horizon
-	 * for that record must be the newest xmin on the page.  However, if the
-	 * page is completely frozen, there can be no conflict and the
-	 * vm_conflict_horizon should remain InvalidTransactionId.  This includes
-	 * the case that we just froze all the tuples; the prune-freeze record
-	 * included the conflict XID already so the caller doesn't need it.
+	 * If updating the visibility map, the conflict horizon for that record
+	 * must be the newest xmin on the page.  However, if the page is
+	 * completely frozen, there can be no conflict and the vm_conflict_horizon
+	 * should remain InvalidTransactionId.  This includes the case that we
+	 * just froze all the tuples; the prune-freeze record included the
+	 * conflict XID already so the VM update record doesn't need it.
 	 */
 	if (presult->all_frozen)
 		presult->vm_conflict_horizon = InvalidTransactionId;
 	else
 		presult->vm_conflict_horizon = prstate.visibility_cutoff_xid;
 
-	presult->lpdead_items = prstate.lpdead_items;
-	/* the presult->deadoffsets array was already filled in */
-
 	/*
-	 * Clear any VM corruption. This does not need to be done in a critical
-	 * section.
+	 * Handle setting visibility map bit based on information from the VM (as
+	 * of last heap_vac_scan_next_block() call), and from all_visible and
+	 * all_frozen variables.
 	 */
-	presult->vm_corruption = false;
 	if (options & HEAP_PAGE_PRUNE_UPDATE_VM)
-		presult->vm_corruption = identify_and_fix_vm_corruption(relation,
-																blockno, buffer, page,
-																blk_known_av,
-																prstate.lpdead_items, vmbuffer);
+	{
+		if (identify_and_fix_vm_corruption(relation,
+										   blockno, buffer, page,
+										   blk_known_av,
+										   prstate.lpdead_items, vmbuffer))
+		{
+			/* If we fix corruption, don't update the VM further */
+		}
+
+		/*
+		 * If the page isn't yet marked all-visible in the VM or it is and
+		 * needs to me marked all-frozen, update the VM Note that all_frozen
+		 * is only valid if all_visible is true, so we must check both
+		 * all_visible and all_frozen.
+		 */
+		else if (presult->all_visible &&
+				 (!blk_known_av ||
+				  (presult->all_frozen && !VM_ALL_FROZEN(relation, blockno, &vmbuffer))))
+		{
+			Assert(prstate.lpdead_items == 0);
+			vmflags = VISIBILITYMAP_ALL_VISIBLE;
+
+			/*
+			 * If the page is all-frozen, we can pass InvalidTransactionId as
+			 * our cutoff_xid, since a snapshotConflictHorizon sufficient to
+			 * make everything safe for REDO was logged when the page's tuples
+			 * were frozen.
+			 */
+			if (presult->all_frozen)
+			{
+				Assert(!TransactionIdIsValid(presult->vm_conflict_horizon));
+				vmflags |= VISIBILITYMAP_ALL_FROZEN;
+			}
+
+			/*
+			 * It's possible for the VM bit to be clear and the page-level bit
+			 * to be set if checksums are not enabled.
+			 *
+			 * And even if we are just planning to update the frozen bit in
+			 * the VM, we shouldn't rely on all_visible_according_to_vm as a
+			 * proxy for the page-level PD_ALL_VISIBLE bit being set, since it
+			 * might have become stale.
+			 *
+			 * If the heap page is all-visible but the VM bit is not set, we
+			 * don't need to dirty the heap page.  However, if checksums are
+			 * enabled, we do need to make sure that the heap page is dirtied
+			 * before passing it to visibilitymap_set(), because it may be
+			 * logged.
+			 */
+			if (!PageIsAllVisible(page) || XLogHintBitIsNeeded())
+			{
+				PageSetAllVisible(page);
+				MarkBufferDirty(buffer);
+			}
+
+			old_vmbits = heap_page_set_vm_and_log(relation, blockno, buffer,
+												  vmbuffer, presult->vm_conflict_horizon,
+												  vmflags);
+		}
+	}
+
+	presult->lpdead_items = prstate.lpdead_items;
+	/* the presult->deadoffsets array was already filled in */
+
+	presult->old_vmbits = old_vmbits;
+	presult->new_vmbits = vmflags;
+
 	if (prstate.freeze)
 	{
 		if (presult->nfrozen > 0)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 592cd455cf4..5806207a674 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1940,7 +1940,6 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 	return false;
 }
 
-
 /* qsort comparator for sorting OffsetNumbers */
 static int
 cmpOffsetNumbers(const void *a, const void *b)
@@ -1956,7 +1955,8 @@ cmpOffsetNumbers(const void *a, const void *b)
  * vmbuffer is the buffer containing the VM block with visibility information
  * for the heap block, blkno. all_visible_according_to_vm is the saved
  * visibility status of the heap block looked up earlier by the caller. We
- * won't rely entirely on this status, as it may be out of date.
+ * won't rely entirely on this status, as it may be out of date. These will be
+ * passed on to heap_page_prune_and_freeze() to use while setting the VM.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1983,6 +1983,7 @@ lazy_scan_prune(LVRelState *vacrel,
 
 	/*
 	 * Prune all HOT-update chains and potentially freeze tuples on this page.
+	 * Then, if the page's visibility status has changed, update the VM.
 	 *
 	 * If the relation has no indexes, we can immediately mark would-be dead
 	 * items LP_UNUSED.
@@ -1991,10 +1992,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * presult.ndeleted.  It should not be confused with presult.lpdead_items;
 	 * presult.lpdead_items's final value can be thought of as the number of
 	 * tuples that were deleted from indexes.
-	 *
-	 * We will update the VM after collecting LP_DEAD items and freezing
-	 * tuples. Pruning will have determined whether or not the page is
-	 * all-visible.
 	 */
 	prune_options = HEAP_PAGE_PRUNE_FREEZE | HEAP_PAGE_PRUNE_UPDATE_VM;
 	if (vacrel->nindexes == 0)
@@ -2086,70 +2083,26 @@ lazy_scan_prune(LVRelState *vacrel,
 	Assert(!presult.all_visible || !(*has_lpdead_items));
 
 	/*
-	 * Handle setting visibility map bit based on information from the VM (as
-	 * of last heap_vac_scan_next_block() call), and from all_visible and
-	 * all_frozen variables.
-	 */
-	if (presult.vm_corruption)
-	{
-		/* Don't update the VM if we just cleared corruption in it */
-	}
-
-	/*
-	 * If the page isn't yet marked all-visible in the VM or it is and needs
-	 * to me marked all-frozen, update the VM Note that all_frozen is only
-	 * valid if all_visible is true, so we must check both all_visible and
-	 * all_frozen.
+	 * For the purposes of logging, count whether or not the page was newly
+	 * set all-visible and, potentially, all-frozen.
 	 */
-	else if (presult.all_visible &&
-			 (!all_visible_according_to_vm ||
-			  (presult.all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))))
+	if ((presult.old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
+		(presult.new_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
 	{
-		uint8		old_vmbits;
-		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
-
-		/*
-		 * If the page is all-frozen, we can pass InvalidTransactionId as our
-		 * cutoff_xid, since a snapshotConflictHorizon sufficient to make
-		 * everything safe for REDO was logged when the page's tuples were
-		 * frozen.
-		 */
-		if (presult.all_frozen)
-		{
-			Assert(!TransactionIdIsValid(presult.vm_conflict_horizon));
-			flags |= VISIBILITYMAP_ALL_FROZEN;
-		}
-
-		old_vmbits = heap_page_set_vm_and_log(vacrel->rel, blkno, buf,
-											  vmbuffer, presult.vm_conflict_horizon,
-											  flags);
-
-		/*
-		 * Even if we are only setting the all-frozen bit, there is a small
-		 * chance that the VM was modified sometime between setting
-		 * all_visible_according_to_vm and checking the visibility during
-		 * pruning. Check the return value of old_vmbits to ensure the
-		 * visibility map counters used for logging are accurate.
-		 *
-		 * If the page wasn't already set all-visible and/or all-frozen in the
-		 * VM, count it as newly set for logging.
-		 */
-		if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
-		{
-			vacrel->vm_new_visible_pages++;
-			if (presult.all_frozen)
-			{
-				vacrel->vm_new_visible_frozen_pages++;
-				*vm_page_frozen = true;
-			}
-		}
-		else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
-				 presult.all_frozen)
+		vacrel->vm_new_visible_pages++;
+		if ((presult.new_vmbits & VISIBILITYMAP_ALL_FROZEN) != 0)
 		{
-			vacrel->vm_new_frozen_pages++;
+			vacrel->vm_new_visible_frozen_pages++;
 			*vm_page_frozen = true;
 		}
 	}
+	else if ((presult.old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 &&
+			 (presult.new_vmbits & VISIBILITYMAP_ALL_FROZEN) != 0)
+	{
+		Assert((presult.new_vmbits & VISIBILITYMAP_ALL_VISIBLE) != 0);
+		vacrel->vm_new_frozen_pages++;
+		*vm_page_frozen = true;
+	}
 }
 
 /*
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 0886867a161..534a63aab31 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -234,20 +234,21 @@ typedef struct PruneFreezeResult
 	int			recently_dead_tuples;
 
 	/*
-	 * all_visible and all_frozen indicate if the all-visible and all-frozen
-	 * bits in the visibility map can be set for this page, after pruning.
+	 * all_visible and all_frozen indicate the status of the page as reflected
+	 * in the visibility map after pruning, freezing, and setting any pages
+	 * all-visible in the visibility map.
 	 *
-	 * vm_conflict_horizon is the newest xmin of live tuples on the page.  The
-	 * caller can use it as the conflict horizon when setting the VM bits.  It
-	 * is only valid if we froze some tuples (nfrozen > 0), and all_frozen is
-	 * true.
+	 * vm_conflict_horizon is the newest xmin of live tuples on the page
+	 * (older than OldestXmin).  It will only be valid if we did not set the
+	 * page all-frozen in the VM.
 	 *
 	 * These are only set if the HEAP_PRUNE_FREEZE option is set.
 	 */
 	bool		all_visible;
 	bool		all_frozen;
 	TransactionId vm_conflict_horizon;
-	bool		vm_corruption;
+	uint8		old_vmbits;
+	uint8		new_vmbits;
 
 	/*
 	 * Whether or not the page makes rel truncation unsafe.  This is set to
-- 
2.34.1



  [text/x-patch] v1-0014-Remove-xl_heap_visible-entirely.patch (19.8K, 13-v1-0014-Remove-xl_heap_visible-entirely.patch)
  download | inline diff:
From 51ce0c717152c316a29ce97edf6dfd8f720c3cba Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 12:30:42 -0400
Subject: [PATCH v1 14/14] Remove xl_heap_visible entirely

There are now no users of this, so eliminate it entirely.

ci-os-only:
---
 src/backend/access/common/bufmask.c      |   3 +-
 src/backend/access/heap/heapam.c         |  83 +------------
 src/backend/access/heap/heapam_xlog.c    | 149 +----------------------
 src/backend/access/heap/visibilitymap.c  | 101 +--------------
 src/backend/access/rmgrdesc/heapdesc.c   |  10 --
 src/backend/replication/logical/decode.c |   1 -
 src/include/access/heapam.h              |   3 -
 src/include/access/heapam_xlog.h         |   6 -
 src/include/access/visibilitymap.h       |  10 +-
 9 files changed, 12 insertions(+), 354 deletions(-)

diff --git a/src/backend/access/common/bufmask.c b/src/backend/access/common/bufmask.c
index bb260cffa68..1fff01383b3 100644
--- a/src/backend/access/common/bufmask.c
+++ b/src/backend/access/common/bufmask.c
@@ -56,8 +56,7 @@ mask_page_hint_bits(Page page)
 
 	/*
 	 * During replay, if the page LSN has advanced past our XLOG record's LSN,
-	 * we don't mark the page all-visible. See heap_xlog_visible() for
-	 * details.
+	 * we don't mark the page all-visible.
 	 */
 	PageClearAllVisible(page);
 }
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3d9b114b4e8..6f134dfd535 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -40,6 +40,7 @@
 #include "access/valid.h"
 #include "access/visibilitymap.h"
 #include "access/xloginsert.h"
+#include "access/xlogutils.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_database_d.h"
 #include "commands/vacuum.h"
@@ -7845,43 +7846,6 @@ heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)
 	return false;
 }
 
-/*
- * Make the heap and VM page changes needed to set a page all-visible.
- * Do not call in recovery.
- */
-uint8
-heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
-						 Buffer vmbuf, TransactionId cutoff_xid,
-						 uint8 vmflags)
-{
-	Page		heap_page = BufferGetPage(heap_buf);
-	bool		set_heap_lsn = false;
-
-	Assert(BufferIsValid(heap_buf));
-
-	/* Check that we have the right heap page pinned, if present */
-	if (BufferGetBlockNumber(heap_buf) != heap_blk)
-		elog(ERROR, "wrong heap buffer passed to heap_page_set_vm_and_log");
-
-	/*
-	 * We must never end up with the VM bit set and the page-level
-	 * PD_ALL_VISIBLE bit clear. If that were to occur, a subsequent page
-	 * modification would fail to clear the VM bit. Though it is possible for
-	 * the page-level bit to be set and the VM bit to be clear if checksums
-	 * and wal_log_hints are not enabled.
-	 */
-	if (!PageIsAllVisible(heap_page))
-	{
-		PageSetAllVisible(heap_page);
-		MarkBufferDirty(heap_buf);
-		if (XLogHintBitIsNeeded())
-			set_heap_lsn = true;
-	}
-
-	return visibilitymap_set(rel, heap_blk, heap_buf,
-							 InvalidXLogRecPtr, vmbuf, cutoff_xid, vmflags, set_heap_lsn);
-}
-
 /*
  * Ensure the provided heap page is marked PD_ALL_VISIBLE and then set the
  * provided vmflags in the provided vmbuf.
@@ -7923,7 +7887,7 @@ heap_page_set_vm(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
 		MarkBufferDirty(heap_buf);
 	}
 
-	return visibilitymap_set_vmbyte(rel, heap_blk, vmbuf, vmflags);
+	return visibilitymap_set(rel, heap_blk, vmbuf, vmflags);
 }
 
 /*
@@ -8865,49 +8829,6 @@ bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate)
 	return nblocksfavorable;
 }
 
-/*
- * Perform XLogInsert for a heap-visible operation.  'block' is the block
- * being marked all-visible, and vm_buffer is the buffer containing the
- * corresponding visibility map block.  Both should have already been modified
- * and dirtied.
- *
- * snapshotConflictHorizon comes from the largest xmin on the page being
- * marked all-visible.  REDO routine uses it to generate recovery conflicts.
- *
- * If checksums or wal_log_hints are enabled, we may also generate a full-page
- * image of heap_buffer. Otherwise, we optimize away the FPI (by specifying
- * REGBUF_NO_IMAGE for the heap buffer), in which case the caller should *not*
- * update the heap page's LSN.
- */
-XLogRecPtr
-log_heap_visible(Relation rel, Buffer heap_buffer, Buffer vm_buffer,
-				 TransactionId snapshotConflictHorizon, uint8 vmflags)
-{
-	xl_heap_visible xlrec;
-	XLogRecPtr	recptr;
-	uint8		flags;
-
-	Assert(BufferIsValid(heap_buffer));
-	Assert(BufferIsValid(vm_buffer));
-
-	xlrec.snapshotConflictHorizon = snapshotConflictHorizon;
-	xlrec.flags = vmflags;
-	if (RelationIsAccessibleInLogicalDecoding(rel))
-		xlrec.flags |= VISIBILITYMAP_XLOG_CATALOG_REL;
-	XLogBeginInsert();
-	XLogRegisterData(&xlrec, SizeOfHeapVisible);
-
-	XLogRegisterBuffer(0, vm_buffer, 0);
-
-	flags = REGBUF_STANDARD;
-	if (!XLogHintBitIsNeeded())
-		flags |= REGBUF_NO_IMAGE;
-	XLogRegisterBuffer(1, heap_buffer, flags);
-
-	recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_VISIBLE);
-
-	return recptr;
-}
 
 /*
  * Perform XLogInsert for a heap-update operation.  Caller must already
diff --git a/src/backend/access/heap/heapam_xlog.c b/src/backend/access/heap/heapam_xlog.c
index bb6680c0467..c64fc39bc01 100644
--- a/src/backend/access/heap/heapam_xlog.c
+++ b/src/backend/access/heap/heapam_xlog.c
@@ -273,7 +273,7 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		Relation	reln = CreateFakeRelcacheEntry(rlocator);
 
 		visibilitymap_pin(reln, blkno, &vmbuffer);
-		old_vmbits = visibilitymap_set_vmbyte(reln, blkno, vmbuffer, vmflags);
+		old_vmbits = visibilitymap_set(reln, blkno, vmbuffer, vmflags);
 		/* Only set VM page LSN if we modified the page */
 		if (old_vmbits != vmflags)
 			PageSetLSN(BufferGetPage(vmbuffer), lsn);
@@ -284,142 +284,6 @@ heap_xlog_prune_freeze(XLogReaderState *record)
 		UnlockReleaseBuffer(vmbuffer);
 }
 
-/*
- * Replay XLOG_HEAP2_VISIBLE records.
- *
- * The critical integrity requirement here is that we must never end up with
- * a situation where the visibility map bit is set, and the page-level
- * PD_ALL_VISIBLE bit is clear.  If that were to occur, then a subsequent
- * page modification would fail to clear the visibility map bit.
- */
-static void
-heap_xlog_visible(XLogReaderState *record)
-{
-	XLogRecPtr	lsn = record->EndRecPtr;
-	xl_heap_visible *xlrec = (xl_heap_visible *) XLogRecGetData(record);
-	Buffer		vmbuffer = InvalidBuffer;
-	Buffer		buffer;
-	Page		page;
-	RelFileLocator rlocator;
-	BlockNumber blkno;
-	XLogRedoAction action;
-
-	Assert((xlrec->flags & VISIBILITYMAP_XLOG_VALID_BITS) == xlrec->flags);
-
-	XLogRecGetBlockTag(record, 1, &rlocator, NULL, &blkno);
-
-	/*
-	 * If there are any Hot Standby transactions running that have an xmin
-	 * horizon old enough that this page isn't all-visible for them, they
-	 * might incorrectly decide that an index-only scan can skip a heap fetch.
-	 *
-	 * NB: It might be better to throw some kind of "soft" conflict here that
-	 * forces any index-only scan that is in flight to perform heap fetches,
-	 * rather than killing the transaction outright.
-	 */
-	if (InHotStandby)
-		ResolveRecoveryConflictWithSnapshot(xlrec->snapshotConflictHorizon,
-											xlrec->flags & VISIBILITYMAP_XLOG_CATALOG_REL,
-											rlocator);
-
-	/*
-	 * Read the heap page, if it still exists. If the heap file has dropped or
-	 * truncated later in recovery, we don't need to update the page, but we'd
-	 * better still update the visibility map.
-	 */
-	action = XLogReadBufferForRedo(record, 1, &buffer);
-	if (action == BLK_NEEDS_REDO)
-	{
-		/*
-		 * We don't bump the LSN of the heap page when setting the visibility
-		 * map bit (unless checksums or wal_hint_bits is enabled, in which
-		 * case we must). This exposes us to torn page hazards, but since
-		 * we're not inspecting the existing page contents in any way, we
-		 * don't care.
-		 */
-		page = BufferGetPage(buffer);
-
-		PageSetAllVisible(page);
-
-		if (XLogHintBitIsNeeded())
-			PageSetLSN(page, lsn);
-
-		MarkBufferDirty(buffer);
-	}
-	else if (action == BLK_RESTORED)
-	{
-		/*
-		 * If heap block was backed up, we already restored it and there's
-		 * nothing more to do. (This can only happen with checksums or
-		 * wal_log_hints enabled.)
-		 */
-	}
-
-	if (BufferIsValid(buffer))
-	{
-		Size		space = PageGetFreeSpace(BufferGetPage(buffer));
-
-		UnlockReleaseBuffer(buffer);
-
-		/*
-		 * Since FSM is not WAL-logged and only updated heuristically, it
-		 * easily becomes stale in standbys.  If the standby is later promoted
-		 * and runs VACUUM, it will skip updating individual free space
-		 * figures for pages that became all-visible (or all-frozen, depending
-		 * on the vacuum mode,) which is troublesome when FreeSpaceMapVacuum
-		 * propagates too optimistic free space values to upper FSM layers;
-		 * later inserters try to use such pages only to find out that they
-		 * are unusable.  This can cause long stalls when there are many such
-		 * pages.
-		 *
-		 * Forestall those problems by updating FSM's idea about a page that
-		 * is becoming all-visible or all-frozen.
-		 *
-		 * Do this regardless of a full-page image being applied, since the
-		 * FSM data is not in the page anyway.
-		 */
-		if (xlrec->flags & VISIBILITYMAP_VALID_BITS)
-			XLogRecordPageWithFreeSpace(rlocator, blkno, space);
-	}
-
-	/*
-	 * Even if we skipped the heap page update due to the LSN interlock, it's
-	 * still safe to update the visibility map.  Any WAL record that clears
-	 * the visibility map bit does so before checking the page LSN, so any
-	 * bits that need to be cleared will still be cleared.
-	 */
-	if (XLogReadBufferForRedoExtended(record, 0, RBM_ZERO_ON_ERROR, false,
-									  &vmbuffer) == BLK_NEEDS_REDO)
-	{
-		Page		vmpage = BufferGetPage(vmbuffer);
-		Relation	reln;
-		uint8		vmbits;
-
-		/* initialize the page if it was read as zeros */
-		if (PageIsNew(vmpage))
-			PageInit(vmpage, BLCKSZ, 0);
-
-		/* remove VISIBILITYMAP_XLOG_* */
-		vmbits = xlrec->flags & VISIBILITYMAP_VALID_BITS;
-
-		/*
-		 * XLogReadBufferForRedoExtended locked the buffer. But
-		 * visibilitymap_set will handle locking itself.
-		 */
-		LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
-
-		reln = CreateFakeRelcacheEntry(rlocator);
-		visibilitymap_pin(reln, blkno, &vmbuffer);
-
-		visibilitymap_set(reln, blkno, InvalidBuffer, lsn, vmbuffer,
-						  xlrec->snapshotConflictHorizon, vmbits, false);
-
-		ReleaseBuffer(vmbuffer);
-		FreeFakeRelcacheEntry(reln);
-	}
-	else if (BufferIsValid(vmbuffer))
-		UnlockReleaseBuffer(vmbuffer);
-}
 
 /*
  * Given an "infobits" field from an XLog record, set the correct bits in the
@@ -799,10 +663,10 @@ heap_xlog_multi_insert(XLogReaderState *record)
 		Relation	reln = CreateFakeRelcacheEntry(rlocator);
 
 		visibilitymap_pin(reln, blkno, &vmbuffer);
-		old_vmbits = visibilitymap_set_vmbyte(reln, blkno,
-											  vmbuffer,
-											  VISIBILITYMAP_ALL_VISIBLE |
-											  VISIBILITYMAP_ALL_FROZEN);
+		old_vmbits = visibilitymap_set(reln, blkno,
+									   vmbuffer,
+									   VISIBILITYMAP_ALL_VISIBLE |
+									   VISIBILITYMAP_ALL_FROZEN);
 		Assert((old_vmbits & VISIBILITYMAP_VALID_BITS) == 0);
 		(void) old_vmbits; /* Silence compiler */
 		PageSetLSN(BufferGetPage(vmbuffer), lsn);
@@ -1384,9 +1248,6 @@ heap2_redo(XLogReaderState *record)
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
 			heap_xlog_prune_freeze(record);
 			break;
-		case XLOG_HEAP2_VISIBLE:
-			heap_xlog_visible(record);
-			break;
 		case XLOG_HEAP2_MULTI_INSERT:
 			heap_xlog_multi_insert(record);
 			break;
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 9f27ace0e1c..a24554fe191 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -219,103 +219,6 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf)
 	return BufferIsValid(vmbuf) && BufferGetBlockNumber(vmbuf) == mapBlock;
 }
 
-/*
- *	visibilitymap_set - set bit(s) on a previously pinned page
- *
- * recptr is the LSN of the XLOG record we're replaying, if we're in recovery,
- * or InvalidXLogRecPtr in normal running.  The VM page LSN is advanced to the
- * one provided; in normal running, we generate a new XLOG record and set the
- * page LSN to that value (though the heap page's LSN may *not* be updated;
- * see below).  cutoff_xid is the largest xmin on the page being marked
- * all-visible; it is needed for Hot Standby, and can be InvalidTransactionId
- * if the page contains no tuples.  It can also be set to InvalidTransactionId
- * when a page that is already all-visible is being marked all-frozen.
- *
- * Caller is expected to set the heap page's PD_ALL_VISIBLE bit before calling
- * this function. Except in recovery, caller should also pass the heap buffer.
- * When checksums are enabled and we're not in recovery, if the heap page was
- * modified, we must add the heap buffer to the WAL chain to protect it from
- * being torn.
- *
- * You must pass a buffer containing the correct map page to this function.
- * Call visibilitymap_pin first to pin the right one. This function doesn't do
- * any I/O.
- *
- * Returns the state of the page's VM bits before setting flags.
- */
-uint8
-visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
-				  XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
-				  uint8 flags, bool set_heap_lsn)
-{
-	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
-	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
-	uint8		mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
-	Page		page;
-	uint8	   *map;
-	uint8		status;
-
-#ifdef TRACE_VISIBILITYMAP
-	elog(DEBUG1, "vm_set %s %d", RelationGetRelationName(rel), heapBlk);
-#endif
-
-	Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
-	Assert(!(InRecovery && set_heap_lsn));
-	Assert((flags & VISIBILITYMAP_VALID_BITS) == flags);
-
-	/* Must never set all_frozen bit without also setting all_visible bit */
-	Assert(flags != VISIBILITYMAP_ALL_FROZEN);
-
-	/* Check that we have the right VM page pinned */
-	if (!BufferIsValid(vmBuf) || BufferGetBlockNumber(vmBuf) != mapBlock)
-		elog(ERROR, "wrong VM buffer passed to visibilitymap_set");
-
-	page = BufferGetPage(vmBuf);
-	map = (uint8 *) PageGetContents(page);
-	LockBuffer(vmBuf, BUFFER_LOCK_EXCLUSIVE);
-
-	status = (map[mapByte] >> mapOffset) & VISIBILITYMAP_VALID_BITS;
-	if (flags != status)
-	{
-		START_CRIT_SECTION();
-
-		map[mapByte] |= (flags << mapOffset);
-		MarkBufferDirty(vmBuf);
-
-		if (RelationNeedsWAL(rel))
-		{
-			if (XLogRecPtrIsInvalid(recptr))
-			{
-				Assert(!InRecovery);
-				recptr = log_heap_visible(rel, heapBuf, vmBuf, cutoff_xid, flags);
-
-				/*
-				 * If data checksums are enabled (or wal_log_hints=on), we
-				 * need to protect the heap page from being torn.
-				 *
-				 * If not, then we must *not* update the heap page's LSN. In
-				 * this case, the FPI for the heap page was omitted from the
-				 * WAL record inserted above, so it would be incorrect to
-				 * update the heap page's LSN.
-				 */
-				if (set_heap_lsn)
-				{
-					Page		heapPage = BufferGetPage(heapBuf);
-
-					Assert(XLogHintBitIsNeeded());
-
-					PageSetLSN(heapPage, recptr);
-				}
-			}
-			PageSetLSN(page, recptr);
-		}
-
-		END_CRIT_SECTION();
-	}
-
-	LockBuffer(vmBuf, BUFFER_LOCK_UNLOCK);
-	return status;
-}
 
 /*
  * Set flags in the VM block contained in the passed in vmBuf.
@@ -325,8 +228,8 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
  * making any changes needed to the associated heap page.
  */
 uint8
-visibilitymap_set_vmbyte(Relation rel, BlockNumber heapBlk,
-						 Buffer vmBuf, uint8 flags)
+visibilitymap_set(Relation rel, BlockNumber heapBlk,
+				  Buffer vmBuf, uint8 flags)
 {
 	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
 	uint32		mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
diff --git a/src/backend/access/rmgrdesc/heapdesc.c b/src/backend/access/rmgrdesc/heapdesc.c
index d6c86ccac20..f7880a4ed81 100644
--- a/src/backend/access/rmgrdesc/heapdesc.c
+++ b/src/backend/access/rmgrdesc/heapdesc.c
@@ -351,13 +351,6 @@ heap2_desc(StringInfo buf, XLogReaderState *record)
 			}
 		}
 	}
-	else if (info == XLOG_HEAP2_VISIBLE)
-	{
-		xl_heap_visible *xlrec = (xl_heap_visible *) rec;
-
-		appendStringInfo(buf, "snapshotConflictHorizon: %u, flags: 0x%02X",
-						 xlrec->snapshotConflictHorizon, xlrec->flags);
-	}
 	else if (info == XLOG_HEAP2_MULTI_INSERT)
 	{
 		xl_heap_multi_insert *xlrec = (xl_heap_multi_insert *) rec;
@@ -462,9 +455,6 @@ heap2_identify(uint8 info)
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
 			id = "PRUNE_VACUUM_CLEANUP";
 			break;
-		case XLOG_HEAP2_VISIBLE:
-			id = "VISIBLE";
-			break;
 		case XLOG_HEAP2_MULTI_INSERT:
 			id = "MULTI_INSERT";
 			break;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..2fdd4af90a8 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -454,7 +454,6 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 		case XLOG_HEAP2_PRUNE_ON_ACCESS:
 		case XLOG_HEAP2_PRUNE_VACUUM_SCAN:
 		case XLOG_HEAP2_PRUNE_VACUUM_CLEANUP:
-		case XLOG_HEAP2_VISIBLE:
 		case XLOG_HEAP2_LOCK_UPDATED:
 			break;
 		default:
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index e35b4adf38d..c404b794fda 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -365,9 +365,6 @@ extern bool heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple);
 
 extern uint8 heap_page_set_vm(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
 							  Buffer vmbuf, uint8 vmflags);
-extern uint8 heap_page_set_vm_and_log(Relation rel, BlockNumber heap_blk, Buffer heap_buf,
-									  Buffer vmbuf, TransactionId cutoff_xid,
-									  uint8 vmflags);
 
 extern void simple_heap_insert(Relation relation, HeapTuple tup);
 extern void simple_heap_delete(Relation relation, ItemPointer tid);
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index ceae9c083ff..9a61434b881 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -60,7 +60,6 @@
 #define XLOG_HEAP2_PRUNE_ON_ACCESS		0x10
 #define XLOG_HEAP2_PRUNE_VACUUM_SCAN	0x20
 #define XLOG_HEAP2_PRUNE_VACUUM_CLEANUP	0x30
-#define XLOG_HEAP2_VISIBLE		0x40
 #define XLOG_HEAP2_MULTI_INSERT 0x50
 #define XLOG_HEAP2_LOCK_UPDATED 0x60
 #define XLOG_HEAP2_NEW_CID		0x70
@@ -495,11 +494,6 @@ extern void heap2_desc(StringInfo buf, XLogReaderState *record);
 extern const char *heap2_identify(uint8 info);
 extern void heap_xlog_logical_rewrite(XLogReaderState *r);
 
-extern XLogRecPtr log_heap_visible(Relation rel, Buffer heap_buffer,
-								   Buffer vm_buffer,
-								   TransactionId snapshotConflictHorizon,
-								   uint8 vmflags);
-
 /* in heapdesc.c, so it can be shared between frontend/backend code */
 extern void heap_xlog_deserialize_prune_and_freeze(char *cursor, uint8 flags,
 												   int *nplans, xlhp_freeze_plan **plans,
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index 5d0a9417c25..20141e3e805 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -31,14 +31,8 @@ extern bool visibilitymap_clear(Relation rel, BlockNumber heapBlk,
 extern void visibilitymap_pin(Relation rel, BlockNumber heapBlk,
 							  Buffer *vmbuf);
 extern bool visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf);
-extern uint8 visibilitymap_set(Relation rel,
-							   BlockNumber heapBlk, Buffer heapBuf,
-							   XLogRecPtr recptr,
-							   Buffer vmBuf,
-							   TransactionId cutoff_xid,
-							   uint8 flags, bool set_heap_lsn);
-extern uint8 visibilitymap_set_vmbyte(Relation rel, BlockNumber heapBlk,
-									  Buffer vmBuf, uint8 flags);
+extern uint8 visibilitymap_set(Relation rel, BlockNumber heapBlk,
+							   Buffer vmBuf, uint8 flags);
 extern uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
 extern void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen);
 extern BlockNumber visibilitymap_prepare_truncate(Relation rel,
-- 
2.34.1



  [text/x-patch] v1-0011-Find-and-fix-VM-corruption-in-heap_page_prune_and.patch (11.7K, 14-v1-0011-Find-and-fix-VM-corruption-in-heap_page_prune_and.patch)
  download | inline diff:
From 5266974bf5a16b05b8c8bac33f4630ed1f1552e1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 28 May 2025 16:45:59 -0400
Subject: [PATCH v1 11/14] Find and fix VM corruption in
 heap_page_prune_and_freeze

Future commits will update the VM in the same critical section and WAL
record as pruning and freezing. For ease of review, this commit makes
one step toward doing this. It moves the VM corruption handling case to
heap_page_prune_and_freeze().
---
 src/backend/access/heap/pruneheap.c  | 87 +++++++++++++++++++++++++++-
 src/backend/access/heap/vacuumlazy.c | 78 +++----------------------
 src/include/access/heapam.h          |  4 ++
 3 files changed, 96 insertions(+), 73 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 97e51f78854..496b70e318f 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -21,7 +21,7 @@
 #include "access/transam.h"
 #include "access/xlog.h"
 #include "access/xloginsert.h"
-#include "access/visibilitymapdefs.h"
+#include "access/visibilitymap.h"
 #include "commands/vacuum.h"
 #include "executor/instrument.h"
 #include "miscadmin.h"
@@ -177,6 +177,13 @@ static void heap_prune_record_unchanged_lp_redirect(PruneState *prstate, OffsetN
 
 static void page_verify_redirects(Page page);
 
+static bool identify_and_fix_vm_corruption(Relation relation,
+										   BlockNumber heap_blk,
+										   Buffer heap_buffer, Page heap_page,
+										   bool heap_blk_known_av,
+										   int64 nlpdead_items,
+										   Buffer vmbuffer);
+
 
 /*
  * Optionally prune and repair fragmentation in the specified page.
@@ -261,7 +268,9 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 			 * not the relation has indexes, since we cannot safely determine
 			 * that during on-access pruning with the current implementation.
 			 */
-			heap_page_prune_and_freeze(relation, buffer, vistest, 0,
+			heap_page_prune_and_freeze(relation, buffer, false,
+									   InvalidBuffer,
+									   vistest, 0,
 									   NULL, &presult, PRUNE_ON_ACCESS, &dummy_off_loc, NULL, NULL);
 
 			/*
@@ -294,6 +303,64 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 	}
 }
 
+/*
+ * When updating the visibility map after phase I heap vacuuming, we take the
+ * opportunity to identify and fix any VM corruption.
+ *
+ * heap_blk_known_av is the visibility status of the heap page collected
+ * while finding the next unskippable block in heap_vac_scan_next_block().
+ */
+static bool
+identify_and_fix_vm_corruption(Relation relation,
+							   BlockNumber heap_blk,
+							   Buffer heap_buffer, Page heap_page,
+							   bool heap_blk_known_av,
+							   int64 nlpdead_items,
+							   Buffer vmbuffer)
+{
+	/*
+	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
+	 * page-level bit is clear.  However, it's possible that the bit got
+	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
+	 * with buffer lock before concluding that the VM is corrupt.
+	 */
+	if (heap_blk_known_av && !PageIsAllVisible(heap_page) &&
+		visibilitymap_get_status(relation, heap_blk, &vmbuffer) != 0)
+	{
+		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
+			 RelationGetRelationName(relation), heap_blk);
+		visibilitymap_clear(relation, heap_blk, vmbuffer,
+							VISIBILITYMAP_VALID_BITS);
+		return true;
+	}
+
+	/*
+	 * It's possible for the value returned by
+	 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
+	 * wrong for us to see tuples that appear to not be visible to everyone
+	 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
+	 * never moves backwards, but GetOldestNonRemovableTransactionId() is
+	 * conservative and sometimes returns a value that's unnecessarily small,
+	 * so if we see that contradiction it just means that the tuples that we
+	 * think are not visible to everyone yet actually are, and the
+	 * PD_ALL_VISIBLE flag is correct.
+	 *
+	 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
+	 * however.
+	 */
+	if (nlpdead_items > 0 && PageIsAllVisible(heap_page))
+	{
+		elog(WARNING, "page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
+			 RelationGetRelationName(relation), heap_blk);
+		PageClearAllVisible(heap_page);
+		MarkBufferDirty(heap_buffer);
+		visibilitymap_clear(relation, heap_blk, vmbuffer,
+							VISIBILITYMAP_VALID_BITS);
+		return true;
+	}
+
+	return false;
+}
 
 /*
  * Prune and repair fragmentation and potentially freeze tuples on the
@@ -314,6 +381,10 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  * HEAP_PRUNE_FREEZE option is not set, because at the moment only callers
  * that also freeze need that information.
  *
+ * blk_known_av is the visibility status of the heap block as of the last call
+ * to find_next_unskippable_block(). vmbuffer is the buffer that may already
+ * contain the required block of the visibility map.
+ *
  * vistest is used to distinguish whether tuples are DEAD or RECENTLY_DEAD
  * (see heap_prune_satisfies_vacuum).
  *
@@ -349,6 +420,8 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  */
 void
 heap_page_prune_and_freeze(Relation relation, Buffer buffer,
+						   bool blk_known_av,
+						   Buffer vmbuffer,
 						   GlobalVisState *vistest,
 						   int options,
 						   struct VacuumCutoffs *cutoffs,
@@ -897,6 +970,16 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	presult->lpdead_items = prstate.lpdead_items;
 	/* the presult->deadoffsets array was already filled in */
 
+	/*
+	 * Clear any VM corruption. This does not need to be done in a critical
+	 * section.
+	 */
+	presult->vm_corruption = false;
+	if (options & HEAP_PAGE_PRUNE_UPDATE_VM)
+		presult->vm_corruption = identify_and_fix_vm_corruption(relation,
+																blockno, buffer, page,
+																blk_known_av,
+																prstate.lpdead_items, vmbuffer);
 	if (prstate.freeze)
 	{
 		if (presult->nfrozen > 0)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index fdac36f0835..592cd455cf4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -431,13 +431,6 @@ static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
-
-static bool identify_and_fix_vm_corruption(Relation relation,
-										   BlockNumber heap_blk,
-										   Buffer heap_buffer, Page heap_page,
-										   bool heap_blk_known_av,
-										   int64 nlpdead_items,
-										   Buffer vmbuffer);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
 							Buffer vmbuffer, bool all_visible_according_to_vm,
@@ -1947,65 +1940,6 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 	return false;
 }
 
-/*
- * When updating the visibility map after phase I heap vacuuming, we take the
- * opportunity to identify and fix any VM corruption.
- *
- * heap_blk_known_av is the visibility status of the heap page collected
- * while finding the next unskippable block in heap_vac_scan_next_block().
- */
-static bool
-identify_and_fix_vm_corruption(Relation relation,
-							   BlockNumber heap_blk,
-							   Buffer heap_buffer, Page heap_page,
-							   bool heap_blk_known_av,
-							   int64 nlpdead_items,
-							   Buffer vmbuffer)
-{
-	/*
-	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
-	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
-	 * with buffer lock before concluding that the VM is corrupt.
-	 */
-	if (heap_blk_known_av && !PageIsAllVisible(heap_page) &&
-		visibilitymap_get_status(relation, heap_blk, &vmbuffer) != 0)
-	{
-		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
-			 RelationGetRelationName(relation), heap_blk);
-		visibilitymap_clear(relation, heap_blk, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-		return true;
-	}
-
-	/*
-	 * It's possible for the value returned by
-	 * GetOldestNonRemovableTransactionId() to move backwards, so it's not
-	 * wrong for us to see tuples that appear to not be visible to everyone
-	 * yet, while PD_ALL_VISIBLE is already set. The real safe xmin value
-	 * never moves backwards, but GetOldestNonRemovableTransactionId() is
-	 * conservative and sometimes returns a value that's unnecessarily small,
-	 * so if we see that contradiction it just means that the tuples that we
-	 * think are not visible to everyone yet actually are, and the
-	 * PD_ALL_VISIBLE flag is correct.
-	 *
-	 * There should never be LP_DEAD items on a page with PD_ALL_VISIBLE set,
-	 * however.
-	 */
-	if (nlpdead_items > 0 && PageIsAllVisible(heap_page))
-	{
-		elog(WARNING, "page containing LP_DEAD items is marked as all-visible in relation \"%s\" page %u",
-			 RelationGetRelationName(relation), heap_blk);
-		PageClearAllVisible(heap_page);
-		MarkBufferDirty(heap_buffer);
-		visibilitymap_clear(relation, heap_blk, vmbuffer,
-							VISIBILITYMAP_VALID_BITS);
-		return true;
-	}
-
-	return false;
-}
-
 
 /* qsort comparator for sorting OffsetNumbers */
 static int
@@ -2062,11 +1996,14 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * tuples. Pruning will have determined whether or not the page is
 	 * all-visible.
 	 */
-	prune_options = HEAP_PAGE_PRUNE_FREEZE;
+	prune_options = HEAP_PAGE_PRUNE_FREEZE | HEAP_PAGE_PRUNE_UPDATE_VM;
 	if (vacrel->nindexes == 0)
 		prune_options |= HEAP_PAGE_PRUNE_MARK_UNUSED_NOW;
 
-	heap_page_prune_and_freeze(rel, buf, vacrel->vistest, prune_options,
+	heap_page_prune_and_freeze(rel, buf,
+							   all_visible_according_to_vm,
+							   vmbuffer,
+							   vacrel->vistest, prune_options,
 							   &vacrel->cutoffs, &presult, PRUNE_VACUUM_SCAN,
 							   &vacrel->offnum,
 							   &vacrel->NewRelfrozenXid, &vacrel->NewRelminMxid);
@@ -2151,10 +2088,9 @@ lazy_scan_prune(LVRelState *vacrel,
 	/*
 	 * Handle setting visibility map bit based on information from the VM (as
 	 * of last heap_vac_scan_next_block() call), and from all_visible and
-	 * all_frozen variables. Start by looking for any VM corruption.
+	 * all_frozen variables.
 	 */
-	if (identify_and_fix_vm_corruption(vacrel->rel, blkno, buf, page,
-									   all_visible_according_to_vm, presult.lpdead_items, vmbuffer))
+	if (presult.vm_corruption)
 	{
 		/* Don't update the VM if we just cleared corruption in it */
 	}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 1fa6eb047fd..0886867a161 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -41,6 +41,7 @@
 /* "options" flag bits for heap_page_prune_and_freeze */
 #define HEAP_PAGE_PRUNE_MARK_UNUSED_NOW		(1 << 0)
 #define HEAP_PAGE_PRUNE_FREEZE				(1 << 1)
+#define HEAP_PAGE_PRUNE_UPDATE_VM			(1 << 2)
 
 typedef struct BulkInsertStateData *BulkInsertState;
 struct TupleTableSlot;
@@ -246,6 +247,7 @@ typedef struct PruneFreezeResult
 	bool		all_visible;
 	bool		all_frozen;
 	TransactionId vm_conflict_horizon;
+	bool		vm_corruption;
 
 	/*
 	 * Whether or not the page makes rel truncation unsafe.  This is set to
@@ -385,6 +387,8 @@ extern TransactionId heap_index_delete_tuples(Relation rel,
 struct GlobalVisState;
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern void heap_page_prune_and_freeze(Relation relation, Buffer buffer,
+									   bool blk_known_av,
+									   Buffer vmbuffer,
 									   struct GlobalVisState *vistest,
 									   int options,
 									   struct VacuumCutoffs *cutoffs,
-- 
2.34.1



  [text/x-patch] v1-0013-Eliminate-xl_heap_visible-from-vacuum-phase-I-pru.patch (24.7K, 15-v1-0013-Eliminate-xl_heap_visible-from-vacuum-phase-I-pru.patch)
  download | inline diff:
From f106462fefde3c18ae5767c879f2cc6026748938 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 18 Jun 2025 12:41:00 -0400
Subject: [PATCH v1 13/14] Eliminate xl_heap_visible from vacuum phase I
 prune/freeze

Instead of emitting a separate WAL record for every block rendered
all-visible/frozen by vacuum's phase I, include the changes to the VM in
the xl_heap_prune record already emitted.

This is only enabled for vacuum's prune/freeze work, not for on-access
pruning.
---
 src/backend/access/heap/pruneheap.c  | 384 +++++++++++++++------------
 src/backend/access/heap/vacuumlazy.c |  30 ---
 src/include/access/heapam.h          |  15 +-
 3 files changed, 223 insertions(+), 206 deletions(-)

diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 425dcc77534..2d9624a246e 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -44,6 +44,13 @@ typedef struct
 	bool		mark_unused_now;
 	/* whether to attempt freezing tuples */
 	bool		freeze;
+
+	/*
+	 * Whether or not to consider updating the VM. There is some bookkeeping
+	 * that must be maintained if we would like to update the VM.
+	 */
+	bool		update_vm;
+
 	struct VacuumCutoffs *cutoffs;
 
 	/*-------------------------------------------------------
@@ -108,8 +115,9 @@ typedef struct
 	 *
 	 * These fields are not used by pruning itself for the most part, but are
 	 * used to collect information about what was pruned and what state the
-	 * page is in after pruning, for the benefit of the caller.  They are
-	 * copied to the caller's PruneFreezeResult at the end.
+	 * page is in after pruning to use when updating the visibility map and
+	 * for the benefit of the caller.  They are copied to the caller's
+	 * PruneFreezeResult at the end.
 	 * -------------------------------------------------------
 	 */
 
@@ -138,11 +146,10 @@ typedef struct
 	 * bits.  It is only valid if we froze some tuples, and all_frozen is
 	 * true.
 	 *
-	 * NOTE: all_visible and all_frozen don't include LP_DEAD items.  That's
-	 * convenient for heap_page_prune_and_freeze(), to use them to decide
-	 * whether to freeze the page or not.  The all_visible and all_frozen
-	 * values returned to the caller are adjusted to include LP_DEAD items at
-	 * the end.
+	 * NOTE: all_visible and all_frozen don't include LP_DEAD items until
+	 * directly before updating the VM. We ignore LP_DEAD items when deciding
+	 * whether or not to opportunistically freeze and when determining the
+	 * snapshot conflict horizon required when freezing tuples.
 	 *
 	 * all_frozen should only be considered valid if all_visible is also set;
 	 * we don't bother to clear the all_frozen flag every time we clear the
@@ -377,11 +384,15 @@ identify_and_fix_vm_corruption(Relation relation,
  * considered advantageous for overall system performance to do so now.  The
  * 'cutoffs', 'presult', 'new_relfrozen_xid' and 'new_relmin_mxid' arguments
  * are required when freezing.  When HEAP_PRUNE_FREEZE option is set, we also
- * set presult->all_visible and presult->all_frozen on exit, to indicate if
- * the VM bits can be set.  They are always set to false when the
- * HEAP_PRUNE_FREEZE option is not set, because at the moment only callers
+ * set presult->all_visible and presult->all_frozen on exit, for use when
+ * validating the changes made to the VM. They are always set to false when
+ * the HEAP_PRUNE_FREEZE option is not set, because at the moment only callers
  * that also freeze need that information.
  *
+ * If HEAP_PAGE_PRUNE_UPDATE_VM is set and the visibility status of the page
+ * has changed, we will update the VM at the same time as pruning and freezing
+ * the heap page.
+ *
  * blk_known_av is the visibility status of the heap block as of the last call
  * to find_next_unskippable_block(). vmbuffer is the buffer that may already
  * contain the required block of the visibility map.
@@ -396,6 +407,8 @@ identify_and_fix_vm_corruption(Relation relation,
  *   FREEZE indicates that we will also freeze tuples, and will return
  *   'all_visible', 'all_frozen' flags to the caller.
  *
+ *   UPDATE_VM indicates that we will set the page's status in the VM.
+ *
  * cutoffs contains the freeze cutoffs, established by VACUUM at the beginning
  * of vacuuming the relation.  Required if HEAP_PRUNE_FREEZE option is set.
  * cutoffs->OldestXmin is also used to determine if dead tuples are
@@ -441,15 +454,19 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	bool		do_freeze;
 	bool		do_prune;
 	bool		do_hint;
+	bool		do_set_vm;
 	uint8		vmflags = 0;
 	uint8		old_vmbits = 0;
 	bool		hint_bit_fpi;
 	int64		fpi_before = pgWalUsage.wal_fpi;
+	bool		all_frozen_except_lp_dead = false;
+	bool		set_pd_all_visible = false;
 
 	/* Copy parameters to prstate */
 	prstate.vistest = vistest;
 	prstate.mark_unused_now = (options & HEAP_PAGE_PRUNE_MARK_UNUSED_NOW) != 0;
 	prstate.freeze = (options & HEAP_PAGE_PRUNE_FREEZE) != 0;
+	prstate.update_vm = (options & HEAP_PAGE_PRUNE_UPDATE_VM) != 0;
 	prstate.cutoffs = cutoffs;
 
 	/*
@@ -496,29 +513,27 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	prstate.deadoffsets = presult->deadoffsets;
 
 	/*
-	 * Caller may update the VM after we're done.  We can keep track of
-	 * whether the page will be all-visible and all-frozen after pruning and
-	 * freezing to help the caller to do that.
+	 * Keep track of whether or not the page will be all-visible and
+	 * all-frozen for use in opportunistic freezing and to update the VM if
+	 * the caller requests it.
+	 *
+	 * Currently, only VACUUM attempts freezing and setting the VM bits. But
+	 * other callers could do either one. The visibility bookkeeping is
+	 * required for opportunistic freezing (in addition to setting the VM
+	 * bits) because we only consider opportunistically freezing tuples if the
+	 * whole page would become all-frozen or if the whole page will be frozen
+	 * except for dead tuples that will be removed by vacuum.
 	 *
-	 * Currently, only VACUUM sets the VM bits.  To save the effort, only do
-	 * the bookkeeping if the caller needs it.  Currently, that's tied to
-	 * HEAP_PAGE_PRUNE_FREEZE, but it could be a separate flag if you wanted
-	 * to update the VM bits without also freezing or freeze without also
-	 * setting the VM bits.
+	 * Dead tuples which will be removed by the end of vacuuming should not
+	 * preclude us from opportunistically freezing, so we do not clear
+	 * all_visible when we see LP_DEAD items. We fix that after determining
+	 * whether or not to freeze but before deciding whether or not to update
+	 * the VM so that we don't set the VM bit incorrectly.
 	 *
-	 * In addition to telling the caller whether it can set the VM bit, we
-	 * also use 'all_visible' and 'all_frozen' for our own decision-making. If
-	 * the whole page would become frozen, we consider opportunistically
-	 * freezing tuples.  We will not be able to freeze the whole page if there
-	 * are tuples present that are not visible to everyone or if there are
-	 * dead tuples which are not yet removable.  However, dead tuples which
-	 * will be removed by the end of vacuuming should not preclude us from
-	 * opportunistically freezing.  Because of that, we do not clear
-	 * all_visible when we see LP_DEAD items.  We fix that at the end of the
-	 * function, when we return the value to the caller, so that the caller
-	 * doesn't set the VM bit incorrectly.
+	 * If not freezing or updating the VM, we otherwise avoid the extra
+	 * bookkeeping.
 	 */
-	if (prstate.freeze)
+	if (prstate.freeze || prstate.update_vm)
 	{
 		prstate.all_visible = true;
 		prstate.all_frozen = true;
@@ -534,12 +549,15 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	}
 
 	/*
-	 * The visibility cutoff xid is the newest xmin of live tuples on the
-	 * page.  In the common case, this will be set as the conflict horizon the
-	 * caller can use for updating the VM.  If, at the end of freezing and
-	 * pruning, the page is all-frozen, there is no possibility that any
-	 * running transaction on the standby does not see tuples on the page as
-	 * all-visible, so the conflict horizon remains InvalidTransactionId.
+	 * The visibility cutoff xid is the newest xmin of live, committed tuples
+	 * older than OldestXmin on the page. This field is only kept up-to-date
+	 * if the page is all-visible. As soon as a tuple is encountered that is
+	 * not visible to all, this field is unmaintained. As long as it is
+	 * maintained, it can be used to calculate the snapshot conflict horizon.
+	 * This is most likely to happen when updating the VM and/or freezing all
+	 * live tuples on the page. It is updated before returning to the caller
+	 * because vacuum does assert-build only validation on the page using this
+	 * field.
 	 */
 	prstate.visibility_cutoff_xid = InvalidTransactionId;
 
@@ -827,6 +845,68 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		 */
 	}
 
+	/*
+	 * It was convenient to ignore LP_DEAD items in all_visible earlier on to
+	 * make the choice of whether or not to freeze the page unaffected by the
+	 * short-term presence of LP_DEAD items.  These LP_DEAD items were
+	 * effectively assumed to be LP_UNUSED items in the making.  It doesn't
+	 * matter which vacuum heap pass (initial pass or final pass) ends up
+	 * setting the page all-frozen, as long as the ongoing VACUUM does it.
+	 *
+	 * Now that freezing has been finalized, unset all_visible if there are
+	 * any LP_DEAD items on the page. It needs to reflect the present state of
+	 * the page when using it to determine whether or not to update the VM.
+	 *
+	 * Keep track of whether or not the page was all-frozen except LP_DEAD
+	 * items for the purposes of calculating the snapshot conflict horizon,
+	 * though.
+	 */
+	all_frozen_except_lp_dead = prstate.all_frozen;
+	if (prstate.lpdead_items > 0)
+	{
+		prstate.all_visible = false;
+		prstate.all_frozen = false;
+	}
+
+	/*
+	 * Handle setting visibility map bit based on information from the VM (as
+	 * of last heap_vac_scan_next_block() call), and from all_visible and
+	 * all_frozen variables.
+	 */
+	if (prstate.update_vm)
+	{
+		/*
+		 * Clear any VM corruption. This does not need to be in a critical
+		 * section, so we do it first. If PD_ALL_VISIBLE is incorrectly set,
+		 * we may mark the heap page buffer dirty here and could end up doing
+		 * so again later. This is not a correctness issue and is in the path
+		 * of VM corruption, so we don't have to worry about the extra
+		 * performance overhead.
+		 */
+		if (identify_and_fix_vm_corruption(relation,
+										   blockno, buffer, page,
+										   blk_known_av, prstate.lpdead_items, vmbuffer))
+		{
+			/* If we fix corruption, don't update the VM further */
+		}
+
+		/* Determine if we actually need to set the VM and which bits to set. */
+		else if (prstate.all_visible &&
+				 (!blk_known_av ||
+				  (prstate.all_frozen && !VM_ALL_FROZEN(relation, blockno, &vmbuffer))))
+		{
+			vmflags |= VISIBILITYMAP_ALL_VISIBLE;
+			if (prstate.all_frozen)
+				vmflags |= VISIBILITYMAP_ALL_FROZEN;
+		}
+	}
+
+	do_set_vm = vmflags & VISIBILITYMAP_VALID_BITS;
+	set_pd_all_visible = do_set_vm && !PageIsAllVisible(page);
+
+	/* Save these for the caller in case we later zero out vmflags */
+	presult->new_vmbits = vmflags;
+
 	/* Any error while applying the changes is critical */
 	START_CRIT_SECTION();
 
@@ -848,13 +928,13 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		/*
 		 * If that's all we had to do to the page, this is a non-WAL-logged
 		 * hint.  If we are going to freeze or prune the page, we will mark
-		 * the buffer dirty below.
+		 * the buffer dirty and emit WAL below.
 		 */
-		if (!do_freeze && !do_prune)
+		if (!do_prune && !do_freeze && !do_set_vm)
 			MarkBufferDirtyHint(buffer, true);
 	}
 
-	if (do_prune || do_freeze)
+	if (do_prune || do_freeze || do_set_vm)
 	{
 		/* Apply the planned item changes and repair page fragmentation. */
 		if (do_prune)
@@ -868,7 +948,23 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 		if (do_freeze)
 			heap_freeze_prepared_tuples(buffer, prstate.frozen, prstate.nfrozen);
 
-		MarkBufferDirty(buffer);
+		if (do_prune || do_freeze)
+			MarkBufferDirty(buffer);
+
+		if (do_set_vm)
+		{
+			LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);
+			old_vmbits = heap_page_set_vm(relation, blockno, buffer,
+										  vmbuffer, vmflags);
+
+			if (old_vmbits == vmflags)
+			{
+				LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
+				do_set_vm = false;
+				/* 0 out vmflags so we don't emit VM update WAL */
+				vmflags = 0;
+			}
+		}
 
 		/*
 		 * Emit a WAL XLOG_HEAP2_PRUNE_FREEZE record showing what we did
@@ -885,35 +981,57 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 			 * on the standby with xids older than the youngest tuple this
 			 * record will freeze will conflict.
 			 */
-			TransactionId frz_conflict_horizon = InvalidTransactionId;
-			TransactionId conflict_xid;
+			TransactionId conflict_xid = InvalidTransactionId;
+
+			/*
+			 * If we are updating the VM, the conflict horizon is almost
+			 * always the visibility cutoff XID.
+			 *
+			 * Separately, if we are freezing any tuples, as an optimization,
+			 * we can use the visibility_cutoff_xid as the conflict horizon if
+			 * the page will be all-frozen. This is true even if there are
+			 * LP_DEAD line pointers because we ignored those when maintaining
+			 * the visibility_cutoff_xid.
+			 */
+			if (do_set_vm || (do_freeze && all_frozen_except_lp_dead))
+				conflict_xid = prstate.visibility_cutoff_xid;
 
 			/*
-			 * We can use the visibility_cutoff_xid as our cutoff for
-			 * conflicts when the whole page is eligible to become all-frozen
-			 * in the VM once we're done with it.  Otherwise we generate a
-			 * conservative cutoff by stepping back from OldestXmin.
+			 * Otherwise, if we are freezing but the page would not be
+			 * all-frozen, we have to use the more pessimistic horizon of
+			 * OldestXmin, which may be newer than the newest tuple we froze.
+			 * That's because we won't have maintained the
+			 * visibility_cutoff_xid.
 			 */
-			if (do_freeze)
+			else if (do_freeze)
 			{
-				if (prstate.all_visible && prstate.all_frozen)
-					frz_conflict_horizon = prstate.visibility_cutoff_xid;
-				else
-				{
-					/* Avoids false conflicts when hot_standby_feedback in use */
-					frz_conflict_horizon = prstate.cutoffs->OldestXmin;
-					TransactionIdRetreat(frz_conflict_horizon);
-				}
+				conflict_xid = prstate.cutoffs->OldestXmin;
+				TransactionIdRetreat(conflict_xid);
 			}
 
-			if (TransactionIdFollows(frz_conflict_horizon, prstate.latest_xid_removed))
-				conflict_xid = frz_conflict_horizon;
-			else
+			/*
+			 * If we are removing tuples with a younger xmax than our so far
+			 * calculated conflict_xid, we must use this as our horizon.
+			 */
+			if (TransactionIdFollows(prstate.latest_xid_removed, conflict_xid))
 				conflict_xid = prstate.latest_xid_removed;
 
+			/*
+			 * We can omit the snapshot conflict horizon if we are not pruning
+			 * or freezing any tuples and are setting an already all-visible
+			 * page all-frozen in the VM. In this case, all of the tuples on
+			 * the page must already be visible to all MVCC snapshots on the
+			 * standby.
+			 */
+			if (!do_prune && !do_freeze && do_set_vm &&
+				blk_known_av && (vmflags & VISIBILITYMAP_ALL_FROZEN))
+				conflict_xid = InvalidTransactionId;
+
 			log_heap_prune_and_freeze(relation, buffer,
 									  false,
-									  InvalidBuffer, 0, false,
+									  vmbuffer,
+									  vmflags,
+									  set_pd_all_visible,
 									  conflict_xid,
 									  true, reason,
 									  prstate.frozen, prstate.nfrozen,
@@ -925,124 +1043,55 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 
 	END_CRIT_SECTION();
 
-	/* Copy information back for caller */
-	presult->ndeleted = prstate.ndeleted;
-	presult->nnewlpdead = prstate.ndead;
-	presult->nfrozen = prstate.nfrozen;
-	presult->live_tuples = prstate.live_tuples;
-	presult->recently_dead_tuples = prstate.recently_dead_tuples;
-
-	/*
-	 * It was convenient to ignore LP_DEAD items in all_visible earlier on to
-	 * make the choice of whether or not to freeze the page unaffected by the
-	 * short-term presence of LP_DEAD items.  These LP_DEAD items were
-	 * effectively assumed to be LP_UNUSED items in the making.  It doesn't
-	 * matter which vacuum heap pass (initial pass or final pass) ends up
-	 * setting the page all-frozen, as long as the ongoing VACUUM does it.
-	 *
-	 * Now that freezing has been finalized, unset all_visible if there are
-	 * any LP_DEAD items on the page.  It needs to reflect the present state
-	 * of the page, as expected for updating the visibility map.
-	 */
-	if (prstate.all_visible && prstate.lpdead_items == 0)
-	{
-		presult->all_visible = prstate.all_visible;
-		presult->all_frozen = prstate.all_frozen;
-	}
-	else
-	{
-		presult->all_visible = false;
-		presult->all_frozen = false;
-	}
+	if (do_set_vm)
+		LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
 
-	presult->hastup = prstate.hastup;
-
-	/*
-	 * If updating the visibility map, the conflict horizon for that record
-	 * must be the newest xmin on the page.  However, if the page is
-	 * completely frozen, there can be no conflict and the vm_conflict_horizon
-	 * should remain InvalidTransactionId.  This includes the case that we
-	 * just froze all the tuples; the prune-freeze record included the
-	 * conflict XID already so the VM update record doesn't need it.
-	 */
-	if (presult->all_frozen)
-		presult->vm_conflict_horizon = InvalidTransactionId;
-	else
-		presult->vm_conflict_horizon = prstate.visibility_cutoff_xid;
+	Assert(!prstate.all_visible || (prstate.lpdead_items == 0));
 
 	/*
-	 * Handle setting visibility map bit based on information from the VM (as
-	 * of last heap_vac_scan_next_block() call), and from all_visible and
-	 * all_frozen variables.
+	 * VACUUM will call heap_page_is_all_visible() during the second pass over
+	 * the heap to determine all_visible and all_frozen for the page -- this
+	 * is a specialized version of the logic from this function.  Now that
+	 * we've finished pruning and freezing, make sure that we're in total
+	 * agreement with heap_page_is_all_visible() using an assertion. We will
+	 * have already set the page in the VM, so this assertion will only let
+	 * you know that you've already done something wrong.
 	 */
-	if (options & HEAP_PAGE_PRUNE_UPDATE_VM)
+#ifdef USE_ASSERT_CHECKING
+	if (prstate.all_visible)
 	{
-		if (identify_and_fix_vm_corruption(relation,
-										   blockno, buffer, page,
-										   blk_known_av,
-										   prstate.lpdead_items, vmbuffer))
-		{
-			/* If we fix corruption, don't update the VM further */
-		}
+		TransactionId debug_cutoff;
+		bool		debug_all_frozen;
 
-		/*
-		 * If the page isn't yet marked all-visible in the VM or it is and
-		 * needs to me marked all-frozen, update the VM Note that all_frozen
-		 * is only valid if all_visible is true, so we must check both
-		 * all_visible and all_frozen.
-		 */
-		else if (presult->all_visible &&
-				 (!blk_known_av ||
-				  (presult->all_frozen && !VM_ALL_FROZEN(relation, blockno, &vmbuffer))))
-		{
-			Assert(prstate.lpdead_items == 0);
-			vmflags = VISIBILITYMAP_ALL_VISIBLE;
+		Assert(cutoffs);
 
-			/*
-			 * If the page is all-frozen, we can pass InvalidTransactionId as
-			 * our cutoff_xid, since a snapshotConflictHorizon sufficient to
-			 * make everything safe for REDO was logged when the page's tuples
-			 * were frozen.
-			 */
-			if (presult->all_frozen)
-			{
-				Assert(!TransactionIdIsValid(presult->vm_conflict_horizon));
-				vmflags |= VISIBILITYMAP_ALL_FROZEN;
-			}
+		Assert(prstate.lpdead_items == 0);
 
-			/*
-			 * It's possible for the VM bit to be clear and the page-level bit
-			 * to be set if checksums are not enabled.
-			 *
-			 * And even if we are just planning to update the frozen bit in
-			 * the VM, we shouldn't rely on all_visible_according_to_vm as a
-			 * proxy for the page-level PD_ALL_VISIBLE bit being set, since it
-			 * might have become stale.
-			 *
-			 * If the heap page is all-visible but the VM bit is not set, we
-			 * don't need to dirty the heap page.  However, if checksums are
-			 * enabled, we do need to make sure that the heap page is dirtied
-			 * before passing it to visibilitymap_set(), because it may be
-			 * logged.
-			 */
-			if (!PageIsAllVisible(page) || XLogHintBitIsNeeded())
-			{
-				PageSetAllVisible(page);
-				MarkBufferDirty(buffer);
-			}
+		if (!heap_page_is_all_visible(relation, buffer,
+									  cutoffs->OldestXmin,
+									  &debug_all_frozen,
+									  &debug_cutoff, off_loc))
+			Assert(false);
 
-			old_vmbits = heap_page_set_vm_and_log(relation, blockno, buffer,
-												  vmbuffer, presult->vm_conflict_horizon,
-												  vmflags);
-		}
+		Assert(prstate.all_frozen == debug_all_frozen);
+
+		Assert(!TransactionIdIsValid(debug_cutoff) ||
+			   debug_cutoff == prstate.visibility_cutoff_xid);
 	}
+#endif
 
+	/* Copy information back for caller */
+	presult->ndeleted = prstate.ndeleted;
+	presult->nnewlpdead = prstate.ndead;
+	presult->nfrozen = prstate.nfrozen;
+	presult->live_tuples = prstate.live_tuples;
+	presult->recently_dead_tuples = prstate.recently_dead_tuples;
+	presult->old_vmbits = old_vmbits;
+	/* new_vmbits was set above */
+	presult->hastup = prstate.hastup;
 	presult->lpdead_items = prstate.lpdead_items;
 	/* the presult->deadoffsets array was already filled in */
 
-	presult->old_vmbits = old_vmbits;
-	presult->new_vmbits = vmflags;
-
 	if (prstate.freeze)
 	{
 		if (presult->nfrozen > 0)
@@ -1624,8 +1673,13 @@ heap_prune_record_unchanged_lp_normal(Page page, PruneState *prstate, OffsetNumb
 			break;
 	}
 
-	/* Consider freezing any normal tuples which will not be removed */
-	if (prstate->freeze)
+	/*
+	 * Consider freezing any normal tuples which will not be removed.
+	 * Regardless of whether or not we want to freeze the tuples, if we want
+	 * to update the VM, we have to call heap_prepare_freeze_tuple() on every
+	 * tuple to know whether or not the page will be totally frozen.
+	 */
+	if (prstate->freeze || prstate->update_vm)
 	{
 		bool		totally_frozen;
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5806207a674..7d74f8fc0f1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2019,34 +2019,6 @@ lazy_scan_prune(LVRelState *vacrel,
 		vacrel->new_frozen_tuple_pages++;
 	}
 
-	/*
-	 * VACUUM will call heap_page_is_all_visible() during the second pass over
-	 * the heap to determine all_visible and all_frozen for the page -- this
-	 * is a specialized version of the logic from this function.  Now that
-	 * we've finished pruning and freezing, make sure that we're in total
-	 * agreement with heap_page_is_all_visible() using an assertion.
-	 */
-#ifdef USE_ASSERT_CHECKING
-	/* Note that all_frozen value does not matter when !all_visible */
-	if (presult.all_visible)
-	{
-		TransactionId debug_cutoff;
-		bool		debug_all_frozen;
-
-		Assert(presult.lpdead_items == 0);
-
-		if (!heap_page_is_all_visible(vacrel->rel, buf,
-									  vacrel->cutoffs.OldestXmin, &debug_all_frozen,
-									  &debug_cutoff, &vacrel->offnum))
-			Assert(false);
-
-		Assert(presult.all_frozen == debug_all_frozen);
-
-		Assert(!TransactionIdIsValid(debug_cutoff) ||
-			   debug_cutoff == presult.vm_conflict_horizon);
-	}
-#endif
-
 	/*
 	 * Now save details of the LP_DEAD items from the page in vacrel
 	 */
@@ -2080,8 +2052,6 @@ lazy_scan_prune(LVRelState *vacrel,
 	/* Did we find LP_DEAD items? */
 	*has_lpdead_items = (presult.lpdead_items > 0);
 
-	Assert(!presult.all_visible || !(*has_lpdead_items));
-
 	/*
 	 * For the purposes of logging, count whether or not the page was newly
 	 * set all-visible and, potentially, all-frozen.
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 534a63aab31..e35b4adf38d 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -234,19 +234,12 @@ typedef struct PruneFreezeResult
 	int			recently_dead_tuples;
 
 	/*
-	 * all_visible and all_frozen indicate the status of the page as reflected
-	 * in the visibility map after pruning, freezing, and setting any pages
-	 * all-visible in the visibility map.
+	 * old_vmbits are the state of the all-visible and all-frozen bits in the
+	 * visibility map before updating it during phase I of vacuuming.
+	 * new_vmbits are the state of those bits after phase I of vacuuming.
 	 *
-	 * vm_conflict_horizon is the newest xmin of live tuples on the page
-	 * (older than OldestXmin).  It will only be valid if we did not set the
-	 * page all-frozen in the VM.
-	 *
-	 * These are only set if the HEAP_PRUNE_FREEZE option is set.
+	 * These are only set if the HEAP_PAGE_PRUNE_UPDATE_VM option is set.
 	 */
-	bool		all_visible;
-	bool		all_frozen;
-	TransactionId vm_conflict_horizon;
 	uint8		old_vmbits;
 	uint8		new_vmbits;
 
-- 
2.34.1



view thread (143+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
  In-Reply-To: <CAAKRu_ZMw6Npd_qm2KM+FwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox