public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alena Rybakina <[email protected]>
To: Andrei Lepikhov <[email protected]>
To: pgsql-hackers <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: Amit Kapila <[email protected]>
Cc: Jim Nasby <[email protected]>
Cc: Bertrand Drouvot <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Andrei Zubkov <[email protected]>
Cc: Masahiko Sawada <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: jian he <[email protected]>
Cc: Sami Imseih <[email protected]>
Cc: vignesh C <[email protected]>
Cc: Ilia Evdokimov <[email protected]>
Subject: Re: Vacuum statistics
Date: Fri, 13 Mar 2026 16:04:24 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<CAA4eK1JOUn+EqWSfRgKfgBZOXT7Q2dw2enmSZZgOhoMOFwopPA@mail.gmail.com>
	<[email protected]>
	<CAPpHfdtQd29O15Cmp1qeqTCerQF0Y+BGh63qtX3RkA7k=0TZ1Q@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On 13.03.2026 15:51, Alena Rybakina wrote:

>>>
>>> In addition, it makes sense to discuss how these parameters are 
>>> supposed to be used. I see the following use cases:
>>>
>>> 1. Which tables have the most VM churn? - monitoring 
>>> rev_all_visible_pages normalised on the table size and its average 
>>> tuple width might expose the most suspicious tables (in terms of 
>>> table statistics).
>>> 2. DML Skew. Dividing rev_all_visible_pages by the number of tuple 
>>> updates/deletes, normalised by the average table and tuple sizes, 
>>> might indicate whether changes are localised within the table.
>>> 3. IndexOnlyScan effectiveness. Considering the speed of 
>>> rev_all_visible_pages change, normalised to the value of the 
>>> relallvisible statistic, we may detect tables where Index-Only Scan 
>>> might be inefficiently used.
>>
>> With the parameter that was included before (pg_class_relallfrozen 
>> and relallvisible 
>> https://github.com/MasaoFujii/postgresql/commit/99f8f3fbbc8f743290844e8c676d39dad11c5d5d) 
>> in the pg_stat_tables, I think I can provide isolation test to prove 
>> it - I can use my isolation test 
>> vacuum-extending-in-repetable-read.spec that I have added in the 
>> extension (ext_vacuum_statistics). What do you think? 
>
> I've prepared the test. Do you think it would make sense to include it 
> in 0001?
>
I have added it in the 31th version for now and nothing else has been 
changed (if you don't mind, exclude it).
From 486a29e6a22d43e2911eb849bdb3b3b39eefab91 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <[email protected]>
Date: Fri, 13 Mar 2026 16:00:39 +0300
Subject: [PATCH] Track table VM stability.

Add rev_all_visible_pages and rev_all_frozen_pages counters to
pg_stat_all_tables tracking the number of times the all-visible and
all-frozen bits are cleared in the visibility map. These bits are cleared by
backend processes during regular DML operations. Hence, the counters are placed
in table statistic entry.

A high rev_all_visible_pages rate relative to DML volume indicates
that modifications are scattered across previously-clean pages rather
than concentrated on already-dirty ones, causing index-only scans to
fall back to heap fetches.  A high rev_all_frozen_pages rate indicates
that vacuum's freezing work is being frequently undone by concurrent
DML.

Authors: Alena Rybakina <[email protected]>,
         Andrei Lepikhov <[email protected]>,
         Andrei Zubkov <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>,
         Masahiko Sawada <[email protected]>,
         Ilia Evdokimov <[email protected]>,
         Jian He <[email protected]>,
         Kirill Reshke <[email protected]>,
         Alexander Korotkov <[email protected]>,
         Jim Nasby <[email protected]>,
         Sami Imseih <[email protected]>,
         Karina Litskevich <[email protected]>
---
 doc/src/sgml/monitoring.sgml                  |  32 +++
 src/backend/access/heap/visibilitymap.c       |  10 +
 src/backend/catalog/system_views.sql          |   2 +
 src/backend/utils/activity/pgstat_relation.c  |   2 +
 src/backend/utils/adt/pgstatfuncs.c           |   6 +
 src/include/catalog/pg_proc.dat               |  12 +-
 src/include/pgstat.h                          |  17 +-
 .../t/052_vacuum_extending_freeze_test.pl     | 215 ++++++++++++++++++
 src/test/regress/expected/rules.out           |   6 +
 9 files changed, 300 insertions(+), 2 deletions(-)
 create mode 100644 src/test/recovery/t/052_vacuum_extending_freeze_test.pl

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..fb656977b2e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4090,6 +4090,38 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rev_all_visible_pages</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the all-visible bit in the
+       <link linkend="storage-vm">visibility map</link> was cleared for a
+       page of this table.  The all-visible bit is cleared by backend
+       processes when they modify a heap page that was previously marked
+       all-visible, for example during an <command>INSERT</command>,
+       <command>UPDATE</command>, or <command>DELETE</command>.
+       A high rate of change in this counter means that index-only scans
+       on this table may frequently need to fall back to heap fetches,
+       and that vacuum must re-do visibility map work on those pages.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rev_all_frozen_pages</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the all-frozen bit in the
+       <link linkend="storage-vm">visibility map</link> was cleared for a
+       page of this table.  The all-frozen bit is cleared by backend
+       processes when they modify a heap page that was previously marked
+       all-frozen.  A high value compared to the number of vacuum cycles
+       indicates that DML activity is frequently undoing the freezing work
+       performed by vacuum.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>last_vacuum</structfield> <type>timestamp with time zone</type>
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 3047bd46def..2e7c28ea307 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -92,6 +92,7 @@
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
@@ -161,6 +162,15 @@ visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags
 
 	if (map[mapByte] & mask)
 	{
+		/*
+		 * Track how often all-visible or all-frozen bits are cleared in the
+		 * visibility map.
+		 */
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_VISIBLE)
+			pgstat_count_vm_rev_all_visible(rel);
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_FROZEN)
+			pgstat_count_vm_rev_all_frozen(rel);
+
 		map[mapByte] &= ~mask;
 
 		MarkBufferDirty(vmbuf);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7553f31fef0..fa4c74bcd5d 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -715,6 +715,8 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
             pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze,
             pg_stat_get_ins_since_vacuum(C.oid) AS n_ins_since_vacuum,
+            pg_stat_get_rev_all_visible_pages(C.oid) AS rev_all_visible_pages,
+            pg_stat_get_rev_all_frozen_pages(C.oid) AS rev_all_frozen_pages,
             pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
             pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
             pg_stat_get_last_analyze_time(C.oid) as last_analyze,
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..bb26e97898d 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -879,6 +879,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	tabentry->blocks_fetched += lstats->counts.blocks_fetched;
 	tabentry->blocks_hit += lstats->counts.blocks_hit;
+	tabentry->rev_all_visible_pages += lstats->counts.rev_all_visible_pages;
+	tabentry->rev_all_frozen_pages += lstats->counts.rev_all_frozen_pages;
 
 	/* Clamp live_tuples in case of negative delta_live_tuples */
 	tabentry->live_tuples = Max(tabentry->live_tuples, 0);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..901f3dd55a1 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -106,6 +106,12 @@ PG_STAT_GET_RELENTRY_INT64(tuples_updated)
 /* pg_stat_get_vacuum_count */
 PG_STAT_GET_RELENTRY_INT64(vacuum_count)
 
+/* pg_stat_get_rev_all_frozen_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_frozen_pages)
+
+/* pg_stat_get_rev_all_visible_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_visible_pages)
+
 #define PG_STAT_GET_RELENTRY_FLOAT8(stat)						\
 Datum															\
 CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)					\
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5e5e33f64fc..961337ce282 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12693,6 +12693,16 @@
   prosrc => 'hashoid8' },
 { oid => '8281', descr => 'hash',
   proname => 'hashoid8extended', prorettype => 'int8',
-  proargtypes => 'oid8 int8', prosrc => 'hashoid8extended' },
+  proargtypes => 'oid8 int8',   prosrc => 'hashoid8extended' },
 
+{ oid => '8002',
+  descr => 'statistics: number of times the all-visible pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_visible_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_visible_pages' },
+{ oid => '8003',
+  descr => 'statistics: number of times the all-frozen pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_frozen_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_frozen_pages' },
 ]
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fff7ecc2533..04ccb3c06c2 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -156,6 +156,8 @@ typedef struct PgStat_TableCounts
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
 } PgStat_TableCounts;
 
 /* ----------
@@ -214,7 +216,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCBB
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCBC
 
 typedef struct PgStat_ArchiverStats
 {
@@ -447,6 +449,8 @@ typedef struct PgStat_StatTabEntry
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
 
 	TimestampTz last_vacuum_time;	/* user initiated vacuum */
 	PgStat_Counter vacuum_count;
@@ -722,6 +726,17 @@ extern void pgstat_report_analyze(Relation rel,
 		if (pgstat_should_count_relation(rel))						\
 			(rel)->pgstat_info->counts.blocks_hit++;				\
 	} while (0)
+/* count revocations of all-visible and all-frozen bits in visibility map */
+#define pgstat_count_vm_rev_all_visible(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_visible_pages++;	\
+	} while (0)
+#define pgstat_count_vm_rev_all_frozen(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_frozen_pages++;	\
+	} while (0)
 
 extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
 extern void pgstat_count_heap_update(Relation rel, bool hot, bool newpage);
diff --git a/src/test/recovery/t/052_vacuum_extending_freeze_test.pl b/src/test/recovery/t/052_vacuum_extending_freeze_test.pl
new file mode 100644
index 00000000000..384e123381f
--- /dev/null
+++ b/src/test/recovery/t/052_vacuum_extending_freeze_test.pl
@@ -0,0 +1,215 @@
+# Copyright (c) 2025 PostgreSQL Global Development Group
+#
+# Test cumulative vacuum stats system using TAP
+#
+# In short, this test validates the correctness and stability of cumulative
+# vacuum statistics accounting around freezing, visibility, and revision
+# tracking across VACUUM and backend operations.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+plan tests => 10;
+
+#------------------------------------------------------------------------------
+# Test cluster setup
+#------------------------------------------------------------------------------
+
+my $node = PostgreSQL::Test::Cluster->new('vacuum_extending_freeze_test');
+$node->init;
+
+# Configure the server for aggressive freezing behavior used by the test
+$node->append_conf('postgresql.conf', q{
+	log_min_messages = notice
+    vacuum_freeze_min_age = 0
+    vacuum_freeze_table_age = 0
+});
+
+$node->start();
+
+#------------------------------------------------------------------------------
+# Database creation and initialization
+#------------------------------------------------------------------------------
+
+$node->safe_psql('postgres', q{
+	CREATE DATABASE statistic_vacuum_database_regression;
+});
+
+# Main test database name
+my $dbname = 'statistic_vacuum_database_regression';
+
+# Enable necessary settings and force the stats collector to flush next
+$node->safe_psql($dbname, q{
+    SET track_functions = 'all';
+    SELECT pg_stat_force_next_flush();
+});
+
+#------------------------------------------------------------------------------
+# Timing parameters for polling loops
+#------------------------------------------------------------------------------
+
+my $timeout    = 30;     # overall wait timeout in seconds
+my $interval   = 0.015;  # poll interval in seconds (15 ms)
+my $start_time = time();
+my $updated    = 0;
+
+# Polls statistics until the named columns exceed the provided
+# baseline values or until timeout.
+#
+# run_vacuum is a boolean (0 or 1) means we need to fetch frozen and visible pages
+# from pg_class table, otherwise we need to fetch frozen and visible pages from pg_stat_all_tables table.
+# Returns: 1 if the condition is met before timeout, 0 otherwise.
+sub wait_for_vacuum_stats {
+    my (%args) = @_;
+    my $run_vacuum = ($args{run_vacuum} or 0);
+    my $result_query;
+    my $sql;
+
+    my $start = time();
+    while ((time() - $start) < $timeout) {
+
+        if ($run_vacuum) {
+            $node->safe_psql($dbname, 'VACUUM vestat');
+
+            $sql = "
+            SELECT relallfrozen > 0
+                AND relallvisible > 0
+                FROM pg_class c
+                WHERE c.relname = 'vestat'";
+        }
+        else {
+            $sql = "
+            SELECT rev_all_frozen_pages > 0
+                AND rev_all_visible_pages > 0
+                FROM pg_stat_all_tables
+                WHERE relname = 'vestat'";
+        }
+
+        $result_query = $node->safe_psql($dbname, $sql);
+
+        return 1 if (defined $result_query && $result_query eq 't');
+
+        # sub-second sleep
+        sleep($interval);
+    }
+
+    return 0;
+}
+
+#------------------------------------------------------------------------------
+# Variables to hold vacuum statistics snapshots for comparisons
+#------------------------------------------------------------------------------
+
+my $relallvisible = 0;
+my $relallfrozen = 0;
+
+my $relallvisible_prev = 0;
+my $relallfrozen_prev = 0;
+
+my $rev_all_frozen_pages = 0;
+my $rev_all_visible_pages = 0;
+
+my $res;
+
+#------------------------------------------------------------------------------
+# fetch_vacuum_stats
+#
+# Loads current values of the relevant vacuum counters for the test table
+# into the package-level variables above so tests can compare later.
+#------------------------------------------------------------------------------
+
+sub fetch_vacuum_stats {
+    my $base_statistics = $node->safe_psql(
+        $dbname,
+        "SELECT c.relallvisible, c.relallfrozen,
+                rev_all_visible_pages, rev_all_frozen_pages
+           FROM pg_class c
+           LEFT JOIN pg_stat_all_tables s ON s.relid = c.oid
+          WHERE c.relname = 'vestat';"
+    );
+
+    $base_statistics =~ s/\s*\|\s*/ /g;   # transform " | " into space
+    ($relallvisible, $relallfrozen, $rev_all_visible_pages, $rev_all_frozen_pages)
+        = split /\s+/, $base_statistics;
+}
+
+#------------------------------------------------------------------------------
+# Test 1: Create test table, populate it and run an initial vacuum to force freezing
+#------------------------------------------------------------------------------
+
+$node->safe_psql($dbname, q{
+	SELECT pg_stat_force_next_flush();
+	CREATE TABLE vestat (x int)
+		WITH (autovacuum_enabled = off, fillfactor = 70);
+	INSERT INTO vestat SELECT x FROM generate_series(1, 5000) AS g(x);
+	ANALYZE vestat;
+});
+
+# Poll the stats view until the expected deltas appear or timeout.
+$updated = wait_for_vacuum_stats(run_vacuum => 1);
+
+ok($updated,
+   'vacuum stats updated after vacuuming the table (relallfrozen and relallvisible advanced)')
+  or diag "Timeout waiting for pg_stats_vacuum_tables to update after $timeout seconds during vacuum";
+
+#------------------------------------------------------------------------------
+# Snapshot current statistics for later comparison
+#------------------------------------------------------------------------------
+
+fetch_vacuum_stats();
+
+#------------------------------------------------------------------------------
+# Verify initial statistics after vacuum
+#------------------------------------------------------------------------------
+ok($relallfrozen > $relallfrozen_prev, 'relallfrozen has increased');
+ok($relallvisible > $relallvisible_prev, 'relallvisible has increased');
+ok($rev_all_frozen_pages == 0, 'rev_all_frozen_pages stay the same');
+ok($rev_all_visible_pages == 0, 'rev_all_visible_pages stay the same');
+
+#------------------------------------------------------------------------------
+# Test 2: Trigger backend updates
+# Backend activity should reset per-page visibility/freeze marks and increment revision counters
+#------------------------------------------------------------------------------
+$relallfrozen_prev = $relallfrozen;
+$relallvisible_prev = $relallvisible;
+
+$node->safe_psql($dbname, q{
+    UPDATE vestat SET x = x + 1001;
+});
+
+$node->safe_psql($dbname, 'SELECT pg_stat_force_next_flush()');
+
+# Poll until stats update or timeout.
+$updated = wait_for_vacuum_stats(run_vacuum => 0);
+ok($updated,
+   'vacuum stats updated after backend tuple updates (rev_all_frozen_pages and rev_all_visible_pages advanced)')
+  or diag "Timeout waiting for pg_stats_vacuum_* update after $timeout seconds";
+
+#------------------------------------------------------------------------------
+# Snapshot current statistics for later comparison
+#------------------------------------------------------------------------------
+
+fetch_vacuum_stats();
+
+#------------------------------------------------------------------------------
+# Check updated statistics after backend activity
+#------------------------------------------------------------------------------
+
+ok($relallfrozen == $relallfrozen_prev, 'relallfrozen stay the same');
+ok($relallvisible == $relallvisible_prev, 'relallvisible stay the same');
+ok($rev_all_frozen_pages > 0, 'rev_all_frozen_pages has increased');
+ok($rev_all_visible_pages > 0, 'rev_all_visible_pages has increased');
+
+#------------------------------------------------------------------------------
+# Cleanup
+#------------------------------------------------------------------------------
+
+$node->safe_psql('postgres', q{
+	DROP DATABASE statistic_vacuum_database_regression;
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f4ee2bd7459..8dbf5ce34bb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1834,6 +1834,8 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     pg_stat_get_dead_tuples(c.oid) AS n_dead_tup,
     pg_stat_get_mod_since_analyze(c.oid) AS n_mod_since_analyze,
     pg_stat_get_ins_since_vacuum(c.oid) AS n_ins_since_vacuum,
+    pg_stat_get_rev_all_visible_pages(c.oid) AS rev_all_visible_pages,
+    pg_stat_get_rev_all_frozen_pages(c.oid) AS rev_all_frozen_pages,
     pg_stat_get_last_vacuum_time(c.oid) AS last_vacuum,
     pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum,
     pg_stat_get_last_analyze_time(c.oid) AS last_analyze,
@@ -2256,6 +2258,8 @@ pg_stat_sys_tables| SELECT relid,
     n_dead_tup,
     n_mod_since_analyze,
     n_ins_since_vacuum,
+    rev_all_visible_pages,
+    rev_all_frozen_pages,
     last_vacuum,
     last_autovacuum,
     last_analyze,
@@ -2311,6 +2315,8 @@ pg_stat_user_tables| SELECT relid,
     n_dead_tup,
     n_mod_since_analyze,
     n_ins_since_vacuum,
+    rev_all_visible_pages,
+    rev_all_frozen_pages,
     last_vacuum,
     last_autovacuum,
     last_analyze,
-- 
2.39.5 (Apple Git-154)



Attachments:

  [text/plain] v31-0001-Track-table-VM-stability.patch (19.2K, 2-v31-0001-Track-table-VM-stability.patch)
  download | inline diff:
From 486a29e6a22d43e2911eb849bdb3b3b39eefab91 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <[email protected]>
Date: Fri, 13 Mar 2026 16:00:39 +0300
Subject: [PATCH] Track table VM stability.

Add rev_all_visible_pages and rev_all_frozen_pages counters to
pg_stat_all_tables tracking the number of times the all-visible and
all-frozen bits are cleared in the visibility map. These bits are cleared by
backend processes during regular DML operations. Hence, the counters are placed
in table statistic entry.

A high rev_all_visible_pages rate relative to DML volume indicates
that modifications are scattered across previously-clean pages rather
than concentrated on already-dirty ones, causing index-only scans to
fall back to heap fetches.  A high rev_all_frozen_pages rate indicates
that vacuum's freezing work is being frequently undone by concurrent
DML.

Authors: Alena Rybakina <[email protected]>,
         Andrei Lepikhov <[email protected]>,
         Andrei Zubkov <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>,
         Masahiko Sawada <[email protected]>,
         Ilia Evdokimov <[email protected]>,
         Jian He <[email protected]>,
         Kirill Reshke <[email protected]>,
         Alexander Korotkov <[email protected]>,
         Jim Nasby <[email protected]>,
         Sami Imseih <[email protected]>,
         Karina Litskevich <[email protected]>
---
 doc/src/sgml/monitoring.sgml                  |  32 +++
 src/backend/access/heap/visibilitymap.c       |  10 +
 src/backend/catalog/system_views.sql          |   2 +
 src/backend/utils/activity/pgstat_relation.c  |   2 +
 src/backend/utils/adt/pgstatfuncs.c           |   6 +
 src/include/catalog/pg_proc.dat               |  12 +-
 src/include/pgstat.h                          |  17 +-
 .../t/052_vacuum_extending_freeze_test.pl     | 215 ++++++++++++++++++
 src/test/regress/expected/rules.out           |   6 +
 9 files changed, 300 insertions(+), 2 deletions(-)
 create mode 100644 src/test/recovery/t/052_vacuum_extending_freeze_test.pl

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..fb656977b2e 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -4090,6 +4090,38 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rev_all_visible_pages</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the all-visible bit in the
+       <link linkend="storage-vm">visibility map</link> was cleared for a
+       page of this table.  The all-visible bit is cleared by backend
+       processes when they modify a heap page that was previously marked
+       all-visible, for example during an <command>INSERT</command>,
+       <command>UPDATE</command>, or <command>DELETE</command>.
+       A high rate of change in this counter means that index-only scans
+       on this table may frequently need to fall back to heap fetches,
+       and that vacuum must re-do visibility map work on those pages.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rev_all_frozen_pages</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times the all-frozen bit in the
+       <link linkend="storage-vm">visibility map</link> was cleared for a
+       page of this table.  The all-frozen bit is cleared by backend
+       processes when they modify a heap page that was previously marked
+       all-frozen.  A high value compared to the number of vacuum cycles
+       indicates that DML activity is frequently undoing the freezing work
+       performed by vacuum.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>last_vacuum</structfield> <type>timestamp with time zone</type>
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 3047bd46def..2e7c28ea307 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -92,6 +92,7 @@
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
@@ -161,6 +162,15 @@ visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags
 
 	if (map[mapByte] & mask)
 	{
+		/*
+		 * Track how often all-visible or all-frozen bits are cleared in the
+		 * visibility map.
+		 */
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_VISIBLE)
+			pgstat_count_vm_rev_all_visible(rel);
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_FROZEN)
+			pgstat_count_vm_rev_all_frozen(rel);
+
 		map[mapByte] &= ~mask;
 
 		MarkBufferDirty(vmbuf);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7553f31fef0..fa4c74bcd5d 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -715,6 +715,8 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
             pg_stat_get_mod_since_analyze(C.oid) AS n_mod_since_analyze,
             pg_stat_get_ins_since_vacuum(C.oid) AS n_ins_since_vacuum,
+            pg_stat_get_rev_all_visible_pages(C.oid) AS rev_all_visible_pages,
+            pg_stat_get_rev_all_frozen_pages(C.oid) AS rev_all_frozen_pages,
             pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
             pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
             pg_stat_get_last_analyze_time(C.oid) as last_analyze,
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..bb26e97898d 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -879,6 +879,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 
 	tabentry->blocks_fetched += lstats->counts.blocks_fetched;
 	tabentry->blocks_hit += lstats->counts.blocks_hit;
+	tabentry->rev_all_visible_pages += lstats->counts.rev_all_visible_pages;
+	tabentry->rev_all_frozen_pages += lstats->counts.rev_all_frozen_pages;
 
 	/* Clamp live_tuples in case of negative delta_live_tuples */
 	tabentry->live_tuples = Max(tabentry->live_tuples, 0);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..901f3dd55a1 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -106,6 +106,12 @@ PG_STAT_GET_RELENTRY_INT64(tuples_updated)
 /* pg_stat_get_vacuum_count */
 PG_STAT_GET_RELENTRY_INT64(vacuum_count)
 
+/* pg_stat_get_rev_all_frozen_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_frozen_pages)
+
+/* pg_stat_get_rev_all_visible_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_visible_pages)
+
 #define PG_STAT_GET_RELENTRY_FLOAT8(stat)						\
 Datum															\
 CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)					\
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5e5e33f64fc..961337ce282 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12693,6 +12693,16 @@
   prosrc => 'hashoid8' },
 { oid => '8281', descr => 'hash',
   proname => 'hashoid8extended', prorettype => 'int8',
-  proargtypes => 'oid8 int8', prosrc => 'hashoid8extended' },
+  proargtypes => 'oid8 int8',   prosrc => 'hashoid8extended' },
 
+{ oid => '8002',
+  descr => 'statistics: number of times the all-visible pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_visible_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_visible_pages' },
+{ oid => '8003',
+  descr => 'statistics: number of times the all-frozen pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_frozen_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_frozen_pages' },
 ]
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fff7ecc2533..04ccb3c06c2 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -156,6 +156,8 @@ typedef struct PgStat_TableCounts
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
 } PgStat_TableCounts;
 
 /* ----------
@@ -214,7 +216,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCBB
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCBC
 
 typedef struct PgStat_ArchiverStats
 {
@@ -447,6 +449,8 @@ typedef struct PgStat_StatTabEntry
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
 
 	TimestampTz last_vacuum_time;	/* user initiated vacuum */
 	PgStat_Counter vacuum_count;
@@ -722,6 +726,17 @@ extern void pgstat_report_analyze(Relation rel,
 		if (pgstat_should_count_relation(rel))						\
 			(rel)->pgstat_info->counts.blocks_hit++;				\
 	} while (0)
+/* count revocations of all-visible and all-frozen bits in visibility map */
+#define pgstat_count_vm_rev_all_visible(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_visible_pages++;	\
+	} while (0)
+#define pgstat_count_vm_rev_all_frozen(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_frozen_pages++;	\
+	} while (0)
 
 extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
 extern void pgstat_count_heap_update(Relation rel, bool hot, bool newpage);
diff --git a/src/test/recovery/t/052_vacuum_extending_freeze_test.pl b/src/test/recovery/t/052_vacuum_extending_freeze_test.pl
new file mode 100644
index 00000000000..384e123381f
--- /dev/null
+++ b/src/test/recovery/t/052_vacuum_extending_freeze_test.pl
@@ -0,0 +1,215 @@
+# Copyright (c) 2025 PostgreSQL Global Development Group
+#
+# Test cumulative vacuum stats system using TAP
+#
+# In short, this test validates the correctness and stability of cumulative
+# vacuum statistics accounting around freezing, visibility, and revision
+# tracking across VACUUM and backend operations.
+
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+plan tests => 10;
+
+#------------------------------------------------------------------------------
+# Test cluster setup
+#------------------------------------------------------------------------------
+
+my $node = PostgreSQL::Test::Cluster->new('vacuum_extending_freeze_test');
+$node->init;
+
+# Configure the server for aggressive freezing behavior used by the test
+$node->append_conf('postgresql.conf', q{
+	log_min_messages = notice
+    vacuum_freeze_min_age = 0
+    vacuum_freeze_table_age = 0
+});
+
+$node->start();
+
+#------------------------------------------------------------------------------
+# Database creation and initialization
+#------------------------------------------------------------------------------
+
+$node->safe_psql('postgres', q{
+	CREATE DATABASE statistic_vacuum_database_regression;
+});
+
+# Main test database name
+my $dbname = 'statistic_vacuum_database_regression';
+
+# Enable necessary settings and force the stats collector to flush next
+$node->safe_psql($dbname, q{
+    SET track_functions = 'all';
+    SELECT pg_stat_force_next_flush();
+});
+
+#------------------------------------------------------------------------------
+# Timing parameters for polling loops
+#------------------------------------------------------------------------------
+
+my $timeout    = 30;     # overall wait timeout in seconds
+my $interval   = 0.015;  # poll interval in seconds (15 ms)
+my $start_time = time();
+my $updated    = 0;
+
+# Polls statistics until the named columns exceed the provided
+# baseline values or until timeout.
+#
+# run_vacuum is a boolean (0 or 1) means we need to fetch frozen and visible pages
+# from pg_class table, otherwise we need to fetch frozen and visible pages from pg_stat_all_tables table.
+# Returns: 1 if the condition is met before timeout, 0 otherwise.
+sub wait_for_vacuum_stats {
+    my (%args) = @_;
+    my $run_vacuum = ($args{run_vacuum} or 0);
+    my $result_query;
+    my $sql;
+
+    my $start = time();
+    while ((time() - $start) < $timeout) {
+
+        if ($run_vacuum) {
+            $node->safe_psql($dbname, 'VACUUM vestat');
+
+            $sql = "
+            SELECT relallfrozen > 0
+                AND relallvisible > 0
+                FROM pg_class c
+                WHERE c.relname = 'vestat'";
+        }
+        else {
+            $sql = "
+            SELECT rev_all_frozen_pages > 0
+                AND rev_all_visible_pages > 0
+                FROM pg_stat_all_tables
+                WHERE relname = 'vestat'";
+        }
+
+        $result_query = $node->safe_psql($dbname, $sql);
+
+        return 1 if (defined $result_query && $result_query eq 't');
+
+        # sub-second sleep
+        sleep($interval);
+    }
+
+    return 0;
+}
+
+#------------------------------------------------------------------------------
+# Variables to hold vacuum statistics snapshots for comparisons
+#------------------------------------------------------------------------------
+
+my $relallvisible = 0;
+my $relallfrozen = 0;
+
+my $relallvisible_prev = 0;
+my $relallfrozen_prev = 0;
+
+my $rev_all_frozen_pages = 0;
+my $rev_all_visible_pages = 0;
+
+my $res;
+
+#------------------------------------------------------------------------------
+# fetch_vacuum_stats
+#
+# Loads current values of the relevant vacuum counters for the test table
+# into the package-level variables above so tests can compare later.
+#------------------------------------------------------------------------------
+
+sub fetch_vacuum_stats {
+    my $base_statistics = $node->safe_psql(
+        $dbname,
+        "SELECT c.relallvisible, c.relallfrozen,
+                rev_all_visible_pages, rev_all_frozen_pages
+           FROM pg_class c
+           LEFT JOIN pg_stat_all_tables s ON s.relid = c.oid
+          WHERE c.relname = 'vestat';"
+    );
+
+    $base_statistics =~ s/\s*\|\s*/ /g;   # transform " | " into space
+    ($relallvisible, $relallfrozen, $rev_all_visible_pages, $rev_all_frozen_pages)
+        = split /\s+/, $base_statistics;
+}
+
+#------------------------------------------------------------------------------
+# Test 1: Create test table, populate it and run an initial vacuum to force freezing
+#------------------------------------------------------------------------------
+
+$node->safe_psql($dbname, q{
+	SELECT pg_stat_force_next_flush();
+	CREATE TABLE vestat (x int)
+		WITH (autovacuum_enabled = off, fillfactor = 70);
+	INSERT INTO vestat SELECT x FROM generate_series(1, 5000) AS g(x);
+	ANALYZE vestat;
+});
+
+# Poll the stats view until the expected deltas appear or timeout.
+$updated = wait_for_vacuum_stats(run_vacuum => 1);
+
+ok($updated,
+   'vacuum stats updated after vacuuming the table (relallfrozen and relallvisible advanced)')
+  or diag "Timeout waiting for pg_stats_vacuum_tables to update after $timeout seconds during vacuum";
+
+#------------------------------------------------------------------------------
+# Snapshot current statistics for later comparison
+#------------------------------------------------------------------------------
+
+fetch_vacuum_stats();
+
+#------------------------------------------------------------------------------
+# Verify initial statistics after vacuum
+#------------------------------------------------------------------------------
+ok($relallfrozen > $relallfrozen_prev, 'relallfrozen has increased');
+ok($relallvisible > $relallvisible_prev, 'relallvisible has increased');
+ok($rev_all_frozen_pages == 0, 'rev_all_frozen_pages stay the same');
+ok($rev_all_visible_pages == 0, 'rev_all_visible_pages stay the same');
+
+#------------------------------------------------------------------------------
+# Test 2: Trigger backend updates
+# Backend activity should reset per-page visibility/freeze marks and increment revision counters
+#------------------------------------------------------------------------------
+$relallfrozen_prev = $relallfrozen;
+$relallvisible_prev = $relallvisible;
+
+$node->safe_psql($dbname, q{
+    UPDATE vestat SET x = x + 1001;
+});
+
+$node->safe_psql($dbname, 'SELECT pg_stat_force_next_flush()');
+
+# Poll until stats update or timeout.
+$updated = wait_for_vacuum_stats(run_vacuum => 0);
+ok($updated,
+   'vacuum stats updated after backend tuple updates (rev_all_frozen_pages and rev_all_visible_pages advanced)')
+  or diag "Timeout waiting for pg_stats_vacuum_* update after $timeout seconds";
+
+#------------------------------------------------------------------------------
+# Snapshot current statistics for later comparison
+#------------------------------------------------------------------------------
+
+fetch_vacuum_stats();
+
+#------------------------------------------------------------------------------
+# Check updated statistics after backend activity
+#------------------------------------------------------------------------------
+
+ok($relallfrozen == $relallfrozen_prev, 'relallfrozen stay the same');
+ok($relallvisible == $relallvisible_prev, 'relallvisible stay the same');
+ok($rev_all_frozen_pages > 0, 'rev_all_frozen_pages has increased');
+ok($rev_all_visible_pages > 0, 'rev_all_visible_pages has increased');
+
+#------------------------------------------------------------------------------
+# Cleanup
+#------------------------------------------------------------------------------
+
+$node->safe_psql('postgres', q{
+	DROP DATABASE statistic_vacuum_database_regression;
+});
+
+$node->stop;
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f4ee2bd7459..8dbf5ce34bb 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1834,6 +1834,8 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     pg_stat_get_dead_tuples(c.oid) AS n_dead_tup,
     pg_stat_get_mod_since_analyze(c.oid) AS n_mod_since_analyze,
     pg_stat_get_ins_since_vacuum(c.oid) AS n_ins_since_vacuum,
+    pg_stat_get_rev_all_visible_pages(c.oid) AS rev_all_visible_pages,
+    pg_stat_get_rev_all_frozen_pages(c.oid) AS rev_all_frozen_pages,
     pg_stat_get_last_vacuum_time(c.oid) AS last_vacuum,
     pg_stat_get_last_autovacuum_time(c.oid) AS last_autovacuum,
     pg_stat_get_last_analyze_time(c.oid) AS last_analyze,
@@ -2256,6 +2258,8 @@ pg_stat_sys_tables| SELECT relid,
     n_dead_tup,
     n_mod_since_analyze,
     n_ins_since_vacuum,
+    rev_all_visible_pages,
+    rev_all_frozen_pages,
     last_vacuum,
     last_autovacuum,
     last_analyze,
@@ -2311,6 +2315,8 @@ pg_stat_user_tables| SELECT relid,
     n_dead_tup,
     n_mod_since_analyze,
     n_ins_since_vacuum,
+    rev_all_visible_pages,
+    rev_all_frozen_pages,
     last_vacuum,
     last_autovacuum,
     last_analyze,
-- 
2.39.5 (Apple Git-154)



view thread (75+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Vacuum statistics
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox