From 46c0c3d632a069a70923dd8d378b6c5d92b7eb1b Mon Sep 17 00:00:00 2001 From: Naga Appani Date: Sat, 16 Aug 2025 17:52:06 +0000 Subject: [PATCH v4] Add pg_get_multixact_stats() function for monitoring MultiXact usage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Expose multixact state via a new SQL-callable function pg_get_multixact_stats(), returning: - num_mxids : number of MultiXact IDs in use - num_members : number of member entries in use - oldest_multixact : oldest MultiXact ID still needed - oldest_offset : oldest member offset still in use The function returns NULLs if the MultiXact subsystem is not yet initialized. An isolation test (multixact_stats) asserts only invariants that are stable while a newly created multixact is pinned: (1) adding a second locker on the same tuple increases members by ≥1; (2) num_mxids and num_members do not decrease across snapshots; and (3) oldest_* never decrease. The test prints a deterministic key/value table ("assertion | ok") and makes no assertions after locks are released, so it remains robust even if background VACUUM/FREEZE runs. Documentation updates: - func-info.sgml: add function entry - maintenance.sgml: mention monitoring multixact usage Build and catalog: - new module: src/backend/utils/adt/multixactfuncs.c - pg_proc.dat entry - meson.build integration Discussion: https://www.postgresql.org/message-id/flat/CA%2BQeY%2BAAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg%40mail.gmail.com Author: Naga Appani Reviewed-by: Ashutosh Bapat Reviewed-by: Michael Paquier --- doc/src/sgml/func/func-info.sgml | 28 ++++ doc/src/sgml/maintenance.sgml | 54 +++++++- src/backend/utils/adt/Makefile | 1 + src/backend/utils/adt/meson.build | 1 + src/backend/utils/adt/multixactfuncs.c | 62 +++++++++ src/include/catalog/pg_proc.dat | 15 ++ .../isolation/expected/multixact_stats.out | 94 +++++++++++++ src/test/isolation/isolation_schedule | 1 + src/test/isolation/specs/multixact_stats.spec | 128 ++++++++++++++++++ 9 files changed, 379 insertions(+), 5 deletions(-) create mode 100644 src/backend/utils/adt/multixactfuncs.c create mode 100644 src/test/isolation/expected/multixact_stats.out create mode 100644 src/test/isolation/specs/multixact_stats.spec diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml index c393832d94c..ea063f6a81d 100644 --- a/doc/src/sgml/func/func-info.sgml +++ b/doc/src/sgml/func/func-info.sgml @@ -2975,6 +2975,34 @@ acl | {postgres=arwdDxtm/postgres,foo=r/postgres} modify key columns. + + + + + pg_get_multixact_stats + + pg_get_multixact_stats () + record + ( num_mxids integer, + num_members bigint, + oldest_multixact xid, + oldest_offset bigint ) + + + Returns statistics about current multixact usage: + num_mxids is the number of multixact IDs assigned, + num_members is the number of multixact member entries created, + oldest_multixact is the oldest multixact ID still in use, and + oldest_offset is the oldest member offset still in use. + These values can be used to monitor multixact consumption and anticipate + autovacuum behavior. See + for further details on multixact wraparound. + + + This is a live snapshot of shared counters; the numbers can change between calls, + even within the same transaction. + + diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index e7a9f58c015..e3a63c5b864 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -813,12 +813,56 @@ HINT: Execute a database-wide VACUUM in that database. As a safety device, an aggressive vacuum scan will occur for any table whose multixact-age is greater than . Also, if the - storage occupied by multixacts members exceeds about 10GB, aggressive vacuum + linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the number + of members created exceeds approximately 2 billion entries, aggressive vacuum scans will occur more often for all tables, starting with those that - have the oldest multixact-age. Both of these kinds of aggressive - scans will occur even if autovacuum is nominally disabled. The members storage - area can grow up to about 20GB before reaching wraparound. + have the oldest multixact-age. Both of these kinds of aggressive + scans will occur even if autovacuum is nominally disabled. The members can grow + up to approximately 4 billion entries before reaching wraparound. + + + + The pg_get_multixact_stats() function, described in + , provides a way to monitor + multixact allocation and usage patterns in real time. For example: + +postgres=# SELECT * FROM pg_catalog.pg_get_multixact_stats(); + num_mxids | num_members | oldest_multixact | oldest_offset +-----------+-------------+------------------+--------------- + 99883849 | 773468747 | 39974368 | 351952978 +(1 row) + + This output shows a system with significant multixact activity: about ~100 million + multixact IDs and ~773 million member entries have been created since the oldest + surviving multixact (ID 39974368). By leveraging this information, the function helps: + + + + Identify unusual multixact activity from concurrent row-level locks + or foreign key operations. For example, a spike in num_mxids might indicate + multiple sessions running UPDATE statements with foreign key checks, + concurrent SELECT FOR SHARE operations, or frequent use of savepoints + causing lock contention. + + + + + Track multixact cleanup efficiency by monitoring oldest_multixact. + If this value remains unchanged while num_members grows, it could indicate + that long-running transactions are preventing cleanup, or autovacuum is + not keeping up with the workload. + + + + + Detect potential performance impacts before they become critical. + For instance, high multixact usage from frequent row-level locking or + foreign key operations can lead to increased I/O and CPU overhead during + vacuum operations. Monitoring these stats helps tune autovacuum frequency + and transaction patterns. + + + diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile index ffeacf2b819..cc68ac545a5 100644 --- a/src/backend/utils/adt/Makefile +++ b/src/backend/utils/adt/Makefile @@ -68,6 +68,7 @@ OBJS = \ misc.o \ multirangetypes.o \ multirangetypes_selfuncs.o \ + multixactfuncs.o \ name.o \ network.o \ network_gist.o \ diff --git a/src/backend/utils/adt/meson.build b/src/backend/utils/adt/meson.build index ed9bbd7b926..dac372c3bea 100644 --- a/src/backend/utils/adt/meson.build +++ b/src/backend/utils/adt/meson.build @@ -55,6 +55,7 @@ backend_sources += files( 'misc.c', 'multirangetypes.c', 'multirangetypes_selfuncs.c', + 'multixactfuncs.c', 'name.c', 'network.c', 'network_gist.c', diff --git a/src/backend/utils/adt/multixactfuncs.c b/src/backend/utils/adt/multixactfuncs.c new file mode 100644 index 00000000000..faf02bd1626 --- /dev/null +++ b/src/backend/utils/adt/multixactfuncs.c @@ -0,0 +1,62 @@ +/*------------------------------------------------------------------------- + * multixactfuncs.c + * Functions for reporting on multixact state. + * + * This module provides SQL-callable functions that expose internal multixact + * state information for monitoring usage and detecting potential wraparound + * conditions that may require vacuum maintenance. + * + * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * IDENTIFICATION + * src/backend/utils/adt/multixactfuncs.c + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "access/multixact.h" +#include "funcapi.h" +#include "utils/builtins.h" + +/* + * pg_get_multixact_stats + * + * Returns statistics about current MultiXact usage: + * - num_mxids: Number of MultiXact IDs in use + * - num_members: Total number of member entries + * - oldest_multixact: Oldest MultiXact ID still needed + * - oldest_offset: Oldest offset still in use + * + * Returns a row of NULLs if the MultiXact system is not yet initialized. + */ +Datum +pg_get_multixact_stats(PG_FUNCTION_ARGS) +{ + TupleDesc tupdesc; + Datum values[4]; + bool nulls[4] = {true, true, true, true}; + MultiXactOffset members; + MultiXactId oldestMultiXactId; + uint32 multixacts; + MultiXactOffset oldestOffset; + HeapTuple tuple; + + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("return type must be a row type"))); + + if (GetMultiXactInfo(&multixacts, &members, &oldestMultiXactId, &oldestOffset)) + { + values[0] = Int32GetDatum(multixacts); + values[1] = Int64GetDatum(members); + values[2] = UInt32GetDatum(oldestMultiXactId); + values[3] = Int64GetDatum(oldestOffset); + nulls[0] = nulls[1] = nulls[2] = nulls[3] = false; + } + + tuple = heap_form_tuple(tupdesc, values, nulls); + + PG_RETURN_DATUM(HeapTupleGetDatum(tuple)); +} diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 118d6da1ace..837bba938e6 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -12576,4 +12576,19 @@ proargnames => '{pid,io_id,io_generation,state,operation,off,length,target,handle_data_len,raw_result,result,target_desc,f_sync,f_localmem,f_buffered}', prosrc => 'pg_get_aios' }, +# Get MultiXact state +{ + oid => '9001', + descr => 'get current multixact member and multixact ID counts and oldest values', + proname => 'pg_get_multixact_stats', + prorettype => 'record', + proargtypes => '', + proallargtypes => '{int4,int8,xid,int8}', + proargmodes => '{o,o,o,o}', + proargnames => '{num_mxids,num_members,oldest_multixact,oldest_offset}', + provolatile => 'v', + proparallel => 's', + prosrc => 'pg_get_multixact_stats' +}, + ] diff --git a/src/test/isolation/expected/multixact_stats.out b/src/test/isolation/expected/multixact_stats.out new file mode 100644 index 00000000000..2893c4d9f36 --- /dev/null +++ b/src/test/isolation/expected/multixact_stats.out @@ -0,0 +1,94 @@ +Parsed test spec with 3 sessions + +starting permutation: d_begin snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned d_commit s1_commit s2_commit +step d_begin: BEGIN; SET client_min_messages = warning; +step snap0: + CREATE TEMP TABLE snap0 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); + +step s1_begin: BEGIN; +step s1_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; +?column? +-------- + 1 +(1 row) + +step snap1: + CREATE TEMP TABLE snap1 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); + +step s2_begin: BEGIN; +step s2_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; +?column? +-------- + 1 +(1 row) + +step snap2: + CREATE TEMP TABLE snap2 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); + +step check_while_pinned: + SELECT r.assertion, r.ok + FROM snap0 s0 + JOIN snap1 s1 ON TRUE + JOIN snap2 s2 ON TRUE, + LATERAL unnest( + ARRAY[ + 'is_init_mxids', + 'is_init_members', + 'is_init_oldest_mxid', + 'is_init_oldest_off', + 'is_oldest_mxid_nondec_01', + 'is_oldest_mxid_nondec_12', + 'is_oldest_off_nondec_01', + 'is_oldest_off_nondec_12', + 'is_members_increased_ge1', + 'is_mxids_nondec_01', + 'is_mxids_nondec_12', + 'is_members_nondec_01', + 'is_members_nondec_12' + ], + ARRAY[ + (s2.num_mxids IS NOT NULL), + (s2.num_members IS NOT NULL), + (s2.oldest_multixact IS NOT NULL), + (s2.oldest_offset IS NOT NULL), + + (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)), + (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)), + (s1.oldest_offset >= COALESCE(s0.oldest_offset, 0)), + (s2.oldest_offset >= COALESCE(s1.oldest_offset, 0)), + + (s2.num_members >= COALESCE(s1.num_members, 0) + 1), + + (s1.num_mxids >= COALESCE(s0.num_mxids, 0)), + (s2.num_mxids >= COALESCE(s1.num_mxids, 0)), + (s1.num_members >= COALESCE(s0.num_members, 0)), + (s2.num_members >= COALESCE(s1.num_members, 0)) + ] + ) AS r(assertion, ok); + +assertion |ok +------------------------+-- +is_init_mxids |t +is_init_members |t +is_init_oldest_mxid |t +is_init_oldest_off |t +is_oldest_mxid_nondec_01|t +is_oldest_mxid_nondec_12|t +is_oldest_off_nondec_01 |t +is_oldest_off_nondec_12 |t +is_members_increased_ge1|t +is_mxids_nondec_01 |t +is_mxids_nondec_12 |t +is_members_nondec_01 |t +is_members_nondec_12 |t +(13 rows) + +step d_commit: COMMIT; +step s1_commit: COMMIT; +step s2_commit: COMMIT; diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule index 4411d3c86dd..7da500bf6cf 100644 --- a/src/test/isolation/isolation_schedule +++ b/src/test/isolation/isolation_schedule @@ -117,3 +117,4 @@ test: serializable-parallel-2 test: serializable-parallel-3 test: matview-write-skew test: lock-nowait +test: multixact_stats diff --git a/src/test/isolation/specs/multixact_stats.spec b/src/test/isolation/specs/multixact_stats.spec new file mode 100644 index 00000000000..cbf4b57294e --- /dev/null +++ b/src/test/isolation/specs/multixact_stats.spec @@ -0,0 +1,128 @@ +# High-signal invariants for pg_get_multixact_stats() +# We create exactly one fresh MultiXact on a brand-new table. While it is pinned +# by two open transactions, we assert only invariants that background VACUUM/FREEZE +# cannot violate: +# • members increased by ≥ 1 when the second locker arrived, +# • num_mxids / num_members did not decrease vs earlier snapshots, +# • oldest_* never decreases. +# We make NO assertions after releasing locks (freezing/truncation may shrink deltas). +# NOTE: Snapshots snap0 and subsequent checks are taken inside an open driver +# transaction to narrow the window for unrelated truncation between snapshots. +# +# Terminology (global counters): +# num_mxids, num_members : “in-use” deltas derived from global horizons +# oldest_multixact, offset : oldest horizons; they move forward, never backward +# +# All assertions execute while our multixact is pinned by open txns, which protects +# the truncation horizon (VACUUM can’t advance past our pinned multi). + +setup +{ + CREATE TABLE mxq(id int PRIMARY KEY, v int); + INSERT INTO mxq VALUES (1, 42); +} + +teardown +{ + DROP TABLE mxq; +} + +# Two lockers on the SAME tuple -> one MultiXact with >= 2 members. +session "s1" +setup { SET client_min_messages = warning; SET lock_timeout = '5s'; } +step s1_begin { BEGIN; } +step s1_lock { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; } +step s1_commit { COMMIT; } + +session "s2" +setup { SET client_min_messages = warning; SET lock_timeout = '5s'; } +step s2_begin { BEGIN; } +step s2_lock { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; } +step s2_commit { COMMIT; } + +# Driver session: keep a transaction open while we take snapshots and check. +session "driver" +step d_begin { BEGIN; SET client_min_messages = warning; } + +# Baseline BEFORE any locking; may be NULLs if multixact isn't initialized yet. +step snap0 { + CREATE TEMP TABLE snap0 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); +} + +# After s1 has locked the row (still in driver xact). +step snap1 { + CREATE TEMP TABLE snap1 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); +} + +# After s2 joins on the SAME tuple -> multixact with >= 2 members (still in driver xact). +step snap2 { + CREATE TEMP TABLE snap2 AS + SELECT num_mxids, num_members, oldest_multixact, oldest_offset + FROM pg_get_multixact_stats(); +} + +# Pretty, deterministic key/value output of boolean checks. +# Keys: +# is_init_mxids : num_mxids is non-NULL +# is_init_members : num_members is non-NULL +# is_init_oldest_mxid : oldest_multixact is non-NULL +# is_init_oldest_off : oldest_offset is non-NULL +# is_oldest_mxid_nondec_01 : oldest_multixact did not decrease (snap0→snap1) +# is_oldest_mxid_nondec_12 : oldest_multixact did not decrease (snap1→snap2) +# is_oldest_off_nondec_01 : oldest_offset did not decrease (snap0→snap1) +# is_oldest_off_nondec_12 : oldest_offset did not decrease (snap1→snap2) +# is_members_increased_ge1 : members increased by at least 1 when s2 joined +# is_mxids_nondec_01 : num_mxids did not decrease (snap0→snap1) +# is_mxids_nondec_12 : num_mxids did not decrease (snap1→snap2) +# is_members_nondec_01 : num_members did not decrease (snap0→snap1) +# is_members_nondec_12 : num_members did not decrease (snap1→snap2) +step check_while_pinned { + SELECT r.assertion, r.ok + FROM snap0 s0 + JOIN snap1 s1 ON TRUE + JOIN snap2 s2 ON TRUE, + LATERAL unnest( + ARRAY[ + 'is_init_mxids', + 'is_init_members', + 'is_init_oldest_mxid', + 'is_init_oldest_off', + 'is_oldest_mxid_nondec_01', + 'is_oldest_mxid_nondec_12', + 'is_oldest_off_nondec_01', + 'is_oldest_off_nondec_12', + 'is_members_increased_ge1', + 'is_mxids_nondec_01', + 'is_mxids_nondec_12', + 'is_members_nondec_01', + 'is_members_nondec_12' + ], + ARRAY[ + (s2.num_mxids IS NOT NULL), + (s2.num_members IS NOT NULL), + (s2.oldest_multixact IS NOT NULL), + (s2.oldest_offset IS NOT NULL), + + (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)), + (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)), + (s1.oldest_offset >= COALESCE(s0.oldest_offset, 0)), + (s2.oldest_offset >= COALESCE(s1.oldest_offset, 0)), + + (s2.num_members >= COALESCE(s1.num_members, 0) + 1), + + (s1.num_mxids >= COALESCE(s0.num_mxids, 0)), + (s2.num_mxids >= COALESCE(s1.num_mxids, 0)), + (s1.num_members >= COALESCE(s0.num_members, 0)), + (s2.num_members >= COALESCE(s1.num_members, 0)) + ] + ) AS r(assertion, ok); +} + +step d_commit { COMMIT; } + +permutation d_begin snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned d_commit s1_commit s2_commit + -- 2.47.3