public inbox for [email protected]
help / color / mirror / Atom feedFrom: Naga Appani <[email protected]>
To: Ashutosh Bapat <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: [email protected]
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Date: Mon, 18 Aug 2025 20:32:39 -0500
Message-ID: <CA+QeY+AmRSSJU8GKJ1uNVe2ps8GTONXpioYR7gEz91_prKR6sw@mail.gmail.com> (raw)
In-Reply-To: <CAExHW5s-StFwDpZBeiikvdnrBBr0s+2jJTZRv3Hh914T4C6Euw@mail.gmail.com>
References: <CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com>
<CA+QeY+CuTcKSFmw3H-_OtCv+bv0T4uTSFgaAY1N9ssL-WaiiSA@mail.gmail.com>
<CALdSSPi3Gh08NtcCn44uVeUAYGOT74sU6uei_06qUTa5rMK43g@mail.gmail.com>
<CA+QeY+DTggHskCXOa39nag2sFds9BD-7k__zPbvL-_VVyJw7Sg@mail.gmail.com>
<CAExHW5uxBnodmpXygUQUxZAyxYvbXczadxGOVeL=0DxpkXUR-g@mail.gmail.com>
<[email protected]>
<CAExHW5vba_ecL2c27giAOOaRxoNdVX2BNQVCN4PFOZJC17NrqQ@mail.gmail.com>
<CA+QeY+CFLjh0gb-j9p+0eJK=r19izhEPCBY3ns4+HCgOQojh8Q@mail.gmail.com>
<CAExHW5s5hnxKK_VE=q94sN8uU5BJCRpAescGuamheuM-Ua559A@mail.gmail.com>
<CA+QeY+C3tc12wWX56yHozLYK=MmCkn1fODzFWC6-dSY+TZghTQ@mail.gmail.com>
<CAExHW5s-StFwDpZBeiikvdnrBBr0s+2jJTZRv3Hh914T4C6Euw@mail.gmail.com>
Hi Michael, Ashutosh,
Thanks a lot for taking the time to review this patch and share your thoughts.
Here’s a short summary of what has changed in v5:
- Added the new pg_get_multixact_stats() function in multixactfuncs.c.
- Fixed the misleading “atomic read” comment in the accessor.
- Clarified documentation: thresholds are described in terms of
counts, since that’s what the code uses.
- Added a members_bytes column in pg_get_multixact_stats() to give
users a rough size estimate (num_members * 5), while making it clear
this is layout-dependent.
Please see my in-line replies below.
---
On Mon, Aug 18, 2025 at 1:49 AM Michael Paquier <[email protected]> wrote:
> My point was a bit different: multixactfuncs.c should be created first
> because we already have one SQL function in multixact.c that can be
> moved inside it, with the declarations it requires added to
> multixact.h. I've extracted what you did, moved the existing
> pg_get_multixact_members() inside the new file, and applied the
> result.
>
Really appreciate your clarification and for making that change. I
misunderstood your earlier point.
> + * Returns information about current MultiXact state in a single atomic read:
>
> This comment is incorrect. This is not an atomic read, grabbing a
> consistent state of the data across one single lock acquisition.
>
Fixed and adjusted wording.
---
On Mon, Aug 18, 2025 at 6:56 AM Ashutosh Bapat
<[email protected]> wrote:
> The current document says
> "Also, if the storage occupied by multixacts members exceeds about
> 10GB, aggressive vacuum scans will occur more often for all tables,
> starting with those that have the oldest multixact-age." - do you mean
> that it's wrong. Instead of checking 10GB threashold, is the code
> checking the equivalent member count? If so, I think we need a
> separate patch to correct the documentation first. Can you please
> point me to the code? Documentation should reflect the code.
>
The decision is made in MultiXactMemberFreezeThreshold() [0], and it
is entirely count-based:
if (members <= MULTIXACT_MEMBER_SAFE_THRESHOLD)
return autovacuum_multixact_freeze_max_age;
fraction = (double) (members - MULTIXACT_MEMBER_SAFE_THRESHOLD) /
(MULTIXACT_MEMBER_DANGER_THRESHOLD - MULTIXACT_MEMBER_SAFE_THRESHOLD);
MaxMultiXactOffset is defined in multixact.h [1]:
#define MaxMultiXactOffset ((MultiXactOffset) 0xFFFFFFFF)
Thresholds are defined in multixact.c [2]
#define MULTIXACT_MEMBER_SAFE_THRESHOLD (MaxMultiXactOffset / 2)
#define MULTIXACT_MEMBER_DANGER_THRESHOLD \
(MaxMultiXactOffset - MaxMultiXactOffset / 4)
These translate to:
- MaxMultiXactOffset: ~4.29 billion (2^32 - 1)
- MULTIXACT_MEMBER_SAFE_THRESHOLD: ~2.14 billion (2^31 - 1)
- MULTIXACT_MEMBER_DANGER_THRESHOLD: ~3.22 billion (3/4 * 2^32)
So the code path is count-driven.
Regarding docs:
For earlier versions (18 and before), the storage-size approximation
remains relevant because users don’t have direct access to member
count information. Since we don’t plan to backpatch (I assume so) this
new function, the documentation for older branches should continue to
rely on the storage-based approximation.
Now that pg_get_multixact_stats() exposes num_members, the HEAD branch
docs can describe the thresholds in terms of counts directly. For
older branches, the storage approximation still provides users with a
practical way to reason about wraparound risk.
> The constant multiplier which converts a count into the disk size is
> in the server code. Duplicating it outside the server code, even in
> documentation, would require maintenance. GetMultiXactInfo() may not
> do the arithmetic but pg_get_multixact_stats() is lean enough to add a
> couple computations.
>
Thanks for suggesting this — it makes sense, especially for users
upgrading from earlier versions to 19 and higher. I’ve added a
members_bytes column to pg_get_multixact_stats(), computed as
num_members * 5. This respects the existing server-side logic while
also giving those users a familiar reference point, helping them
connect the older size-based guidance with the new count-based view.
---
References:
[0] https://github.com/postgres/postgres/blob/master/src/backend/access/transam/multixact.c#L2916
[1] https://github.com/postgres/postgres/blob/master/src/include/access/multixact.h#L31
[2] https://github.com/postgres/postgres/blob/master/src/backend/access/transam/multixact.c#L216-L218
Patch v5 is attached. Thanks again for the thoughtful reviews — I really
appreciate the guidance and look forward to further feedback.
Best regards,
Naga
Attachments:
[application/octet-stream] v5-0001-Add-pg_get_multixact_stats-function-for-monitorin.patch (21.9K, 2-v5-0001-Add-pg_get_multixact_stats-function-for-monitorin.patch)
download | inline diff:
From 883781d4da9c133aa1c2408379276bb4f52bf3a8 Mon Sep 17 00:00:00 2001
From: Naga Appani <[email protected]>
Date: Mon, 18 Aug 2025 20:51:25 +0000
Subject: [PATCH v5] Add pg_get_multixact_stats() function for monitoring
MultiXact usage
Expose multixact state via a new SQL-callable function pg_get_multixact_stats(),
returning:
- num_mxids : number of MultiXact IDs in use
- num_members : number of member entries in use
- members_bytes : bytes used by num_members in pg_multixact/members directory
- oldest_multixact : oldest MultiXact ID still needed
- oldest_offset : oldest member offset still in use
This patch:
1. Renames ReadMultiXactCounts() to GetMultiXactInfo() and makes it public
- Provides a single accessor for MultiXact state
- Returns counts and horizons in one call
2. Adds pg_get_multixact_stats() function
- SQL-callable interface to GetMultiXactInfo()
- Returns NULLs if MultiXact system not initialized
- Includes isolation tests for monitoring invariants
Documentation updates:
- func-info.sgml: add function entry
- maintenance.sgml: mention monitoring multixact usage
Build and catalog:
- Add function to existing multixactfuncs.c
- pg_proc.dat entry
Author: Naga Appani <[email protected]>
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BQeY%2BAAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg%40mail.gmail.com
---
doc/src/sgml/func/func-info.sgml | 31 +++++
doc/src/sgml/maintenance.sgml | 54 +++++++-
src/backend/access/transam/multixact.c | 43 ++++--
src/backend/utils/adt/multixactfuncs.c | 52 +++++++
src/include/access/multixact.h | 1 +
src/include/catalog/pg_proc.dat | 15 +++
.../isolation/expected/multixact_stats.out | 94 +++++++++++++
src/test/isolation/isolation_schedule | 1 +
src/test/isolation/specs/multixact_stats.spec | 127 ++++++++++++++++++
9 files changed, 399 insertions(+), 19 deletions(-)
create mode 100644 src/test/isolation/expected/multixact_stats.out
create mode 100644 src/test/isolation/specs/multixact_stats.spec
diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index c393832d94c..9dedc3715d7 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -2975,6 +2975,37 @@ acl | {postgres=arwdDxtm/postgres,foo=r/postgres}
modify key columns.
</para></entry>
</row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_get_multixact_stats</primary>
+ </indexterm>
+ <function>pg_get_multixact_stats</function> ()
+ <returnvalue>record</returnvalue>
+ ( <parameter>num_mxids</parameter> <type>integer</type>,
+ <parameter>num_members</parameter> <type>bigint</type>,
+ <parameter>members_bytes</parameter> <type>bigint</type>,
+ <parameter>oldest_multixact</parameter> <type>xid</type>,
+ <parameter>oldest_offset</parameter> <type>bigint</type> )
+ </para>
+ <para>
+ Returns statistics about current multixact usage:
+ <literal>num_mxids</literal> is the number of multixact IDs assigned,
+ <literal>num_members</literal> is the number of multixact member entries created,
+ <literal>members_bytes</literal> is the storage occupied by <literal>num_members</literal>
+ in <literal>pg_multixact/members</literal> directory,
+ <literal>oldest_multixact</literal> is the oldest multixact ID still in use, and
+ <literal>oldest_offset</literal> is the oldest member offset still in use.
+ These values can be used to monitor multixact consumption and anticipate
+ autovacuum behavior. See <xref linkend="vacuum-for-multixact-wraparound"/>
+ for further details on multixact wraparound.
+ </para>
+ <para>
+ This is a live snapshot of shared counters; the numbers can change between calls,
+ even within the same transaction.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index e7a9f58c015..badd3392c4f 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -813,12 +813,56 @@ HINT: Execute a database-wide VACUUM in that database.
<para>
As a safety device, an aggressive vacuum scan will
occur for any table whose multixact-age is greater than <xref
- linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the
- storage occupied by multixacts members exceeds about 10GB, aggressive vacuum
+ linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the number
+ of members created exceeds approximately 2 billion entries, aggressive vacuum
scans will occur more often for all tables, starting with those that
- have the oldest multixact-age. Both of these kinds of aggressive
- scans will occur even if autovacuum is nominally disabled. The members storage
- area can grow up to about 20GB before reaching wraparound.
+ have the oldest multixact-age. Both of these kinds of aggressive
+ scans will occur even if autovacuum is nominally disabled. The members can grow
+ up to approximately 4 billion entries before reaching wraparound.
+ </para>
+
+ <para>
+ The <function>pg_get_multixact_stats()</function> function, described in
+ <xref linkend="functions-pg-snapshot"/>, provides a way to monitor
+ multixact allocation and usage patterns in real time. For example:
+ <programlisting>
+postgres=# SELECT num_mxids,num_members,oldest_multixact,oldest_offset FROM pg_catalog.pg_get_multixact_stats();
+ num_mxids | num_members | oldest_multixact | oldest_offset
+-----------+-------------+------------------+---------------
+ 99883849 | 773468747 | 39974368 | 351952978
+(1 row)
+ </programlisting>
+ This output shows a system with significant multixact activity: about ~100 million
+ multixact IDs and ~773 million member entries have been created since the oldest
+ surviving multixact (ID 39974368). By leveraging this information, the function helps:
+ <orderedlist>
+ <listitem>
+ <simpara>
+ Identify unusual multixact activity from concurrent row-level locks
+ or foreign key operations. For example, a spike in num_mxids might indicate
+ multiple sessions running UPDATE statements with foreign key checks,
+ concurrent SELECT FOR SHARE operations, or frequent use of savepoints
+ causing lock contention.
+ </simpara>
+ </listitem>
+ <listitem>
+ <simpara>
+ Track multixact cleanup efficiency by monitoring oldest_multixact.
+ If this value remains unchanged while num_members grows, it could indicate
+ that long-running transactions are preventing cleanup, or autovacuum is
+ not keeping up with the workload.
+ </simpara>
+ </listitem>
+ <listitem>
+ <simpara>
+ Detect potential performance impacts before they become critical.
+ For instance, high multixact usage from frequent row-level locking or
+ foreign key operations can lead to increased I/O and CPU overhead during
+ vacuum operations. Monitoring these stats helps tune autovacuum frequency
+ and transaction patterns.
+ </simpara>
+ </listitem>
+ </orderedlist>
</para>
<para>
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 886740d2d55..5fb7c12fdce 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -2857,31 +2857,44 @@ find_multixact_start(MultiXactId multi, MultiXactOffset *result)
}
/*
- * Determine how many multixacts, and how many multixact members, currently
- * exist. Return false if unable to determine.
+ * GetMultiXactInfo
+ *
+ * Returns information about current MultiXact state:
+ * - multixacts: Number of MultiXacts (nextMultiXactId - oldestMultiXactId)
+ * - members: Number of member entries (nextOffset - oldestOffset)
+ * - oldestMultiXactId: Oldest MultiXact ID still in use
+ * - oldestOffset: Oldest offset still in use
+ *
+ * Returns false if the oldest offset is not known, in which case all output
+ * parameters are set to 0/invalid values for consistency.
*/
-static bool
-ReadMultiXactCounts(uint32 *multixacts, MultiXactOffset *members)
+bool
+GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *members,
+ MultiXactId *oldestMultiXactId, MultiXactOffset *oldestOffset)
{
- MultiXactOffset nextOffset;
- MultiXactOffset oldestOffset;
- MultiXactId oldestMultiXactId;
- MultiXactId nextMultiXactId;
- bool oldestOffsetKnown;
+ MultiXactOffset nextOffset;
+ MultiXactId nextMultiXactId;
+ bool oldestOffsetKnown;
LWLockAcquire(MultiXactGenLock, LW_SHARED);
nextOffset = MultiXactState->nextOffset;
- oldestMultiXactId = MultiXactState->oldestMultiXactId;
+ *oldestMultiXactId = MultiXactState->oldestMultiXactId;
nextMultiXactId = MultiXactState->nextMXact;
- oldestOffset = MultiXactState->oldestOffset;
+ *oldestOffset = MultiXactState->oldestOffset;
oldestOffsetKnown = MultiXactState->oldestOffsetKnown;
LWLockRelease(MultiXactGenLock);
if (!oldestOffsetKnown)
+ {
+ *members = 0;
+ *multixacts = 0;
+ *oldestMultiXactId = InvalidMultiXactId;
+ *oldestOffset = 0;
return false;
+ }
- *members = nextOffset - oldestOffset;
- *multixacts = nextMultiXactId - oldestMultiXactId;
+ *members = nextOffset - *oldestOffset;
+ *multixacts = nextMultiXactId - *oldestMultiXactId;
return true;
}
@@ -2920,9 +2933,11 @@ MultiXactMemberFreezeThreshold(void)
uint32 victim_multixacts;
double fraction;
int result;
+ MultiXactId oldestMultiXactId;
+ MultiXactOffset oldestOffset;
/* If we can't determine member space utilization, assume the worst. */
- if (!ReadMultiXactCounts(&multixacts, &members))
+ if (!GetMultiXactInfo(&multixacts, &members, &oldestMultiXactId, &oldestOffset))
return 0;
/* If member space utilization is low, no special action is required. */
diff --git a/src/backend/utils/adt/multixactfuncs.c b/src/backend/utils/adt/multixactfuncs.c
index e74ea938348..ba9e8313ab4 100644
--- a/src/backend/utils/adt/multixactfuncs.c
+++ b/src/backend/utils/adt/multixactfuncs.c
@@ -85,3 +85,55 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funccxt);
}
+
+/*
+ * pg_get_multixact_stats
+ *
+ * Returns statistics about current MultiXact usage:
+ * - num_mxids: Number of MultiXact IDs in use
+ * - num_members: Total number of member entries
+ * - oldest_multixact: Oldest MultiXact ID still needed
+ * - oldest_offset: Oldest offset still in use
+ *
+ * Returns a row of NULLs if the MultiXact system is not yet initialized.
+ */
+Datum
+pg_get_multixact_stats(PG_FUNCTION_ARGS)
+{
+ TupleDesc tupdesc;
+ Datum values[5];
+ bool nulls[5] = {true, true, true, true, true};
+ MultiXactOffset members;
+ MultiXactId oldestMultiXactId;
+ uint32 multixacts;
+ MultiXactOffset oldestOffset;
+ int64 membersBytes;
+ HeapTuple tuple;
+
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("return type must be a row type")));
+
+ if (GetMultiXactInfo(&multixacts, &members, &oldestMultiXactId, &oldestOffset))
+ {
+ /*
+ * Calculate approximate storage space:
+ * - Members are stored in groups of 4
+ * - Each group takes 20 bytes (5 bytes per member)
+ * Note: This ignores small page overhead (12 bytes per 8KB)
+ */
+ membersBytes = (int64) members * 5;
+
+ values[0] = Int32GetDatum(multixacts);
+ values[1] = Int64GetDatum(members);
+ values[2] = Int64GetDatum(membersBytes);
+ values[3] = UInt32GetDatum(oldestMultiXactId);
+ values[4] = Int64GetDatum(oldestOffset);
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = nulls[4] = false;
+ }
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+}
diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h
index 6607b645a18..19de74950cb 100644
--- a/src/include/access/multixact.h
+++ b/src/include/access/multixact.h
@@ -159,5 +159,6 @@ extern const char *multixact_identify(uint8 info);
extern char *mxid_to_string(MultiXactId multi, int nmembers,
MultiXactMember *members);
extern char *mxstatus_to_string(MultiXactStatus status);
+extern bool GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *members, MultiXactId *oldestMultiXactId, MultiXactOffset *oldestOffset);
#endif /* MULTIXACT_H */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 118d6da1ace..985364e0cd6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12576,4 +12576,19 @@
proargnames => '{pid,io_id,io_generation,state,operation,off,length,target,handle_data_len,raw_result,result,target_desc,f_sync,f_localmem,f_buffered}',
prosrc => 'pg_get_aios' },
+# Get MultiXact state
+{
+ oid => '9001',
+ descr => 'get current multixact member and multixact ID counts and oldest values',
+ proname => 'pg_get_multixact_stats',
+ prorettype => 'record',
+ proargtypes => '',
+ proallargtypes => '{int4,int8,int8,xid,int8}',
+ proargmodes => '{o,o,o,o,o}',
+ proargnames => '{num_mxids,num_members,members_bytes,oldest_multixact,oldest_offset}',
+ provolatile => 'v',
+ proparallel => 's',
+ prosrc => 'pg_get_multixact_stats'
+},
+
]
diff --git a/src/test/isolation/expected/multixact_stats.out b/src/test/isolation/expected/multixact_stats.out
new file mode 100644
index 00000000000..2893c4d9f36
--- /dev/null
+++ b/src/test/isolation/expected/multixact_stats.out
@@ -0,0 +1,94 @@
+Parsed test spec with 3 sessions
+
+starting permutation: d_begin snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned d_commit s1_commit s2_commit
+step d_begin: BEGIN; SET client_min_messages = warning;
+step snap0:
+ CREATE TEMP TABLE snap0 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+
+step s1_begin: BEGIN;
+step s1_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE;
+?column?
+--------
+ 1
+(1 row)
+
+step snap1:
+ CREATE TEMP TABLE snap1 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+
+step s2_begin: BEGIN;
+step s2_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE;
+?column?
+--------
+ 1
+(1 row)
+
+step snap2:
+ CREATE TEMP TABLE snap2 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+
+step check_while_pinned:
+ SELECT r.assertion, r.ok
+ FROM snap0 s0
+ JOIN snap1 s1 ON TRUE
+ JOIN snap2 s2 ON TRUE,
+ LATERAL unnest(
+ ARRAY[
+ 'is_init_mxids',
+ 'is_init_members',
+ 'is_init_oldest_mxid',
+ 'is_init_oldest_off',
+ 'is_oldest_mxid_nondec_01',
+ 'is_oldest_mxid_nondec_12',
+ 'is_oldest_off_nondec_01',
+ 'is_oldest_off_nondec_12',
+ 'is_members_increased_ge1',
+ 'is_mxids_nondec_01',
+ 'is_mxids_nondec_12',
+ 'is_members_nondec_01',
+ 'is_members_nondec_12'
+ ],
+ ARRAY[
+ (s2.num_mxids IS NOT NULL),
+ (s2.num_members IS NOT NULL),
+ (s2.oldest_multixact IS NOT NULL),
+ (s2.oldest_offset IS NOT NULL),
+
+ (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)),
+ (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)),
+ (s1.oldest_offset >= COALESCE(s0.oldest_offset, 0)),
+ (s2.oldest_offset >= COALESCE(s1.oldest_offset, 0)),
+
+ (s2.num_members >= COALESCE(s1.num_members, 0) + 1),
+
+ (s1.num_mxids >= COALESCE(s0.num_mxids, 0)),
+ (s2.num_mxids >= COALESCE(s1.num_mxids, 0)),
+ (s1.num_members >= COALESCE(s0.num_members, 0)),
+ (s2.num_members >= COALESCE(s1.num_members, 0))
+ ]
+ ) AS r(assertion, ok);
+
+assertion |ok
+------------------------+--
+is_init_mxids |t
+is_init_members |t
+is_init_oldest_mxid |t
+is_init_oldest_off |t
+is_oldest_mxid_nondec_01|t
+is_oldest_mxid_nondec_12|t
+is_oldest_off_nondec_01 |t
+is_oldest_off_nondec_12 |t
+is_members_increased_ge1|t
+is_mxids_nondec_01 |t
+is_mxids_nondec_12 |t
+is_members_nondec_01 |t
+is_members_nondec_12 |t
+(13 rows)
+
+step d_commit: COMMIT;
+step s1_commit: COMMIT;
+step s2_commit: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 9f1e997d81b..4d94fc94e77 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -118,3 +118,4 @@ test: serializable-parallel-2
test: serializable-parallel-3
test: matview-write-skew
test: lock-nowait
+test: multixact_stats
diff --git a/src/test/isolation/specs/multixact_stats.spec b/src/test/isolation/specs/multixact_stats.spec
new file mode 100644
index 00000000000..9098b6f5c5d
--- /dev/null
+++ b/src/test/isolation/specs/multixact_stats.spec
@@ -0,0 +1,127 @@
+# High-signal invariants for pg_get_multixact_stats()
+# We create exactly one fresh MultiXact on a brand-new table. While it is pinned
+# by two open transactions, we assert only invariants that background VACUUM/FREEZE
+# cannot violate:
+# • members increased by ≥ 1 when the second locker arrived,
+# • num_mxids / num_members did not decrease vs earlier snapshots,
+# • oldest_* never decreases.
+# We make NO assertions after releasing locks (freezing/truncation may shrink deltas).
+# NOTE: Snapshots snap0 and subsequent checks are taken inside an open driver
+# transaction to narrow the window for unrelated truncation between snapshots.
+#
+# Terminology (global counters):
+# num_mxids, num_members : “in-use” deltas derived from global horizons
+# oldest_multixact, offset : oldest horizons; they move forward, never backward
+#
+# All assertions execute while our multixact is pinned by open txns, which protects
+# the truncation horizon (VACUUM can’t advance past our pinned multi).
+
+setup
+{
+ CREATE TABLE mxq(id int PRIMARY KEY, v int);
+ INSERT INTO mxq VALUES (1, 42);
+}
+
+teardown
+{
+ DROP TABLE mxq;
+}
+
+# Two lockers on the SAME tuple -> one MultiXact with >= 2 members.
+session "s1"
+setup { SET client_min_messages = warning; SET lock_timeout = '5s'; }
+step s1_begin { BEGIN; }
+step s1_lock { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; }
+step s1_commit { COMMIT; }
+
+session "s2"
+setup { SET client_min_messages = warning; SET lock_timeout = '5s'; }
+step s2_begin { BEGIN; }
+step s2_lock { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; }
+step s2_commit { COMMIT; }
+
+# Driver session: keep a transaction open while we take snapshots and check.
+session "driver"
+step d_begin { BEGIN; SET client_min_messages = warning; }
+
+# Baseline BEFORE any locking; may be NULLs if multixact isn't initialized yet.
+step snap0 {
+ CREATE TEMP TABLE snap0 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+}
+
+# After s1 has locked the row (still in driver xact).
+step snap1 {
+ CREATE TEMP TABLE snap1 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+}
+
+# After s2 joins on the SAME tuple -> multixact with >= 2 members (still in driver xact).
+step snap2 {
+ CREATE TEMP TABLE snap2 AS
+ SELECT num_mxids, num_members, oldest_multixact, oldest_offset
+ FROM pg_get_multixact_stats();
+}
+
+# Pretty, deterministic key/value output of boolean checks.
+# Keys:
+# is_init_mxids : num_mxids is non-NULL
+# is_init_members : num_members is non-NULL
+# is_init_oldest_mxid : oldest_multixact is non-NULL
+# is_init_oldest_off : oldest_offset is non-NULL
+# is_oldest_mxid_nondec_01 : oldest_multixact did not decrease (snap0→snap1)
+# is_oldest_mxid_nondec_12 : oldest_multixact did not decrease (snap1→snap2)
+# is_oldest_off_nondec_01 : oldest_offset did not decrease (snap0→snap1)
+# is_oldest_off_nondec_12 : oldest_offset did not decrease (snap1→snap2)
+# is_members_increased_ge1 : members increased by at least 1 when s2 joined
+# is_mxids_nondec_01 : num_mxids did not decrease (snap0→snap1)
+# is_mxids_nondec_12 : num_mxids did not decrease (snap1→snap2)
+# is_members_nondec_01 : num_members did not decrease (snap0→snap1)
+# is_members_nondec_12 : num_members did not decrease (snap1→snap2)
+step check_while_pinned {
+ SELECT r.assertion, r.ok
+ FROM snap0 s0
+ JOIN snap1 s1 ON TRUE
+ JOIN snap2 s2 ON TRUE,
+ LATERAL unnest(
+ ARRAY[
+ 'is_init_mxids',
+ 'is_init_members',
+ 'is_init_oldest_mxid',
+ 'is_init_oldest_off',
+ 'is_oldest_mxid_nondec_01',
+ 'is_oldest_mxid_nondec_12',
+ 'is_oldest_off_nondec_01',
+ 'is_oldest_off_nondec_12',
+ 'is_members_increased_ge1',
+ 'is_mxids_nondec_01',
+ 'is_mxids_nondec_12',
+ 'is_members_nondec_01',
+ 'is_members_nondec_12'
+ ],
+ ARRAY[
+ (s2.num_mxids IS NOT NULL),
+ (s2.num_members IS NOT NULL),
+ (s2.oldest_multixact IS NOT NULL),
+ (s2.oldest_offset IS NOT NULL),
+
+ (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)),
+ (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)),
+ (s1.oldest_offset >= COALESCE(s0.oldest_offset, 0)),
+ (s2.oldest_offset >= COALESCE(s1.oldest_offset, 0)),
+
+ (s2.num_members >= COALESCE(s1.num_members, 0) + 1),
+
+ (s1.num_mxids >= COALESCE(s0.num_mxids, 0)),
+ (s2.num_mxids >= COALESCE(s1.num_mxids, 0)),
+ (s1.num_members >= COALESCE(s0.num_members, 0)),
+ (s2.num_members >= COALESCE(s1.num_members, 0))
+ ]
+ ) AS r(assertion, ok);
+}
+
+step d_commit { COMMIT; }
+
+permutation d_begin snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned d_commit s1_commit s2_commit
--
2.47.3
view thread (42+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
In-Reply-To: <CA+QeY+AmRSSJU8GKJ1uNVe2ps8GTONXpioYR7gEz91_prKR6sw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox