public inbox for [email protected]  
help / color / mirror / Atom feed
From: Naga Appani <[email protected]>
To: Tomas Vondra <[email protected]>
Cc: Xuneng Zhou <[email protected]>
Cc: torikoshia <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: [email protected]
Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
Date: Tue, 4 Nov 2025 19:13:09 -0600
Message-ID: <CA+QeY+Aja_=j1EuY87L06KaPO4EJqqkS4B+Vg9AsWnGM1d_VRA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CALdSSPi3Gh08NtcCn44uVeUAYGOT74sU6uei_06qUTa5rMK43g@mail.gmail.com>
	<CAExHW5s-StFwDpZBeiikvdnrBBr0s+2jJTZRv3Hh914T4C6Euw@mail.gmail.com>
	<CA+QeY+AmRSSJU8GKJ1uNVe2ps8GTONXpioYR7gEz91_prKR6sw@mail.gmail.com>
	<[email protected]>
	<CA+QeY+AwWV=g8z1eDE7OGHxKBer18gVER7nD+SnfhqLmiL7NOA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAExHW5sUU1+Vq9oG6x=u85W7Obr-dPfAc7rQtaN-LJw5uufuMA@mail.gmail.com>
	<CA+QeY+DLsjj4RxY-Z2wRC-sYCeFxuPSd8zSkt4Nvs==YV6GP6w@mail.gmail.com>
	<CAExHW5tDxjXKcP0XuUZXb_UbpAs_oQ29HyOtvL0xf7dCkk5ypw@mail.gmail.com>
	<CA+QeY+A2B7mjWc8kkieNS_nxNyTeZpTzi4Ue+vtHzkmfFxO-8Q@mail.gmail.com>
	<CAExHW5sFtNduYEnyGhQOb=xiLrbkvwE0merhpZA5pNqcNzBbmw@mail.gmail.com>
	<CA+QeY+C38cgqesZi1Tt92=D6rw_RzCV5yKhoDe5eFLtQ9RQB_Q@mail.gmail.com>
	<[email protected]>
	<CABPTF7WfTDdhZbHBgoqo=uGP_DZO-xO1bX5BfkwjgHK9jU+tOA@mail.gmail.com>
	<[email protected]>

Thank you for the feedback, Tomas! I agree with the goal you outlined,
providing a
user-friendly “how much space is left” signal would make monitoring far more
actionable.

On Sat, Oct 18, 2025 at 6:18 AM Tomas Vondra <[email protected]> wrote:
>
> Knowing num_mxids / num_members or members_size is nice, but how would
> I judge how far the system is from hitting some threshold or hard limit?
> Is there some maximum number of mxids/members that we could return? Or
> something like that?

Based on this, I experimented with calculating a num_remaining_members value to
estimate how close the system is to MultiXact member-space exhaustion. I tested
two approaches and validated their behavior through repeated exhaustion cycles.
The results are below.

At the same time, both you and Ashutosh pointed out that oldest_offset exposes
internal implementation details and is not particularly useful on its own, so I
removed oldest_offset in v11.

WHAT I TRIED in regards to space remaining
==========================================

Approach 1: (offsetStopLimit - nextOffset)
------------------------------------------
I exposed offsetStopLimit from GetMultiXactInfo() and computed:

    remainingMembers = offsetStopLimit - nextOffset;

Behavior at exhaustion:

    postgres=# SELECT num_mxids,num_members,remaining_members
               FROM pg_get_multixact_stats();
     num_mxids | num_members | remaining_members
    -----------+-------------+-------------------
     115409471 | 4294914940  |                 1
    (1 row)

After wraparound cleanup:

    postgres=# SELECT num_mxids,num_members,remaining_members
               FROM pg_get_multixact_stats();
     num_mxids | num_members | remaining_members
    -----------+-------------+-------------------
             0 |           0 |                 0
    (1 row)

The value stayed at 0 until roughly ~100k new members were allocated. My reading
is that nextOffset wraps to a small value, while offsetStopLimit remains large
(derived from the oldestOffset at the moment of truncation). Without using the
backend’s wrap-aware comparison logic (MultiXactOffsetWouldWrap()), plain
subtraction crosses the wrap boundary and becomes misleading.

Approach 2: (MaxMultiXactOffset - members)
------------------------------------------
I also tested:

    remainingMembers = MaxMultiXactOffset - members;

Across three exhaustion cycles:

1st attempt:

    postgres=# SELECT num_mxids,num_members,remaining_members
             FROM pg_get_multixact_stats();
     num_mxids | num_members | remaining_members
    -----------+-------------+-------------------
     125098473 | 4294914940  |             52355
    (1 row)

2nd attempt:

    postgres=# SELECT num_mxids,num_members,remaining_members
               FROM pg_get_multixact_stats();
     num_mxids | num_members | remaining_members
    -----------+-------------+-------------------
     116285530 | 4294905729  |             61566
    (1 row)

3rd attempt:

    postgres=# SELECT num_mxids,num_members,remaining_members
               FROM pg_get_multixact_stats();
     num_mxids | num_members | remaining_members
    -----------+-------------+-------------------
     111973488 | 4294862592  |            104703
    (1 row)

The system correctly rejected inserts in each cycle, but the computed
“remaining”
value increased between cycles. This seems to match the dynamic nature of
offsetStopLimit, which appears to be recomputed after truncation:
- based on the new oldestOffset
- aligned back to the start of its segment
- with one safety segment subtracted

Because the stop boundary shifts depending on segment boundaries, the plain
(Max − members) formula reflects alignment effects rather than actual remaining
capacity.

Understanding
============
Based on reading the relevant parts of multixact.c and observing the runtime
behavior, both approaches seem to run into limitations when trying to derive a
“remaining members” value outside the backend. I may be missing details, but the
behavior I observed suggests that a reliable computation might require
duplicating
several internal mechanisms, including:
- wrap-aware offset comparison
- SLRU page and segment alignment rules
- SetOffsetVacuumLimit’s segment recalculation

Without accounting for those, the derived numbers behaved inconsistently across
tests, sometimes staying at 0 until a large jump, and in other cases increasing
between exhaustion cycles. This seems broadly consistent with your concern that
simple arithmetic on these counters does not match how the backend determines
wraparound risk.

To be clear, this interpretation is based only on what I could infer from the
code and testing, and I may not be capturing the entire picture. But from what I
observed, a user-visible “remaining members” metric does not seem
straightforward
without exposing or replicating backend logic.

My thoughts
==========
Given all this, the cleanest approach appears to be not exposing a “remaining
members” counter directly.
PostgreSQL has historically avoided exposing remaining-capacity counters for
wraparound-limited resources such as:
- transaction IDs
- MultiXact IDs
- OIDs

Instead, PostgreSQL exposes current usage and relies on documented
thresholds for
monitoring. Following that established pattern avoids tying a SQL-visible
interface to backend internals that may evolve over time.

Self-monitoring based on documented limits
==========================================
Monitoring then follows the same pattern PostgreSQL already uses for XIDs and
other wraparound-limited values:
- track num_members growth over time
- warn when it exceeds roughly 2^31
- treat values approaching 2^32 as exhaustion-risk territory
- observe the growth rate to estimate when intervention may be needed

This keeps the interface simple, stable, and aligned with existing PostgreSQL
behavior.

Why oldest_offset was removed
=============================
Both you and Ashutosh pointed out that oldest_offset reflects internal SLRU
geometry and is not actionable without reproducing backend logic. Combined with
the behavior seen in the experiments above, it made sense not to expose this
field in the user-visible API. It is removed in v11.

Final shape of the function (v11)
=================================
The function now returns:
- num_mxids
- num_members
- members_size
- oldest_multixact

These fields are stable, directly interpretable, and do not depend on SLRU
internals or wrap-aware arithmetic.

On Thu, Oct 16, 2025 at 9:10 PM Xuneng Zhou <[email protected]> wrote:
> Here’s the updated v10 patch, now including access/htup_details.h in
> src/backend/utils/adt/multixactfuncs.c.

Thank you!

On Thu, Oct 16, 2025 at 7:28 PM torikoshia <[email protected]> wrote:
>
> Could you please update the patch to fix this?

Thank you for raising it and bringing it to attention!

Attached is the v11.

Best regards,
Naga


Attachments:

  [application/octet-stream] v11-0001-Add-pg_get_multixact_stats-function-for-monitori.patch (17.6K, 2-v11-0001-Add-pg_get_multixact_stats-function-for-monitori.patch)
  download | inline diff:
From 0b0c60f7abc7bb2913fb7e8b4a6286723a0caf74 Mon Sep 17 00:00:00 2001
From: Naga Appani <[email protected]>
Date: Thu, 23 Oct 2025 22:11:03 +0000
Subject: [PATCH v11] Add pg_get_multixact_stats() function for monitoring
 MultiXact usage

Expose multixact state via a new SQL-callable function pg_get_multixact_stats(),
returning:
- num_mxids          : number of MultiXact IDs in use
- num_members        : number of member entries in use
- members_size       : bytes used by num_members in pg_multixact/members directory
- oldest_multixact   : oldest MultiXact ID still needed

This patch adds pg_get_multixact_stats() function
   - SQL-callable interface to GetMultiXactInfo()
   - Returns NULLs if MultiXact system not initialized
   - Includes isolation tests for monitoring invariants

Documentation updates:
- func-info.sgml: add function entry
- maintenance.sgml: mention monitoring multixact usage

Build and catalog:
- Add function to existing multixactfuncs.c
- pg_proc.dat entry

Author: Naga Appani <[email protected]>
Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CA%2BQeY%2BAAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg%40mail.gmail.com
---
 doc/src/sgml/func/func-info.sgml              |  33 +++++
 doc/src/sgml/maintenance.sgml                 |  58 ++++++++-
 src/backend/utils/adt/multixactfuncs.c        |  46 +++++++
 src/include/catalog/pg_proc.dat               |  10 ++
 .../isolation/expected/multixact_stats.out    |  89 ++++++++++++++
 src/test/isolation/isolation_schedule         |   1 +
 src/test/isolation/specs/multixact_stats.spec | 113 ++++++++++++++++++
 7 files changed, 344 insertions(+), 6 deletions(-)
 create mode 100644 src/test/isolation/expected/multixact_stats.out
 create mode 100644 src/test/isolation/specs/multixact_stats.spec

diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index c393832d94c..0d756186197 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -2975,6 +2975,39 @@ acl      | {postgres=arwdDxtm/postgres,foo=r/postgres}
         modify key columns.
        </para></entry>
       </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_get_multixact_stats</primary>
+        </indexterm>
+        <function>pg_get_multixact_stats</function> ()
+        <returnvalue>record</returnvalue>
+        ( <parameter>num_mxids</parameter> <type>integer</type>,
+          <parameter>num_members</parameter> <type>bigint</type>,
+          <parameter>members_size</parameter> <type>bigint</type>,
+          <parameter>oldest_multixact</parameter> <type>xid</type> )
+       </para>
+       <para>
+        Returns statistics about current multixact usage:
+        <literal>num_mxids</literal> is the total number of multixact IDs currently present in the system,
+        <literal>num_members</literal> is the total number of multixact member entries currently
+        present in the system,
+        <literal>members_size</literal> is the storage occupied by <literal>num_members</literal>
+        in the <literal>pg_multixact/members</literal> directory,
+        <literal>oldest_multixact</literal> is the oldest multixact ID still in use.
+       </para>
+       <para>
+        The function reports statistics at the time it is invoked. Values may vary between calls,
+        even within a single transaction.
+       </para>
+       <para>
+        Returns <literal>NULL</literal> when multixact statistics are unavailable,
+        such as during startup before multixact initialization completes.
+        Specifically, this occurs when the oldest multixact offset
+        corresponding to a multixact referenced by a relation is not known.
+       </para></entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index dc59c88319e..4d7e172450a 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -813,14 +813,60 @@ HINT:  Execute a database-wide VACUUM in that database.
     <para>
      As a safety device, an aggressive vacuum scan will
      occur for any table whose multixact-age is greater than <xref
-     linkend="guc-autovacuum-multixact-freeze-max-age"/>.  Also, if the
-     storage occupied by multixacts members exceeds about 10GB, aggressive vacuum
-     scans will occur more often for all tables, starting with those that
-     have the oldest multixact-age.  Both of these kinds of aggressive
-     scans will occur even if autovacuum is nominally disabled. The members storage
-     area can grow up to about 20GB before reaching wraparound.
+     linkend="guc-autovacuum-multixact-freeze-max-age"/>. Also, if the number
+     of multixact member entries created exceeds approximately 2^31 entries
+     (occupying roughly 10GB in the <literal>pg_multixact/members</literal> directory),
+     aggressive vacuum scans will occur more often for all tables, starting with those that
+     have the oldest multixact-age. Both of these kinds of aggressive
+     scans will occur even if autovacuum is nominally disabled. The members can grow
+     up to approximately 2^32 entries (occupying roughly 20GB in the
+     <literal>pg_multixact/members</literal> directory) before reaching wraparound.
     </para>
 
+    <para>
+     The <function>pg_get_multixact_stats()</function> function described in
+     <xref linkend="functions-pg-snapshot"/> provides a way to monitor
+     multixact allocation and usage patterns in real time, for example:
+     <programlisting>
+postgres=# SELECT *,pg_size_pretty(members_size) members_size_pretty FROM pg_catalog.pg_get_multixact_stats();
+ num_mxids | num_members | members_size | oldest_multixact | members_size_pretty
+-----------+-------------+--------------+------------------+---------------------
+ 311740299 |  2785241176 |  13926205880 |                2 | 13 GB
+(1 row)
+     </programlisting>
+     This output shows a system with significant multixact activity: about ~312 million
+     multixact IDs and ~2.8 billion member entries consuming 13 GB of storage space.
+     By leveraging this information, the function helps:
+     <orderedlist>
+      <listitem>
+       <simpara>
+        Identify unusual multixact activity from concurrent row-level locks
+        or foreign key operations. For example, a spike in <literal>num_mxids</literal> might indicate
+        multiple sessions running <literal>UPDATE</literal> statements with foreign key checks,
+        concurrent <literal>SELECT FOR SHARE</literal> operations, or frequent use of savepoints
+        causing lock contention.
+       </simpara>
+      </listitem>
+      <listitem>
+       <simpara>
+        Track multixact cleanup efficiency by monitoring oldest_multixact.
+        If this value remains unchanged while <literal>num_members</literal> grows, it could indicate
+        that long-running transactions are preventing cleanup, or autovacuum is
+        not keeping up with the workload.
+       </simpara>
+      </listitem>
+      <listitem>
+       <simpara>
+        Detect potential performance impacts before they become critical.
+        For instance, high multixact usage from frequent row-level locking or
+        foreign key operations can lead to increased I/O and CPU overhead during
+        vacuum operations. Monitoring these stats helps tune autovacuum frequency
+        and transaction patterns.
+       </simpara>
+      </listitem>
+     </orderedlist>
+     </para>
+
     <para>
      Similar to the XID case, if autovacuum fails to clear old MXIDs from a table, the
      system will begin to emit warning messages when the database's oldest MXIDs reach forty
diff --git a/src/backend/utils/adt/multixactfuncs.c b/src/backend/utils/adt/multixactfuncs.c
index e74ea938348..286676d3829 100644
--- a/src/backend/utils/adt/multixactfuncs.c
+++ b/src/backend/utils/adt/multixactfuncs.c
@@ -15,6 +15,7 @@
 #include "postgres.h"
 
 #include "access/multixact.h"
+#include "access/htup_details.h"
 #include "funcapi.h"
 #include "utils/builtins.h"
 
@@ -85,3 +86,48 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
 
 	SRF_RETURN_DONE(funccxt);
 }
+
+/*
+ * pg_get_multixact_stats
+ *
+ * Returns statistics about current multixact usage.
+ *
+ * Returns NULL if the oldest referenced offset is unknown.
+ */
+Datum
+pg_get_multixact_stats(PG_FUNCTION_ARGS)
+{
+	TupleDesc	tupdesc;
+	Datum		values[4];
+	bool		nulls[4];
+	MultiXactOffset	members;
+	MultiXactId	oldestMultiXactId;
+	uint32		multixacts;
+	MultiXactOffset	oldestOffset;
+	int64		membersBytes;
+
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("return type must be a row type")));
+
+	if (GetMultiXactInfo(&multixacts, &members, &oldestMultiXactId, &oldestOffset))
+	{
+		/*
+		 * Calculate storage space for members. Members are stored in groups of 4,
+		 * with each group taking 20 bytes, resulting in 5 bytes per member.
+		 * Note: This ignores small page overhead (12 bytes per 8KB)
+		 */
+		membersBytes = (int64) members * 5;
+
+		values[0] = UInt32GetDatum(multixacts);
+		values[1] = UInt32GetDatum(members);
+		values[2] = Int64GetDatum(membersBytes);
+		values[3] = UInt32GetDatum(oldestMultiXactId);
+		memset(nulls, false, sizeof(nulls));
+
+		return HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls));
+	}
+
+	PG_RETURN_NULL();
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9121a382f76..928aa3cff72 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12604,4 +12604,14 @@
   proargnames => '{pid,io_id,io_generation,state,operation,off,length,target,handle_data_len,raw_result,result,target_desc,f_sync,f_localmem,f_buffered}',
   prosrc => 'pg_get_aios' },
 
+# Get multixact usage
+{ oid => '9001', descr => 'get current multixact usage statistics',
+  proname => 'pg_get_multixact_stats',
+  provolatile => 'v', proparallel => 's', prorettype => 'record',
+  proargtypes => '',
+  proallargtypes => '{int8,int8,int8,xid}',
+  proargmodes => '{o,o,o,o}',
+  proargnames => '{num_mxids,num_members,members_size,oldest_multixact}',
+  prosrc => 'pg_get_multixact_stats'},
+
 ]
diff --git a/src/test/isolation/expected/multixact_stats.out b/src/test/isolation/expected/multixact_stats.out
new file mode 100644
index 00000000000..27a6510c4ad
--- /dev/null
+++ b/src/test/isolation/expected/multixact_stats.out
@@ -0,0 +1,89 @@
+Parsed test spec with 2 sessions
+
+starting permutation: snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned s1_commit s2_commit
+step snap0: 
+  CREATE TEMP TABLE snap0 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+
+step s1_begin: BEGIN;
+step s1_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE;
+?column?
+--------
+       1
+(1 row)
+
+step snap1: 
+  CREATE TEMP TABLE snap1 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+
+step s2_begin: BEGIN;
+step s2_lock: SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE;
+?column?
+--------
+       1
+(1 row)
+
+step snap2: 
+  CREATE TEMP TABLE snap2 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+
+step check_while_pinned: 
+  SELECT r.assertion, r.ok
+  FROM snap0 s0
+  JOIN snap1 s1 ON TRUE
+  JOIN snap2 s2 ON TRUE,
+  LATERAL unnest(
+    ARRAY[
+      'is_init_mxids',
+      'is_init_members',
+      'is_init_oldest_mxid',
+      'is_init_oldest_off',
+      'is_oldest_mxid_nondec_01',
+      'is_oldest_mxid_nondec_12',
+      'is_oldest_off_nondec_01',
+      'is_oldest_off_nondec_12',
+      'is_members_increased_ge1',
+      'is_mxids_nondec_01',
+      'is_mxids_nondec_12',
+      'is_members_nondec_01',
+      'is_members_nondec_12'
+    ],
+    ARRAY[
+      (s2.num_mxids        IS NOT NULL),
+      (s2.num_members      IS NOT NULL),
+      (s2.oldest_multixact IS NOT NULL),
+
+      (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)),
+      (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)),
+
+      (s2.num_members >= COALESCE(s1.num_members, 0) + 1),
+
+      (s1.num_mxids   >= COALESCE(s0.num_mxids,   0)),
+      (s2.num_mxids   >= COALESCE(s1.num_mxids,   0)),
+      (s1.num_members >= COALESCE(s0.num_members, 0)),
+      (s2.num_members >= COALESCE(s1.num_members, 0))
+    ]
+  ) AS r(assertion, ok);
+
+assertion               |ok
+------------------------+--
+is_init_mxids           |t 
+is_init_members         |t 
+is_init_oldest_mxid     |t 
+is_init_oldest_off      |t 
+is_oldest_mxid_nondec_01|t 
+is_oldest_mxid_nondec_12|t 
+is_oldest_off_nondec_01 |t 
+is_oldest_off_nondec_12 |t 
+is_members_increased_ge1|t 
+is_mxids_nondec_01      |t 
+is_mxids_nondec_12      |  
+is_members_nondec_01    |  
+is_members_nondec_12    |  
+(13 rows)
+
+step s1_commit: COMMIT;
+step s2_commit: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 5afae33d370..bab8a8eaf31 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -120,3 +120,4 @@ test: serializable-parallel-2
 test: serializable-parallel-3
 test: matview-write-skew
 test: lock-nowait
+test: multixact_stats
diff --git a/src/test/isolation/specs/multixact_stats.spec b/src/test/isolation/specs/multixact_stats.spec
new file mode 100644
index 00000000000..7b034654504
--- /dev/null
+++ b/src/test/isolation/specs/multixact_stats.spec
@@ -0,0 +1,113 @@
+# Test invariants for pg_get_multixact_stats()
+# We create exactly one fresh MultiXact on a brand-new table. While it is pinned
+# by two open transactions, we assert only invariants that background VACUUM/FREEZE
+# cannot violate:
+#   • members increased by ≥ 1 when the second session locked the row,
+#   • num_mxids / num_members did not decrease vs earlier snapshots,
+#   • oldest_* never decreases.
+# We make NO assertions after releasing locks (freezing/truncation may shrink deltas).
+#
+# Terminology (global counters):
+#   num_mxids, num_members    : "in-use" deltas derived from global horizons
+#   oldest_multixact, offset  : oldest horizons; they move forward, never backward
+#
+# All assertions execute while our multixact is pinned by open txns, which protects
+# the truncation horizon (VACUUM can't advance past our pinned multi).
+
+setup
+{
+    CREATE TABLE mxq(id int PRIMARY KEY, v int);
+    INSERT INTO mxq VALUES (1, 42);
+}
+
+teardown
+{
+    DROP TABLE mxq;
+}
+
+# Two sessions that lock on the same tuple -> one MultiXact with >= 2 members.
+session "s1"
+setup { SET client_min_messages = warning; SET lock_timeout = '5s'; }
+step s1_begin  { BEGIN; }
+step s1_lock   { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; }
+step s1_commit { COMMIT; }
+
+session "s2"
+setup { SET client_min_messages = warning; SET lock_timeout = '5s'; }
+step s2_begin  { BEGIN; }
+step s2_lock   { SELECT 1 FROM mxq WHERE id=1 FOR KEY SHARE; }
+step s2_commit { COMMIT; }
+
+# Baseline BEFORE any locking; may be NULLs if multixact isn't initialized yet.
+step snap0 {
+  CREATE TEMP TABLE snap0 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+}
+
+# After s1 has locked the row.
+step snap1 {
+  CREATE TEMP TABLE snap1 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+}
+
+# After s2 joins on the SAME tuple -> multixact with >= 2 members.
+step snap2 {
+  CREATE TEMP TABLE snap2 AS
+  SELECT num_mxids, num_members, oldest_multixact
+  FROM pg_get_multixact_stats();
+}
+
+# Pretty, deterministic key/value output of boolean checks.
+# Keys:
+#   is_init_mxids            : num_mxids is non-NULL
+#   is_init_members          : num_members is non-NULL
+#   is_init_oldest_mxid      : oldest_multixact is non-NULL
+#   is_oldest_mxid_nondec_01 : oldest_multixact did not decrease (snap0→snap1)
+#   is_oldest_mxid_nondec_12 : oldest_multixact did not decrease (snap1→snap2)
+#   is_members_increased_ge1 : members increased by at least 1 when s2 joined
+#   is_mxids_nondec_01       : num_mxids did not decrease (snap0→snap1)
+#   is_mxids_nondec_12       : num_mxids did not decrease (snap1→snap2)
+#   is_members_nondec_01     : num_members did not decrease (snap0→snap1)
+#   is_members_nondec_12     : num_members did not decrease (snap1→snap2)
+step check_while_pinned {
+  SELECT r.assertion, r.ok
+  FROM snap0 s0
+  JOIN snap1 s1 ON TRUE
+  JOIN snap2 s2 ON TRUE,
+  LATERAL unnest(
+    ARRAY[
+      'is_init_mxids',
+      'is_init_members',
+      'is_init_oldest_mxid',
+      'is_init_oldest_off',
+      'is_oldest_mxid_nondec_01',
+      'is_oldest_mxid_nondec_12',
+      'is_oldest_off_nondec_01',
+      'is_oldest_off_nondec_12',
+      'is_members_increased_ge1',
+      'is_mxids_nondec_01',
+      'is_mxids_nondec_12',
+      'is_members_nondec_01',
+      'is_members_nondec_12'
+    ],
+    ARRAY[
+      (s2.num_mxids        IS NOT NULL),
+      (s2.num_members      IS NOT NULL),
+      (s2.oldest_multixact IS NOT NULL),
+
+      (s1.oldest_multixact::text::bigint >= COALESCE(s0.oldest_multixact::text::bigint, 0)),
+      (s2.oldest_multixact::text::bigint >= COALESCE(s1.oldest_multixact::text::bigint, 0)),
+
+      (s2.num_members >= COALESCE(s1.num_members, 0) + 1),
+
+      (s1.num_mxids   >= COALESCE(s0.num_mxids,   0)),
+      (s2.num_mxids   >= COALESCE(s1.num_mxids,   0)),
+      (s1.num_members >= COALESCE(s0.num_members, 0)),
+      (s2.num_members >= COALESCE(s1.num_members, 0))
+    ]
+  ) AS r(assertion, ok);
+}
+
+permutation snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_while_pinned s1_commit s2_commit
-- 
2.47.3



view thread (42+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring
  In-Reply-To: <CA+QeY+Aja_=j1EuY87L06KaPO4EJqqkS4B+Vg9AsWnGM1d_VRA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox