Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1va4IN-003AKe-2o for pgsql-hackers@arkaria.postgresql.org; Mon, 29 Dec 2025 03:51:49 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1va4IM-00FPYL-1F for pgsql-hackers@arkaria.postgresql.org; Mon, 29 Dec 2025 03:51:47 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1va4IL-00FPYC-0Q for pgsql-hackers@lists.postgresql.org; Mon, 29 Dec 2025 03:51:46 +0000 Received: from fhigh-a8-smtp.messagingengine.com ([103.168.172.159]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1va4II-0037M0-3A for pgsql-hackers@postgresql.org; Mon, 29 Dec 2025 03:51:44 +0000 Received: from phl-compute-05.internal (phl-compute-05.internal [10.202.2.45]) by mailfhigh.phl.internal (Postfix) with ESMTP id A08E71400026; Sun, 28 Dec 2025 22:51:40 -0500 (EST) Received: from phl-frontend-04 ([10.202.2.163]) by phl-compute-05.internal (MEProxy); Sun, 28 Dec 2025 22:51:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=paquier.xyz; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1766980300; x=1767066700; bh=XB+jjTPerp 9VSjLJhwc33LomrXyKE4kn4IVfGXw6NOE=; b=ijfmT5CFex0LpfXdF1zPM1PSNI F9xuWOdTN27v2hldJRb4VdhDCuC1xykJ8YDszZt0qB4VhGkgTGo41HN9dCunuPB8 Zg9YFRU+MFmA05XfMMP377k4Cs00X9sPQQXsnykhTR9HmdmWABpLBuc10trM9GDK kQdW8+gFfx4ZaLEYUUIpyE6xQXs0ydUrDW6lJcn7uvtLopCqqO+M+Uui7z1+aMCQ 1tm8by5TtXSc4APu32QfxjRtoxWLgW3feQNPBBzm/duOO3wkKPFSPogF6+51Imt3 VR33ayFQijT9Luvm/eq3HwscasGolN32G2XhdhlkZ1AwjL7TPGuoF01BmLdA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1766980300; x=1767066700; bh=XB+jjTPerp9VSjLJhwc33LomrXyKE4kn4IV fGXw6NOE=; b=wvAnfLARlJ0qC+J3F+UAO4dGKLHZTzkgltGgZilgaxcRaA7eFy6 pFa+IOh7wQW3N/I3aAk32KSqmjyqhX16ObS8Z4lzSDUAkCCJoviw61MqzaGJHFWj Seu0Jsrd/BtK7gZA426K6uOF01Gy8UxCGabnJnlw8axWKK38NpiX1TPemoZ/ksIq lozwbGAbkyJKEMLmF9bRdB4v3prQ9qZp24az2AG93eF1vwxO8Z89WkNazVPyI6k7 HXRUG4JVvvORPo7Q1fGu9Tb/2eVKfiiPAsb+GcZaH8xkhQpGtu+ClNl1uU+85cv8 r3sOL3CbLxixAIDVBWmExPS4s/p0SCf5uwA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgdejieduudcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenfghrlh cuvffnffculdejtddmnecujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvden ucfhrhhomhepofhitghhrggvlhcurfgrqhhuihgvrhcuoehmihgthhgrvghlsehprghquh hivghrrdighiiiqeenucggtffrrghtthgvrhhnpeetleeifedufffhhfdtteelgeeggeff hfekueevteeigfduudevudetgfegiedvjeenucevlhhushhtvghrufhiiigvpedtnecurf grrhgrmhepmhgrihhlfhhrohhmpehmihgthhgrvghlsehprghquhhivghrrdighiiipdhn sggprhgtphhtthhopeejpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopehnrghgnh hrihhksehgmhgrihhlrdgtohhmpdhrtghpthhtoheprghshhhuthhoshhhrdgsrghprght rdhoshhssehgmhgrihhlrdgtohhmpdhrtghpthhtohepthhomhgrshesvhhonhgurhgrrd hmvgdprhgtphhtthhopeiguhhnvghnghiihhhouhesghhmrghilhdrtghomhdprhgtphht thhopehtohhrihhkohhshhhirgesohhsshdrnhhtthgurghtrgdrtghomhdprhgtphhtth hopehrvghshhhkvghkihhrihhllhesghhmrghilhdrtghomhdprhgtphhtthhopehpghhs qhhlqdhhrggtkhgvrhhssehpohhsthhgrhgvshhqlhdrohhrgh X-ME-Proxy: Feedback-ID: i0fe9450f:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 28 Dec 2025 22:51:36 -0500 (EST) Date: Mon, 29 Dec 2025 12:51:23 +0900 From: Michael Paquier To: Naga Appani Cc: Ashutosh Bapat , Tomas Vondra , Xuneng Zhou , torikoshia , Kirill Reshke , pgsql-hackers@postgresql.org Subject: Re: [Proposal] Expose internal MultiXact member count function for efficient monitoring Message-ID: References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="cWl8XzXwS661QL5q" Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --cWl8XzXwS661QL5q Content-Type: multipart/mixed; boundary="jFtgfJ2gtN0acEpO" Content-Disposition: inline --jFtgfJ2gtN0acEpO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Dec 25, 2025 at 10:30:37AM +0900, Michael Paquier wrote: > I am wondering if it would not be cleaner and less confusing to do > things slightly differently (sorry I did not pay much attention to > that previously): > - Change GetMultiXactInfo() to return two offsets, nextOffset and > oldestOffset. > - Use uint64 for members and recalculate the difference in > MultiXactMemberFreezeThreshold() and the function code. Heikki has > just switched multixact offsets to be 64 bits, yippee. > - Redefine MultiXactMemberStorageSize() so as it does not take a > number of members in input, but as the amount of space taken between > two offsets. At least that would be more consistent with all the > other inline functions of multixact.h that rely on MultiXactOffset > inputs. Using a int64 is still OK I guess, there may be a case to > detect "negative" numbers and give a change to the users of the new > inline function to notice that they did a computation wrong, rather > than hiding a signedness problem. So, here is what I have in mind, split into independent pieces: - Remove the existing type confusion with GetMultiXactInfo(), due to how things have always been done in MultiXactMemberFreezeThreshold(). - Add macro MultiXactOffsetStorageSize(), to calculate the amount of space used between two offsets. - The main patch, with adjustments in comments, the test (no non-ASCII characters in that, please). One thing that was really surprising is that you did not consider ROLE_PG_READ_ALL_STATS. We expect all the stats information to be hidden if a role is not granted access to them, and this function should be no exception especially as it relates to disk space usage like database or tablespace size functions. Anyway, attached are all these updated pieces. The doc edits are what I have mentioned upthread, close to what you have suggested to me offline. Comments? -- Michael --jFtgfJ2gtN0acEpO Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="v14-0001-Rework-GetMultiXactInfo.patch" Content-Transfer-Encoding: quoted-printable =46rom dbe13b3d61c03f5d8d6773021e87c626ab04b3b0 Mon Sep 17 00:00:00 2001 =46rom: Michael Paquier Date: Mon, 29 Dec 2025 11:39:56 +0900 Subject: [PATCH v14 1/3] Rework GetMultiXactInfo() This routine returned a number of offsets as a MultiXactOffset, but it is not actually an offset, just a number to define their range. This was confusing. This type confusion comes from the original implementation of MultiXactMemberFreezeThreshold(). --- src/include/access/multixact.h | 2 +- src/backend/access/transam/multixact.c | 16 ++++++++-------- 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/include/access/multixact.h b/src/include/access/multixact.h index 6433fe163641..d22abbb72512 100644 --- a/src/include/access/multixact.h +++ b/src/include/access/multixact.h @@ -109,7 +109,7 @@ extern bool MultiXactIdIsRunning(MultiXactId multi, boo= l isLockOnly); extern void MultiXactIdSetOldestMember(void); extern int GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **memb= ers, bool from_pgupgrade, bool isLockOnly); -extern void GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *members, +extern void GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *nextOffs= et, MultiXactId *oldestMultiXactId, MultiXactOffset *oldestOffset); extern bool MultiXactIdPrecedes(MultiXactId multi1, MultiXactId multi2); diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/tr= ansam/multixact.c index 34956a5a6634..0d6f594e2a06 100644 --- a/src/backend/access/transam/multixact.c +++ b/src/backend/access/transam/multixact.c @@ -2461,25 +2461,23 @@ find_multixact_start(MultiXactId multi, MultiXactOf= fset *result) * * Returns information about the current MultiXact state, as of: * multixacts: Number of MultiXacts (nextMultiXactId - oldestMultiXactId) - * members: Number of member entries (nextOffset - oldestOffset) + * nextOffset: Next-to-be-assigned offset * oldestMultiXactId: Oldest MultiXact ID still in use * oldestOffset: Oldest offset still in use */ void -GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *members, +GetMultiXactInfo(uint32 *multixacts, MultiXactOffset *nextOffset, MultiXactId *oldestMultiXactId, MultiXactOffset *oldestOffset) { - MultiXactOffset nextOffset; MultiXactId nextMultiXactId; =20 LWLockAcquire(MultiXactGenLock, LW_SHARED); - nextOffset =3D MultiXactState->nextOffset; + *nextOffset =3D MultiXactState->nextOffset; *oldestMultiXactId =3D MultiXactState->oldestMultiXactId; nextMultiXactId =3D MultiXactState->nextMXact; *oldestOffset =3D MultiXactState->oldestOffset; LWLockRelease(MultiXactGenLock); =20 - *members =3D nextOffset - *oldestOffset; *multixacts =3D nextMultiXactId - *oldestMultiXactId; } =20 @@ -2514,16 +2512,18 @@ GetMultiXactInfo(uint32 *multixacts, MultiXactOffse= t *members, int MultiXactMemberFreezeThreshold(void) { - MultiXactOffset members; uint32 multixacts; uint32 victim_multixacts; double fraction; int result; MultiXactId oldestMultiXactId; MultiXactOffset oldestOffset; + MultiXactOffset nextOffset; + uint64 members; =20 - /* Read the current offsets and members usage. */ - GetMultiXactInfo(&multixacts, &members, &oldestMultiXactId, &oldestOffset= ); + /* Read the current offsets and multixact usage. */ + GetMultiXactInfo(&multixacts, &nextOffset, &oldestMultiXactId, &oldestOff= set); + members =3D nextOffset - oldestOffset; =20 /* If member space utilization is low, no special action is required. */ if (members <=3D MULTIXACT_MEMBER_LOW_THRESHOLD) --=20 2.51.0 --jFtgfJ2gtN0acEpO Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="v14-0002-Add-MultiXactOffsetStorageSize.patch" Content-Transfer-Encoding: quoted-printable =46rom 4ef3b82b91eaee132cfed886f6a1a1455bb084a3 Mon Sep 17 00:00:00 2001 =46rom: Michael Paquier Date: Mon, 29 Dec 2025 11:41:35 +0900 Subject: [PATCH v14 2/3] Add MultiXactOffsetStorageSize() This calculates the amount of space taken by two multixact offsets, useful on its own to know the amount of space multixacts may use. This will be used by an upcoming patch. --- src/include/access/multixact_internal.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/include/access/multixact_internal.h b/src/include/access/m= ultixact_internal.h index f2d6539e8a67..65dc2be148db 100644 --- a/src/include/access/multixact_internal.h +++ b/src/include/access/multixact_internal.h @@ -121,4 +121,14 @@ MXOffsetToMemberOffset(MultiXactOffset offset) member_in_group * sizeof(TransactionId); } =20 +/* Storage space consumed by a range of offsets, in bytes */ +static inline int64 +MultiXactOffsetStorageSize(MultiXactOffset new_offset, + MultiXactOffset old_offset) +{ + Assert(new_offset >=3D old_offset); + return (int64) ((new_offset - old_offset) / MULTIXACT_MEMBERS_PER_MEMBERG= ROUP) * + MULTIXACT_MEMBERGROUP_SIZE; +} + #endif /* MULTIXACT_INTERNAL_H */ --=20 2.51.0 --jFtgfJ2gtN0acEpO Content-Type: text/x-diff; charset=utf-8 Content-Disposition: attachment; filename="v14-0003-Add-pg_get_multixact_stats-function-for-monitori.patch" Content-Transfer-Encoding: quoted-printable =46rom a34c851f3421d5023f896f28549e77fd6d760100 Mon Sep 17 00:00:00 2001 =46rom: Naga Appani Date: Wed, 24 Dec 2025 21:06:16 +0000 Subject: [PATCH v14 3/3] Add pg_get_multixact_stats() function for monitori= ng MultiXact usage Expose multixact state via a new SQL-callable function pg_get_multixact_sta= ts(), returning: - num_mxids : number of MultiXact IDs in use - num_members : number of member entries in use - members_size : bytes used by num_members in pg_multixact/members di= rectory - oldest_multixact : oldest MultiXact ID still needed This patch adds pg_get_multixact_stats() function - SQL-callable interface to GetMultiXactInfo() - Includes isolation tests for monitoring invariants Documentation updates: - func-info.sgml: add function entry - maintenance.sgml: mention monitoring multixact usage Build and catalog: - Add function to existing multixactfuncs.c - pg_proc.dat entry Author: Naga Appani Reviewed-by: Ashutosh Bapat Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/flat/CA%2BQeY%2BAAsYK6WvB= W4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg%40mail.gmail.com --- src/include/catalog/pg_proc.dat | 10 ++ src/backend/utils/adt/multixactfuncs.c | 53 +++++++++ .../isolation/expected/multixact-stats.out | 89 ++++++++++++++ src/test/isolation/isolation_schedule | 1 + src/test/isolation/specs/multixact-stats.spec | 111 ++++++++++++++++++ src/test/regress/expected/misc_functions.out | 29 +++++ src/test/regress/sql/misc_functions.sql | 15 +++ doc/src/sgml/func/func-info.sgml | 33 ++++++ doc/src/sgml/maintenance.sgml | 39 +++++- 9 files changed, 375 insertions(+), 5 deletions(-) create mode 100644 src/test/isolation/expected/multixact-stats.out create mode 100644 src/test/isolation/specs/multixact-stats.spec diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.= dat index fd9448ec7b98..6caea6c8281e 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -12612,4 +12612,14 @@ proargnames =3D> '{pid,io_id,io_generation,state,operation,off,length,ta= rget,handle_data_len,raw_result,result,target_desc,f_sync,f_localmem,f_buff= ered}', prosrc =3D> 'pg_get_aios' }, =20 +# Get multixact usage +{ oid =3D> '9001', descr =3D> 'get current multixact usage statistics', + proname =3D> 'pg_get_multixact_stats', + provolatile =3D> 'v', proparallel =3D> 's', prorettype =3D> 'record', + proargtypes =3D> '', + proallargtypes =3D> '{int8,int8,int8,xid}', + proargmodes =3D> '{o,o,o,o}', + proargnames =3D> '{num_mxids,num_members,members_size,oldest_multixact}', + prosrc =3D> 'pg_get_multixact_stats'}, + ] diff --git a/src/backend/utils/adt/multixactfuncs.c b/src/backend/utils/adt= /multixactfuncs.c index a428e140bc4b..b39db200a391 100644 --- a/src/backend/utils/adt/multixactfuncs.c +++ b/src/backend/utils/adt/multixactfuncs.c @@ -15,7 +15,12 @@ #include "postgres.h" =20 #include "access/multixact.h" +#include "access/multixact_internal.h" +#include "access/htup_details.h" +#include "catalog/pg_authid_d.h" #include "funcapi.h" +#include "miscadmin.h" +#include "utils/acl.h" #include "utils/builtins.h" =20 /* @@ -85,3 +90,51 @@ pg_get_multixact_members(PG_FUNCTION_ARGS) =20 SRF_RETURN_DONE(funccxt); } + +/* + * pg_get_multixact_stats + * + * Returns statistics about current multixact usage. + */ +Datum +pg_get_multixact_stats(PG_FUNCTION_ARGS) +{ + TupleDesc tupdesc; + Datum values[4]; + bool nulls[4]; + uint64 members; + MultiXactId oldestMultiXactId; + uint32 multixacts; + MultiXactOffset oldestOffset; + MultiXactOffset nextOffset; + int64 membersBytes; + + if (get_call_result_type(fcinfo, NULL, &tupdesc) !=3D TYPEFUNC_COMPOSITE) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("return type must be a row type"))); + + GetMultiXactInfo(&multixacts, &nextOffset, &oldestMultiXactId, &oldestOff= set); + members =3D nextOffset - oldestOffset; + + membersBytes =3D MultiXactOffsetStorageSize(nextOffset, oldestOffset); + + if (!has_privs_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS)) + { + /* + * Only superusers and roles with privileges of pg_read_all_stats can + * see details. + */ + memset(nulls, true, sizeof(bool) * tupdesc->natts); + } + else + { + values[0] =3D UInt32GetDatum(multixacts); + values[1] =3D Int64GetDatum(members); + values[2] =3D Int64GetDatum(membersBytes); + values[3] =3D UInt32GetDatum(oldestMultiXactId); + memset(nulls, false, sizeof(nulls)); + } + + return HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)); +} diff --git a/src/test/isolation/expected/multixact-stats.out b/src/test/iso= lation/expected/multixact-stats.out new file mode 100644 index 000000000000..27a6510c4ad5 --- /dev/null +++ b/src/test/isolation/expected/multixact-stats.out @@ -0,0 +1,89 @@ +Parsed test spec with 2 sessions + +starting permutation: snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 = check_while_pinned s1_commit s2_commit +step snap0:=20 + CREATE TEMP TABLE snap0 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); + +step s1_begin: BEGIN; +step s1_lock: SELECT 1 FROM mxq WHERE id=3D1 FOR KEY SHARE; +?column? +-------- + 1 +(1 row) + +step snap1:=20 + CREATE TEMP TABLE snap1 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); + +step s2_begin: BEGIN; +step s2_lock: SELECT 1 FROM mxq WHERE id=3D1 FOR KEY SHARE; +?column? +-------- + 1 +(1 row) + +step snap2:=20 + CREATE TEMP TABLE snap2 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); + +step check_while_pinned:=20 + SELECT r.assertion, r.ok + FROM snap0 s0 + JOIN snap1 s1 ON TRUE + JOIN snap2 s2 ON TRUE, + LATERAL unnest( + ARRAY[ + 'is_init_mxids', + 'is_init_members', + 'is_init_oldest_mxid', + 'is_init_oldest_off', + 'is_oldest_mxid_nondec_01', + 'is_oldest_mxid_nondec_12', + 'is_oldest_off_nondec_01', + 'is_oldest_off_nondec_12', + 'is_members_increased_ge1', + 'is_mxids_nondec_01', + 'is_mxids_nondec_12', + 'is_members_nondec_01', + 'is_members_nondec_12' + ], + ARRAY[ + (s2.num_mxids IS NOT NULL), + (s2.num_members IS NOT NULL), + (s2.oldest_multixact IS NOT NULL), + + (s1.oldest_multixact::text::bigint >=3D COALESCE(s0.oldest_multixact= ::text::bigint, 0)), + (s2.oldest_multixact::text::bigint >=3D COALESCE(s1.oldest_multixact= ::text::bigint, 0)), + + (s2.num_members >=3D COALESCE(s1.num_members, 0) + 1), + + (s1.num_mxids >=3D COALESCE(s0.num_mxids, 0)), + (s2.num_mxids >=3D COALESCE(s1.num_mxids, 0)), + (s1.num_members >=3D COALESCE(s0.num_members, 0)), + (s2.num_members >=3D COALESCE(s1.num_members, 0)) + ] + ) AS r(assertion, ok); + +assertion |ok +------------------------+-- +is_init_mxids |t=20 +is_init_members |t=20 +is_init_oldest_mxid |t=20 +is_init_oldest_off |t=20 +is_oldest_mxid_nondec_01|t=20 +is_oldest_mxid_nondec_12|t=20 +is_oldest_off_nondec_01 |t=20 +is_oldest_off_nondec_12 |t=20 +is_members_increased_ge1|t=20 +is_mxids_nondec_01 |t=20 +is_mxids_nondec_12 | =20 +is_members_nondec_01 | =20 +is_members_nondec_12 | =20 +(13 rows) + +step s1_commit: COMMIT; +step s2_commit: COMMIT; diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/iso= lation_schedule index f2e067b1fbc5..01ff1c6586fe 100644 --- a/src/test/isolation/isolation_schedule +++ b/src/test/isolation/isolation_schedule @@ -63,6 +63,7 @@ test: delete-abort-savept-2 test: aborted-keyrevoke test: multixact-no-deadlock test: multixact-no-forget +test: multixact-stats test: lock-committed-update test: lock-committed-keyupdate test: update-locked-tuple diff --git a/src/test/isolation/specs/multixact-stats.spec b/src/test/isola= tion/specs/multixact-stats.spec new file mode 100644 index 000000000000..6c1dd94958d1 --- /dev/null +++ b/src/test/isolation/specs/multixact-stats.spec @@ -0,0 +1,111 @@ +# Test for pg_get_multixact_stats() +# +# We create exactly one fresh MultiXact on a brand-new table. While it is +# pinned by two open transactions, we check patterns of this function that +# VACUUM/FREEZE cannot violate: +# 1) "members" increased by =E2=89=A5 1 when the second session locked the= row, +# 2) (num_mxids / num_members) did not decrease compared to earlier snapsh= ots +# 3) "oldest_*" fields never decreases. +# +# This test does not do checks patterns after releasing locks, as freezing +# and/or truncation may shrink the multixact ranges calculated. + +setup +{ + CREATE TABLE mxq(id int PRIMARY KEY, v int); + INSERT INTO mxq VALUES (1, 42); +} + +teardown +{ + DROP TABLE mxq; +} + +# Two sessions that lock the same tuple, leading to one MultiXact with +# at least 2 members. +session "s1" +setup { SET client_min_messages =3D warning; SET lock_timeout =3D '5s'; } +step s1_begin { BEGIN; } +step s1_lock { SELECT 1 FROM mxq WHERE id=3D1 FOR KEY SHARE; } +step s1_commit { COMMIT; } + +session "s2" +setup { SET client_min_messages =3D warning; SET lock_timeout =3D '5s'; } +step s2_begin { BEGIN; } +step s2_lock { SELECT 1 FROM mxq WHERE id=3D1 FOR KEY SHARE; } +step s2_commit { COMMIT; } + +# Save multixact state *BEFORE* any locking; some of these may be NULLs if +# multixacts have not initialized yet. +step snap0 { + CREATE TEMP TABLE snap0 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); +} + +# Save multixact state after s1 has locked the row. +step snap1 { + CREATE TEMP TABLE snap1 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); +} + +# Save multixact state after s2 joins to lock the same row, leading to +# a multixact with at least 2 members. +step snap2 { + CREATE TEMP TABLE snap2 AS + SELECT num_mxids, num_members, oldest_multixact + FROM pg_get_multixact_stats(); +} + +# Pretty, deterministic key/value outputs based of boolean checks: +# is_init_mxids : num_mxids is non-NULL +# is_init_members : num_members is non-NULL +# is_init_oldest_mxid : oldest_multixact is non-NULL +# is_oldest_mxid_nondec_01 : oldest_multixact did not decrease (snap0->s= nap1) +# is_oldest_mxid_nondec_12 : oldest_multixact did not decrease (snap1->s= nap2) +# is_members_increased_ge1 : members increased by at least 1 when s2 joi= ned +# is_mxids_nondec_01 : num_mxids did not decrease (snap0->snap1) +# is_mxids_nondec_12 : num_mxids did not decrease (snap1->snap2) +# is_members_nondec_01 : num_members did not decrease (snap0->snap1) +# is_members_nondec_12 : num_members did not decrease (snap1->snap2) +step check_while_pinned { + SELECT r.assertion, r.ok + FROM snap0 s0 + JOIN snap1 s1 ON TRUE + JOIN snap2 s2 ON TRUE, + LATERAL unnest( + ARRAY[ + 'is_init_mxids', + 'is_init_members', + 'is_init_oldest_mxid', + 'is_init_oldest_off', + 'is_oldest_mxid_nondec_01', + 'is_oldest_mxid_nondec_12', + 'is_oldest_off_nondec_01', + 'is_oldest_off_nondec_12', + 'is_members_increased_ge1', + 'is_mxids_nondec_01', + 'is_mxids_nondec_12', + 'is_members_nondec_01', + 'is_members_nondec_12' + ], + ARRAY[ + (s2.num_mxids IS NOT NULL), + (s2.num_members IS NOT NULL), + (s2.oldest_multixact IS NOT NULL), + + (s1.oldest_multixact::text::bigint >=3D COALESCE(s0.oldest_multixact= ::text::bigint, 0)), + (s2.oldest_multixact::text::bigint >=3D COALESCE(s1.oldest_multixact= ::text::bigint, 0)), + + (s2.num_members >=3D COALESCE(s1.num_members, 0) + 1), + + (s1.num_mxids >=3D COALESCE(s0.num_mxids, 0)), + (s2.num_mxids >=3D COALESCE(s1.num_mxids, 0)), + (s1.num_members >=3D COALESCE(s0.num_members, 0)), + (s2.num_members >=3D COALESCE(s1.num_members, 0)) + ] + ) AS r(assertion, ok); +} + +permutation snap0 s1_begin s1_lock snap1 s2_begin s2_lock snap2 check_whil= e_pinned s1_commit s2_commit diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regres= s/expected/misc_functions.out index d7d965d884a1..6c03b1a79d75 100644 --- a/src/test/regress/expected/misc_functions.out +++ b/src/test/regress/expected/misc_functions.out @@ -999,3 +999,32 @@ SELECT test_relpath(); SELECT pg_replication_origin_create('regress_' || repeat('a', 505)); ERROR: replication origin name is too long DETAIL: Replication origin names must be no longer than 512 bytes. +-- pg_get_multixact_stats tests +CREATE ROLE regress_multixact_funcs; +-- Access granted for superusers. +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); + null_result=20 +------------- + f +(1 row) + +-- Access revoked. +SET ROLE regress_multixact_funcs; +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); + null_result=20 +------------- + t +(1 row) + +RESET ROLE; +-- Access granted for users with pg_monitor rights. +GRANT pg_monitor TO regress_multixact_funcs; +SET ROLE regress_multixact_funcs; +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); + null_result=20 +------------- + f +(1 row) + +RESET ROLE; +DROP ROLE regress_multixact_funcs; diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql= /misc_functions.sql index 0fc20fbb6b40..35b7983996c4 100644 --- a/src/test/regress/sql/misc_functions.sql +++ b/src/test/regress/sql/misc_functions.sql @@ -459,3 +459,18 @@ SELECT test_relpath(); =20 -- pg_replication_origin.roname limit SELECT pg_replication_origin_create('regress_' || repeat('a', 505)); + +-- pg_get_multixact_stats tests +CREATE ROLE regress_multixact_funcs; +-- Access granted for superusers. +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); +-- Access revoked. +SET ROLE regress_multixact_funcs; +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); +RESET ROLE; +-- Access granted for users with pg_monitor rights. +GRANT pg_monitor TO regress_multixact_funcs; +SET ROLE regress_multixact_funcs; +SELECT oldest_multixact IS NULL AS null_result FROM pg_get_multixact_stats= (); +RESET ROLE; +DROP ROLE regress_multixact_funcs; diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info= =2Esgml index d4508114a48e..175f18315cd4 100644 --- a/doc/src/sgml/func/func-info.sgml +++ b/doc/src/sgml/func/func-info.sgml @@ -2975,6 +2975,39 @@ acl | {postgres=3DarwdDxtm/postgres,foo=3Dr/pos= tgres} modify key columns. + + + + + pg_get_multixact_stats + + pg_get_multixact_stats () + record + ( num_mxids integer, + num_members bigint, + members_size bigint, + oldest_multixact xid ) + + + Returns statistics about current multixact usage: + num_mxids is the total number of multixact IDs + currently present in the system, num_members is + the total number of multixact member entries currently present in + the system, members_size is the storage occupied + by num_members in the + pg_multixact/members directory, + oldest_multixact is the oldest multixact ID sti= ll + in use. + + + The function reports statistics at the time it is invoked. Values = may + vary between calls, even within a single transaction. + + + To use this function, you must have privileges of the + pg_read_all_stats role. + + diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index 08e6489afb8e..7c958b062731 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -813,12 +813,41 @@ HINT: Execute a database-wide VACUUM in that databas= e. As a safety device, an aggressive vacuum scan will occur for any table whose multixact-age is greater than . Also, if the - storage occupied by multixacts members exceeds about 10GB, aggressive= vacuum + linkend=3D"guc-autovacuum-multixact-freeze-max-age"/>. Also, if the n= umber + of multixact member entries created exceeds approximately 2 billion + entries (occupying roughly 10GB in the + pg_multixact/members directory), aggressive vacuum scans will occur more often for all tables, starting with those that - have the oldest multixact-age. Both of these kinds of aggressive - scans will occur even if autovacuum is nominally disabled. The member= s storage - area can grow up to about 20GB before reaching wraparound. + have the oldest multixact-age. Both of these kinds of aggressive + scans will occur even if autovacuum is nominally disabled. At approxi= mately + 4 billion entries (occupying roughly 20GB in the + pg_multixact/members directory), even more aggress= ive + vacuum scans are triggered to reclaim member storage space. + + + + The pg_get_multixact_stats() function described = in + provides a way to monitor + multixact allocation and usage patterns in real time, for example: + +=3D# SELECT *, pg_size_pretty(members_size) members_size_pretty + FROM pg_catalog.pg_get_multixact_stats(); + num_mxids | num_members | members_size | oldest_multixact | members_size_= pretty +-----------+-------------+--------------+------------------+--------------= ------- + 311740299 | 2785241176 | 13926205880 | 2 | 13 GB +(1 row) + + This output shows a system with significant multixact activity: about + 312 million multixact IDs and about 2.8 billion member entries consum= ing + 13 GB of storage space. + A spike in num_mxids might indicate multiple sessi= ons + running UPDATE statements with foreign key checks, + concurrent SELECT FOR SHARE operations, or frequent + use of savepoints causing lock contention. + If oldest_multixact value remains unchanged while + num_members grows, it could indicate that long-run= ning + transactions are preventing cleanup, or autovacuum is + not keeping up with the workload. =20 --=20 2.51.0 --jFtgfJ2gtN0acEpO-- --cWl8XzXwS661QL5q Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEG72nH6vTowiyblFKnvQgOdbyQH0FAmlR+rsACgkQnvQgOdby QH36dRAAmSPriG2RMxvEUQRTHwCjgwHMV1n9Z2G0JT/OW6cRrIkeTIHACeVCE+wl yQrSc2r89ufeBuWBHqulliqv4B7kJnYWKYjjGl7TgRvTbgwSoR4trl/8rlhvRFez somQWXKJUP3KcC5eYQ0ssqKfsowSr0beIqeQzErWL2l/jSfrQTEqYt0FfnzpKmOL sWPspDw7ZuvKehnCDKFRLfOv5sGYpuUwIoaGcL2qrMMNpFyUTFcnHy67LC1jcuiG zswubRQvq7rBo5JLj4YTgeoyJrcLy5f5bg7C3PZEymO6f8LHXzBg8aXcT6UujSKy NVG2f7eq87u2xomS6srnX2xrBheaLlamwBu8HcsX3XyYKdGp6CyYpemGfe06oa64 dI2Vhy7+d+1Ot5yG28Z6Ry+LQ1d7hQLNFmuIaKrSbEKf4Gmqb1aRedEt7jwAwRVX QYETdApqsh0jrncfzO5ZBP2T8bVgHZQ0GUf5uW9NpHiDLh4END1AiuSlLvcDbP3y mtMQ5FNFf+MGd79ZebQ1D8lxR5RzoJ/cxwKTDjiVgSBUNL56EJPg/tOlWYImO7/0 Q4gMa6gRzVf0u5Lgs5pripMGI99jjjSWrcbaBQ8UZ6waDLKX2KRJokIbarJOJmN3 f0OP0TSc7dVzyjwBBOFVspqnu7FxQ4wn18ljoRmV7ilp+gTQ2hE= =ZZyl -----END PGP SIGNATURE----- --cWl8XzXwS661QL5q--