public inbox for [email protected]
help / color / mirror / Atom feedFrom: Andres Freund <[email protected]>
To: Bertrand Drouvot <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: [email protected]
Subject: Re: relfilenode statistics
Date: Mon, 15 Dec 2025 12:48:25 -0500
Message-ID: <ezbfcpjtfh3vtm667aegyrptcdsvinbuxw6y5p6j3e5fbffrl5@i7v6cjapz3yv> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
Hi,
On 2025-12-15 16:29:18 +0000, Bertrand Drouvot wrote:
> From 7908ba56cb8b6255b869af6be13077aa0315d5f1 Mon Sep 17 00:00:00 2001
> From: Bertrand Drouvot <[email protected]>
> Date: Wed, 1 Oct 2025 09:45:26 +0000
> Subject: [PATCH v8 1/2] Key PGSTAT_KIND_RELATION by relfile locator
>
> This patch changes the key used for the PGSTAT_KIND_RELATION statistic kind.
> Instead of the relation oid, it now relies on:
>
> - dboid (linked to RelFileLocator's dbOid)
> - objoid which is the result of a new macro (namely RelFileLocatorToPgStatObjid())
> that computes an objoid based on the RelFileLocator's spcOid and the
> RelFileLocator's relNumber.
I think this needs to make more explicit that this works because the object ID
now is a uint64, and that both the inputs are 32 bits.
> That will allow us to add new stats (add writes counters) and ensure that some
> counters (n_dead_tup and friends) are replicated.
Yay.
> The patch introduces pgstat_reloid_to_relfilelocator() to 1) avoid calling
> RelationIdGetRelation() to get the relfilelocator based on the relation oid
> and 2) handle the partitioned table case.
>
> Please note that:
>
> - when running pg_stat_have_stats('relation',...) we now need to be connected
> to the database that hosts the relation. As pg_stat_have_stats() is not
> documented publicly, then the changes done in 029_stats_restart.pl look
> enough.
That seems fine.
> - this patch does not handle rewrites so some tests are failing. It's only
> intent is to ease the review and should not be pushed without being
> merged with the following patch that handles the rewrites.
Makes sense.
> diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
> index 62035b7f9c3..a9b2b4e1033 100644
> --- a/src/backend/access/heap/vacuumlazy.c
> +++ b/src/backend/access/heap/vacuumlazy.c
> @@ -961,8 +961,7 @@ heap_vacuum_rel(Relation rel, const VacuumParams params,
> * soon in cases where the failsafe prevented significant amounts of heap
> * vacuuming.
> */
> - pgstat_report_vacuum(RelationGetRelid(rel),
> - rel->rd_rel->relisshared,
> + pgstat_report_vacuum(rel->rd_locator,
> Max(vacrel->new_live_tuples, 0),
> vacrel->recently_dead_tuples +
> vacrel->missed_dead_tuples,
Why not pass in the Relation itself? Given that we do that already for
pgstat_report_analyze(), it seems like that'd be an improvement even
independent of this change?
> diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
> index 1bd3924e35e..563a3697690 100644
> --- a/src/backend/postmaster/autovacuum.c
> +++ b/src/backend/postmaster/autovacuum.c
> @@ -2048,8 +2048,7 @@ do_autovacuum(void)
>
> /* Fetch reloptions and the pgstat entry for this table */
> relopts = extract_autovac_opts(tuple, pg_class_desc);
> - tabentry = pgstat_fetch_stat_tabentry_ext(classForm->relisshared,
> - relid);
> + tabentry = pgstat_fetch_stat_tabentry_ext(relid);
>
> /* Check if it needs vacuum or analyze */
> relation_needs_vacanalyze(relid, relopts, classForm, tabentry,
I don't think this is good - now do_autovacuum() will do a separate syscache
lookup to fetch information the caller already has (due to the
pgstat_reloid_to_relfilelocator() in pgstat_fetch_stat_tabentry_ext()). That's
not too bad for things like viewing stats, but do_autovacuum() does this for
every table in a database...
> @@ -326,9 +363,26 @@ pgstat_report_analyze(Relation rel,
> ts = GetCurrentTimestamp();
> elapsedtime = TimestampDifferenceMilliseconds(starttime, ts);
>
> + if (rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
> + locator = rel->rd_locator;
> + else
> + {
> + /*
> + * Partitioned tables don't have storage, so construct a synthetic
> + * locator for statistics tracking. Use the relation OID as relNumber.
> + * No collision with regular relations is possible because relNumbers
> + * are also assigned from the pg_class OID space (see
> + * GetNewRelFileNumber()), making each value unique across the
> + * database regardless of spcOid.
> + */
I don't think this is true as stated. Two reasons:
1) This afaict guarantees that the relfilenode will not class with oids, but
it does *NOT* guarantee that it does not clash with other relfilenodes
2) Note that GetNewRelFileNumber() does *NOT* check for conflicts when
creating a new relfilenode for an existing relation:
* If the relfilenumber will also be used as the relation's OID, pass the
* opened pg_class catalog, and this routine will guarantee that the result
* is also an unused OID within pg_class. If the result is to be used only
* as a relfilenumber for an existing relation, pass NULL for pg_class.
Greetings,
Andres Freund
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: relfilenode statistics
In-Reply-To: <ezbfcpjtfh3vtm667aegyrptcdsvinbuxw6y5p6j3e5fbffrl5@i7v6cjapz3yv>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox