public inbox for [email protected]
help / color / mirror / Atom feedFrom: Sami Imseih <[email protected]>
To: Nathan Bossart <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Alexander Lakhin <[email protected]>
Cc: Bharath Rupireddy <[email protected]>
Cc: Robert Treat <[email protected]>
Cc: [email protected]
Cc: pgsql-hackers <[email protected]>
Cc: [email protected]
Subject: Re: Add pg_stat_autovacuum_priority
Date: Wed, 8 Apr 2026 18:51:59 -0500
Message-ID: <CAA5RZ0uiqzLPvZMXsooFPR5udpXUiVy8WvTLbduOvT2JPawYkA@mail.gmail.com> (raw)
In-Reply-To: <adbNL1puSbBi8nMo@nathan>
References: <adQsdvPPNviWMCXb@nathan>
<[email protected]>
<[email protected]>
<adaWuTR7oCKodH7k@nathan>
<[email protected]>
<adan_E0o69p81lIj@nathan>
<[email protected]>
<u3akkfw3vaocviudt6f7ft4kxjc2273rh3tfhz5rfwg6jrcc2t@wurwws6wgjjz>
<adbHYQMZ0pHQttWa@nathan>
<CAA5RZ0uL1ukKvV0Gwgok9ut_203q_E_tzBrwhDAuJCKVNwWZVw@mail.gmail.com>
<adbNL1puSbBi8nMo@nathan>
> On Wed, Apr 08, 2026 at 04:33:00PM -0500, Sami Imseih wrote:
> >> On Wed, Apr 08, 2026 at 04:40:03PM -0400, Andres Freund wrote:
> >> > Note that the whole cached state does automatically get reset at the end of
> >> > the transaction (AtEOXact_PgStat()->pgstat_clear_snapshot()), just like it did
> >> > before the shmem stats stuff.
> >>
> >> I see a lot of memory used for the pgStatEntryRefHash table, too (e.g., ~16
> >> MB for 100K tables). What's interesting is that I cannot reproduce similar
> >> usage with views like pg_stat_all_tables. If memory was not a concern, I
> >> think the "bool *may_free" idea would be fine.
> >
> > Instead of may_free, which is invasive, what about pgstat_fetch_entry_nocache
> > which can be called by 2 new APIs pgstat_fetch_stat_tabentry_nocache() and
> > pgstat_fetch_stat_tabentry_nocache_ext(). This way a caller that uses
> > these will be required to pfree?
>
> This might help avoid memory usage within a snapshot,
Yes, this is exactly why it would be useful for autovacuum workers and
the scores view. relation_needs_vacanalyze() can use the nocache variant,
and we don't need to have a conditional pfree. What I did not like about this
idea after thinking about it more is the performance overhead potentially since
every call to the view will take a shared lock on the entries, and the stats
will not be consistent when the view is called multiple times in a transaction
even when stats_fetch_consistency is NONE. The latter could be a desired
feature, but it goes against the users intentions and could be confusing.
I went ahead and implemented Andres's idea of will_free. Callers of
pgstat_fetch_entry can either pass a NULL to a will_free parameter,
or a bool. Callers that pass the bool can check if will_free is true and
can choose to free the entry.
For now, to keep the changes minimal, I only pgstat_fetch_stat_tabentry_ext()
will call pgstat_fetch_entry() with the bool and relation_needs_vacanalyze()
will be the only call site that checks this to pfree the entry.
There may be some opportunities to improve other call sites if they
are indeed leaking.
For example, pgstat_copy_relation_stats() could leak with
fetch_consistency = NONE. I kept
that out for now, but we should probably close that gap in another patch.
Also, pgstat_fetch_stat_dbentry() in autovacuum.c could potentially
use this, but I did
not look into detail.
What do you think?
--
Sami
Attachments:
[application/octet-stream] v2-0001-Fix-double-free-on-relation_needs_vacanalyze.patch (9.2K, 2-v2-0001-Fix-double-free-on-relation_needs_vacanalyze.patch)
download | inline diff:
From d9dbf585e51274da0a71e0d06460e2330822bd86 Mon Sep 17 00:00:00 2001
From: Sami Imseih <[email protected]>
Date: Wed, 8 Apr 2026 23:36:07 +0000
Subject: [PATCH v2 1/1] Fix double-free on relation_needs_vacanalyze
relation_needs_vacanalyze() unconditionally calls pfree() on the
tabentry returned by pgstat_fetch_stat_tabentry_ext(). This is correct
for autovacuum, which uses PGSTAT_FETCH_CONSISTENCY_NONE where the
caller owns the allocation, but causes a double-free for callers using
PGSTAT_FETCH_CONSISTENCY_CACHE where the entry is managed by the
snapshot cache.
Fix this by adding a bool *will_free output parameter to
pgstat_fetch_entry() and propagating it through
pgstat_fetch_stat_tabentry_ext(). This lets callers determine whether
they are responsible for freeing the returned entry. Callers that
don't need this information pass NULL, keeping existing call sites
minimal while allowing future callers to make use of it as needed.
---
src/backend/postmaster/autovacuum.c | 6 ++++--
src/backend/utils/activity/pgstat.c | 9 ++++++++-
src/backend/utils/activity/pgstat_backend.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 2 +-
src/backend/utils/activity/pgstat_function.c | 2 +-
src/backend/utils/activity/pgstat_relation.c | 8 ++++----
src/backend/utils/activity/pgstat_replslot.c | 2 +-
src/backend/utils/activity/pgstat_subscription.c | 2 +-
src/include/pgstat.h | 3 ++-
src/include/utils/pgstat_internal.h | 3 ++-
.../modules/test_custom_stats/test_custom_var_stats.c | 2 +-
11 files changed, 26 insertions(+), 15 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index bd626a16363..3c1fed6264d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -3080,6 +3080,7 @@ relation_needs_vacanalyze(Oid relid,
PgStat_StatTabEntry *tabentry;
bool force_vacuum;
bool av_enabled;
+ bool will_free;
/* constants from reloptions or GUC variables */
int vac_base_thresh,
@@ -3246,7 +3247,7 @@ relation_needs_vacanalyze(Oid relid,
* forced.
*/
tabentry = pgstat_fetch_stat_tabentry_ext(classForm->relisshared,
- relid);
+ relid, &will_free);
if (!tabentry)
return;
@@ -3327,7 +3328,8 @@ relation_needs_vacanalyze(Oid relid,
anltuples, anlthresh, scores->anl,
scores->xid, scores->mxid);
- pfree(tabentry);
+ if (will_free)
+ pfree(tabentry);
}
/*
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index eb8ccbaa628..a0561250710 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -960,7 +960,7 @@ pgstat_clear_snapshot(void)
}
void *
-pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid)
+pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *will_free)
{
PgStat_HashKey key = {0};
PgStat_EntryRef *entry_ref;
@@ -971,6 +971,13 @@ pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid)
Assert(IsUnderPostmaster || !IsPostmasterEnvironment);
Assert(!kind_info->fixed_amount);
+ /*
+ * When pgstat_fetch_consistency is PGSTAT_FETCH_CONSISTENCY_NONE, callers
+ * will be responsible for freeing the entry.
+ */
+ if (will_free)
+ *will_free = (pgstat_fetch_consistency == PGSTAT_FETCH_CONSISTENCY_NONE);
+
pgstat_prep_snapshot();
key.kind = kind;
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index 04fe13e64c6..6ffef1040a2 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -95,7 +95,7 @@ pgstat_fetch_stat_backend(ProcNumber procNumber)
PgStat_Backend *backend_entry;
backend_entry = (PgStat_Backend *) pgstat_fetch_entry(PGSTAT_KIND_BACKEND,
- InvalidOid, procNumber);
+ InvalidOid, procNumber, NULL);
return backend_entry;
}
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..f1846d3236c 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -288,7 +288,7 @@ PgStat_StatDBEntry *
pgstat_fetch_stat_dbentry(Oid dboid)
{
return (PgStat_StatDBEntry *)
- pgstat_fetch_entry(PGSTAT_KIND_DATABASE, dboid, InvalidOid);
+ pgstat_fetch_entry(PGSTAT_KIND_DATABASE, dboid, InvalidOid, NULL);
}
void
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..d47d05e3d92 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -245,5 +245,5 @@ PgStat_StatFuncEntry *
pgstat_fetch_stat_funcentry(Oid func_id)
{
return (PgStat_StatFuncEntry *)
- pgstat_fetch_entry(PGSTAT_KIND_FUNCTION, MyDatabaseId, func_id);
+ pgstat_fetch_entry(PGSTAT_KIND_FUNCTION, MyDatabaseId, func_id, NULL);
}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..01239472f6b 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -61,7 +61,7 @@ pgstat_copy_relation_stats(Relation dst, Relation src)
PgStat_EntryRef *dst_ref;
srcstats = pgstat_fetch_stat_tabentry_ext(src->rd_rel->relisshared,
- RelationGetRelid(src));
+ RelationGetRelid(src), NULL);
if (!srcstats)
return;
@@ -468,7 +468,7 @@ pgstat_update_heap_dead_tuples(Relation rel, int delta)
PgStat_StatTabEntry *
pgstat_fetch_stat_tabentry(Oid relid)
{
- return pgstat_fetch_stat_tabentry_ext(IsSharedRelation(relid), relid);
+ return pgstat_fetch_stat_tabentry_ext(IsSharedRelation(relid), relid, NULL);
}
/*
@@ -476,12 +476,12 @@ pgstat_fetch_stat_tabentry(Oid relid)
* whether the to-be-accessed table is a shared relation or not.
*/
PgStat_StatTabEntry *
-pgstat_fetch_stat_tabentry_ext(bool shared, Oid reloid)
+pgstat_fetch_stat_tabentry_ext(bool shared, Oid reloid, bool *will_free)
{
Oid dboid = (shared ? InvalidOid : MyDatabaseId);
return (PgStat_StatTabEntry *)
- pgstat_fetch_entry(PGSTAT_KIND_RELATION, dboid, reloid);
+ pgstat_fetch_entry(PGSTAT_KIND_RELATION, dboid, reloid, will_free);
}
/*
diff --git a/src/backend/utils/activity/pgstat_replslot.c b/src/backend/utils/activity/pgstat_replslot.c
index 168ef8f8f45..e09bb6b4355 100644
--- a/src/backend/utils/activity/pgstat_replslot.c
+++ b/src/backend/utils/activity/pgstat_replslot.c
@@ -208,7 +208,7 @@ pgstat_fetch_replslot(NameData slotname)
if (idx != -1)
slotentry = (PgStat_StatReplSlotEntry *) pgstat_fetch_entry(PGSTAT_KIND_REPLSLOT,
- InvalidOid, idx);
+ InvalidOid, idx, NULL);
LWLockRelease(ReplicationSlotControlLock);
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 3277cf88a4e..3eaf3e0390f 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -107,7 +107,7 @@ PgStat_StatSubEntry *
pgstat_fetch_stat_subscription(Oid subid)
{
return (PgStat_StatSubEntry *)
- pgstat_fetch_entry(PGSTAT_KIND_SUBSCRIPTION, InvalidOid, subid);
+ pgstat_fetch_entry(PGSTAT_KIND_SUBSCRIPTION, InvalidOid, subid, NULL);
}
/*
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 2786a7c5ffb..a42d2124080 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -763,7 +763,8 @@ extern void pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry(Oid relid);
extern PgStat_StatTabEntry *pgstat_fetch_stat_tabentry_ext(bool shared,
- Oid reloid);
+ Oid reloid,
+ bool *will_free);
extern PgStat_TableStatus *find_tabstat_entry(Oid rel_id);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index c0496f2c69c..1669644fe8a 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -685,7 +685,8 @@ extern PgStat_EntryRef *pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid,
extern PgStat_EntryRef *pgstat_fetch_pending_entry(PgStat_Kind kind,
Oid dboid, uint64 objid);
-extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid);
+extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid,
+ bool *will_free);
extern void pgstat_snapshot_fixed(PgStat_Kind kind);
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 2ef0e903745..95ec7687c6c 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -494,7 +494,7 @@ test_custom_stats_var_fetch_entry(const char *stat_name)
return (PgStat_StatCustomVarEntry *)
pgstat_fetch_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS,
InvalidOid,
- PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name));
+ PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), NULL);
}
/*--------------------------------------------------------------------------
--
2.50.1
view thread (60+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Add pg_stat_autovacuum_priority
In-Reply-To: <CAA5RZ0uiqzLPvZMXsooFPR5udpXUiVy8WvTLbduOvT2JPawYkA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox