public inbox for [email protected]
Re: Flush some statistics within running transactions
22+ messages / 4 participants
* Re: Flush some statistics within running transactions
@ 2026-02-17 19:18 Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Sami Imseih @ 2026-02-17 19:18 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
>
> > I do not have any further comments on this patchset.
>
> Thanks for the review!
I flipped this CF entry to Ready-for-committer
--
Sami Imseih
Amazon Web Services (AWS)
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-02-18 05:40 ` Bertrand Drouvot <[email protected]>
2026-02-18 11:37 ` Re: Flush some statistics within running transactions Jakub Wartak <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
0 siblings, 2 replies; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-18 05:40 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Tue, Feb 17, 2026 at 01:18:35PM -0600, Sami Imseih wrote:
> >
> > > I do not have any further comments on this patchset.
> >
> > Thanks for the review!
>
> I flipped this CF entry to Ready-for-committer
Thanks!
PFA a mandatory rebase (nothing that needs review) due to a92b809f9da1.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-diff] v8-0001-Add-pgstat_report_anytime_stat-for-periodic-stats.patch (41.9K, 2-v8-0001-Add-pgstat_report_anytime_stat-for-periodic-stats.patch)
From fc1f5423f49d14850b9dde764caade5d633df8bd Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 5 Jan 2026 09:41:39 +0000
Subject: [PATCH v8 1/5] Add pgstat_report_anytime_stat() for periodic stats
flushing
Long-running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal and produces
spikes when flushed.
This commit introduces pgstat_report_anytime_stat(), which flushes
non-transactional statistics even inside active transactions. A new timeout
handler fires every second (if enabled while adding pending stats) to call this
function, ensuring timely stats visibility without waiting for transaction completion.
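
The timeout-driven flow described above can be sketched as follows. This is a standalone illustration, not the patch's code: schedule_anytime_update(), timeout_handler(), process_interrupts() and the counters are hypothetical stand-ins for pgstat_schedule_anytime_update(), AnytimeStatsUpdateTimeoutHandler() and ProcessInterrupts(), and the "arm only when no timeout is already armed" behavior is an assumption about the macro, whose definition is not shown here.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Set by the timeout handler, consumed at the next interrupt check. */
static volatile sig_atomic_t anytime_flush_pending = false;

/* Bookkeeping for the sketch only (hypothetical, not in the patch). */
static bool timeout_armed = false;
static int	arm_count = 0;
static int	flush_calls = 0;

/*
 * Called whenever pending stats are added. Assumption: the timeout is
 * armed only if it is not already armed, so hot paths stay cheap.
 */
static void
schedule_anytime_update(void)
{
	if (!timeout_armed)
	{
		timeout_armed = true;
		arm_count++;
	}
}

/* Fires when the timeout expires: only sets a flag, does no real work. */
static void
timeout_handler(void)
{
	timeout_armed = false;
	anytime_flush_pending = true;
}

/* Runs at a safe point: clears the flag and performs the flush. */
static void
process_interrupts(void)
{
	if (anytime_flush_pending)
	{
		anytime_flush_pending = false;
		flush_calls++;			/* stands in for the anytime flush */
	}
}
```

The point of the split is that the handler itself stays trivial, while the actual flush happens later at a point where taking locks is safe.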
Implementation details:
- Add PgStat_FlushMode enum to classify stats kinds:
* FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
* FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries
- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats()
to accept a boolean anytime_only parameter:
* When false: flushes all stats (existing behavior)
* When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY stats
- The flush_pending_cb and flush_static_cb callbacks now receive an anytime_only
boolean parameter. Most of the time it's not used (except for assertions), but it's
preparatory work for moving the relation stats to anytime (without introducing
a new callback).
- Add pgstat_schedule_anytime_update() macro to schedule the next anytime flush,
relying on PGSTAT_MIN_INTERVAL
The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate
flushing.
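
The anytime_only filtering described in the implementation details can be sketched as follows. This is a simplified standalone model, not the patch's code: KindInfo, its pending/shared counters and flush_kinds() are hypothetical, and only the FLUSH_ANYTIME / FLUSH_AT_TXN_BOUNDARY names are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flush classification, named as in the commit message. */
typedef enum FlushMode
{
	FLUSH_ANYTIME,				/* can be flushed inside a transaction */
	FLUSH_AT_TXN_BOUNDARY		/* must wait for the transaction to end */
} FlushMode;

/* Hypothetical, simplified stand-in for a stats kind. */
typedef struct KindInfo
{
	const char *name;
	FlushMode	flush_mode;
	int			pending;		/* accumulated locally, not yet visible */
	int			shared;			/* what monitoring views would report */
} KindInfo;

/*
 * Flush all kinds, or only the FLUSH_ANYTIME ones when anytime_only is
 * true. Returns true if some stats were left pending.
 */
static bool
flush_kinds(KindInfo *kinds, size_t nkinds, bool anytime_only)
{
	bool		left_pending = false;

	for (size_t i = 0; i < nkinds; i++)
	{
		/* Skip transactional stats if we're in anytime_only mode */
		if (anytime_only && kinds[i].flush_mode == FLUSH_AT_TXN_BOUNDARY)
		{
			if (kinds[i].pending != 0)
				left_pending = true;
			continue;
		}

		kinds[i].shared += kinds[i].pending;
		kinds[i].pending = 0;
	}

	return left_pending;
}
```

With one FLUSH_ANYTIME kind and one FLUSH_AT_TXN_BOUNDARY kind, calling flush_kinds() with anytime_only = true publishes only the first and reports that something is still pending; a later call with anytime_only = false (the existing behavior) publishes the rest.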
---
src/backend/access/transam/xlog.c | 6 +
src/backend/postmaster/bgwriter.c | 9 +-
src/backend/postmaster/checkpointer.c | 10 +-
src/backend/postmaster/startup.c | 2 +
src/backend/postmaster/walsummarizer.c | 9 +-
src/backend/postmaster/walwriter.c | 9 +-
src/backend/replication/walreceiver.c | 9 +-
src/backend/replication/walsender.c | 8 +-
src/backend/tcop/postgres.c | 12 ++
src/backend/utils/activity/pgstat.c | 112 ++++++++++++++----
src/backend/utils/activity/pgstat_backend.c | 13 +-
src/backend/utils/activity/pgstat_bgwriter.c | 2 +-
.../utils/activity/pgstat_checkpointer.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 2 +-
src/backend/utils/activity/pgstat_function.c | 4 +-
src/backend/utils/activity/pgstat_io.c | 10 +-
src/backend/utils/activity/pgstat_relation.c | 12 +-
src/backend/utils/activity/pgstat_slru.c | 6 +-
.../utils/activity/pgstat_subscription.c | 4 +-
src/backend/utils/activity/pgstat_wal.c | 10 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/init/postinit.c | 3 +
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 16 +++
src/include/utils/pgstat_internal.h | 52 ++++++--
src/include/utils/timeout.h | 1 +
.../test_custom_stats/test_custom_var_stats.c | 4 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 264 insertions(+), 66 deletions(-)
10.8% src/backend/postmaster/
6.0% src/backend/replication/
50.7% src/backend/utils/activity/
6.0% src/backend/
19.3% src/include/utils/
5.6% src/include/
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13ec6225b85..d01b11c7470 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1085,6 +1085,9 @@ XLogInsertRecord(XLogRecData *rdata,
pgWalUsage.wal_fpi += num_fpi;
pgWalUsage.wal_fpi_bytes += fpi_bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/* Required for the flush of pending stats WAL data */
pgstat_report_fixed = true;
}
@@ -2066,6 +2069,9 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic)
pgWalUsage.wal_buffers_full++;
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/*
* Required for the flush of pending stats WAL data, per
* update of pgWalUsage.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0956bd39a85..059c601c3b8 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -49,7 +49,9 @@
#include "storage/smgr.h"
#include "storage/standby.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
/*
@@ -103,7 +105,7 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -113,6 +115,11 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* We just started, assume there has been either a shutdown or
* end-of-recovery snapshot.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index e03c19123bc..e11c4b099c8 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -66,8 +66,9 @@
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
-
+#include "utils/timeout.h"
/*----------
* Shared memory area for communication between checkpointer and backends
@@ -215,7 +216,7 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, ReqShutdownXLOG);
pqsignal(SIGTERM, SIG_IGN); /* ignore SIGTERM */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);
@@ -225,6 +226,11 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Initialize so that first time-driven event happens at the correct time.
*/
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index cdbe53dd262..4954fe425b7 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -32,6 +32,7 @@
#include "storage/standby.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/timeout.h"
@@ -245,6 +246,7 @@ StartupProcessMain(const void *startup_data, size_t startup_data_len)
RegisterTimeout(STANDBY_DEADLOCK_TIMEOUT, StandbyDeadLockHandler);
RegisterTimeout(STANDBY_TIMEOUT, StandbyTimeoutHandler);
RegisterTimeout(STANDBY_LOCK_TIMEOUT, StandbyLockTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
/*
* Unblock signals (they were blocked when the postmaster forked us)
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 742137edad6..f1bae9d23d6 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -48,6 +48,8 @@
#include "storage/shmem.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h"
/*
@@ -246,7 +248,7 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -268,6 +270,11 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* If an exception is encountered, processing resumes here.
*/
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 7c0e2809c17..bcf59227a00 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -61,7 +61,9 @@
#include "storage/smgr.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
/*
@@ -103,7 +105,7 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -113,6 +115,11 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Create a memory context that we will do all our work in. We do this so
* that we can reset the context during error recovery and thereby avoid
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 10e64a7d1f4..11b7c114d3b 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -77,7 +77,9 @@
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -252,7 +254,7 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, die); /* request shutdown */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -260,6 +262,11 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
/* Reset some signals that are accepted by postmaster but not here */
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/* Load the libpq-specific functions */
load_file("libpqwalreceiver", false);
if (WalReceiverFunctions == NULL)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..a7214d0dc6f 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1987,8 +1987,8 @@ WalSndWaitForWal(XLogRecPtr loc)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
@@ -3016,8 +3016,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21de158adbb..2089de782d5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3564,6 +3564,18 @@ ProcessInterrupts(void)
pgstat_report_stat(true);
}
+ /*
+ * Flush stats outside of transaction boundary if the timeout fired.
+ * Unlike transactional stats, these can be flushed even inside a running
+ * transaction.
+ */
+ if (AnytimeStatsUpdateTimeoutPending)
+ {
+ AnytimeStatsUpdateTimeoutPending = false;
+
+ pgstat_report_anytime_stat(false);
+ }
+
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5a..a4ff64dc5ce 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -112,6 +112,7 @@
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -122,8 +123,6 @@
* ----------
*/
-/* minimum interval non-forced stats flushes.*/
-#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -187,7 +186,8 @@ static void pgstat_init_snapshot_fixed(void);
static void pgstat_reset_after_failure(void);
-static bool pgstat_flush_pending_entries(bool nowait);
+static bool pgstat_flush_pending_entries(bool nowait, bool anytime_only);
+static bool pgstat_flush_fixed_stats(bool nowait, bool anytime_only);
static void pgstat_prep_snapshot(void);
static void pgstat_build_snapshot(void);
@@ -288,6 +288,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -305,6 +306,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -321,6 +323,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
@@ -336,6 +339,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.accessed_across_databases = true,
@@ -353,6 +357,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_subscription_stats entries can be seen in all databases */
.accessed_across_databases = true,
@@ -370,6 +375,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = false,
+ .flush_mode = FLUSH_ANYTIME,
.accessed_across_databases = true,
@@ -436,6 +442,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -453,6 +460,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, slru),
.shared_ctl_off = offsetof(PgStat_ShmemControl, slru),
@@ -470,6 +478,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -775,23 +784,11 @@ pgstat_report_stat(bool force)
partial_flush = false;
/* flush of variable-numbered stats tracked in pending entries list */
- partial_flush |= pgstat_flush_pending_entries(nowait);
+ partial_flush |= pgstat_flush_pending_entries(nowait, false);
/* flush of other stats kinds */
if (pgstat_report_fixed)
- {
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-
- if (!kind_info)
- continue;
- if (!kind_info->flush_static_cb)
- continue;
-
- partial_flush |= kind_info->flush_static_cb(nowait);
- }
- }
+ partial_flush |= pgstat_flush_fixed_stats(nowait, false);
last_flush = now;
@@ -1293,7 +1290,8 @@ pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *creat
if (entry_ref->pending == NULL)
{
- size_t entrysize = pgstat_get_kind_info(kind)->pending_size;
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ size_t entrysize = kind_info->pending_size;
Assert(entrysize != (size_t) -1);
@@ -1345,9 +1343,14 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * This is safe to call inside transactions.
+ *
+ * If anytime_only is false, flushes all entries.
*/
static bool
-pgstat_flush_pending_entries(bool nowait)
+pgstat_flush_pending_entries(bool nowait, bool anytime_only)
{
bool have_pending = false;
dlist_node *cur = NULL;
@@ -1377,8 +1380,22 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
Assert(did_flush || nowait);
@@ -1402,6 +1419,33 @@ pgstat_flush_pending_entries(bool nowait)
return have_pending;
}
+/*
+ * Flush fixed-amount stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME stats (safe inside transactions).
+ * If anytime_only is false, flushes all stats with flush_static_cb.
+ */
+static bool
+pgstat_flush_fixed_stats(bool nowait, bool anytime_only)
+{
+ bool partial_flush = false;
+
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+ if (!kind_info || !kind_info->flush_static_cb)
+ continue;
+
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ continue;
+
+ partial_flush |= kind_info->flush_static_cb(nowait, anytime_only);
+ }
+
+ return partial_flush;
+}
/* ------------------------------------------------------------
* Helper / infrastructure functions
@@ -2119,3 +2163,31 @@ assign_stats_fetch_consistency(int newval, void *extra)
if (pgstat_fetch_consistency != newval)
force_stats_snapshot_clear = true;
}
+
+/*
+ * Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
+ * stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
+ * Safe to call inside transactions.
+ */
+void
+pgstat_report_anytime_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+}
+
+/*
+ * Timeout handler for flushing anytime stats.
+ */
+void
+AnytimeStatsUpdateTimeoutHandler(void)
+{
+ AnytimeStatsUpdateTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index f2f8d3ff75f..b09316d3ab3 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -31,6 +31,7 @@
#include "storage/procarray.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
/*
* Backend statistics counts waiting to be flushed out. These counters may be
@@ -66,6 +67,9 @@ pgstat_count_backend_io_op_time(IOObject io_object, IOContext io_context,
INSTR_TIME_ADD(PendingBackendStats.pending_io.pending_times[io_object][io_context][io_op],
io_time);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -82,6 +86,9 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -268,7 +275,7 @@ pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref)
* if some statistics could not be flushed due to lock contention.
*/
bool
-pgstat_flush_backend(bool nowait, bits32 flags)
+pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only)
{
PgStat_EntryRef *entry_ref;
bool has_pending_data = false;
@@ -311,9 +318,9 @@ pgstat_flush_backend(bool nowait, bits32 flags)
* If some stats could not be flushed due to lock contention, return true.
*/
bool
-pgstat_backend_flush_cb(bool nowait)
+pgstat_backend_flush_cb(bool nowait, bool anytime_only)
{
- return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL);
+ return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL, anytime_only);
}
/*
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index ed2fd801189..1c5f0c3ec40 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -61,7 +61,7 @@ pgstat_report_bgwriter(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index 1f70194b7a7..2d89a082464 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -68,7 +68,7 @@ pgstat_report_checkpointer(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..8e86df60461 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,7 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..5ba4958382f 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,11 +190,13 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
+ Assert(!anytime_only);
+
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 28de24538dc..7cd32900236 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -19,6 +19,7 @@
#include "executor/instrument.h"
#include "storage/bufmgr.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
static PgStat_PendingIO PendingIOStats;
static bool have_iostats = false;
@@ -79,6 +80,9 @@ pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op,
/* Add the per-backend counts */
pgstat_count_backend_io_op(io_object, io_context, io_op, cnt, bytes);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_iostats = true;
pgstat_report_fixed = true;
}
@@ -172,9 +176,9 @@ pgstat_fetch_stat_io(void)
* Simpler wrapper of pgstat_io_flush_cb()
*/
void
-pgstat_flush_io(bool nowait)
+pgstat_flush_io(bool nowait, bool anytime_only)
{
- (void) pgstat_io_flush_cb(nowait);
+ (void) pgstat_io_flush_cb(nowait, anytime_only);
}
/*
@@ -186,7 +190,7 @@ pgstat_flush_io(bool nowait)
* acquired. Otherwise, return false.
*/
bool
-pgstat_io_flush_cb(bool nowait)
+pgstat_io_flush_cb(bool nowait, bool anytime_only)
{
LWLock *bktype_lock;
PgStat_BktypeIO *bktype_shstats;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..04d21483d93 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -267,8 +267,8 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
* is done -- which will likely vacuum many relations -- or until the
* VACUUM command has processed all tables and committed.
*/
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -362,8 +362,8 @@ pgstat_report_analyze(Relation rel,
pgstat_unlock_entry(entry_ref);
/* see pgstat_report_vacuum() */
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -812,7 +812,7 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
* entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -820,6 +820,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
+ Assert(!anytime_only);
+
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..bf8a4d58673 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -19,6 +19,7 @@
#include "utils/pgstat_internal.h"
#include "utils/timestamp.h"
+#include "utils/timeout.h"
static inline PgStat_SLRUStats *get_slru_entry(int slru_idx);
@@ -139,7 +140,7 @@ pgstat_get_slru_index(const char *name)
* acquired. Otherwise return false.
*/
bool
-pgstat_slru_flush_cb(bool nowait)
+pgstat_slru_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_SLRU *stats_shmem = &pgStatLocal.shmem->slru;
int i;
@@ -223,6 +224,9 @@ get_slru_entry(int slru_idx)
Assert((slru_idx >= 0) && (slru_idx < SLRU_NUM_ELEMENTS));
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_slrustats = true;
pgstat_report_fixed = true;
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 500b1899188..c4614817966 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -116,11 +116,13 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
+ Assert(!anytime_only);
+
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 183e0a7a97b..2c2f3f10e10 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -51,12 +51,12 @@ pgstat_report_wal(bool force)
nowait = !force;
/* flush wal stats */
- (void) pgstat_wal_flush_cb(nowait);
- pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL);
+ (void) pgstat_wal_flush_cb(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL, true);
/* flush IO stats */
- pgstat_flush_io(nowait);
- (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -88,7 +88,7 @@ pgstat_wal_have_pending(void)
* acquired. Otherwise return false.
*/
bool
-pgstat_wal_flush_cb(bool nowait)
+pgstat_wal_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
WalUsage wal_usage_diff = {0};
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..ad44826c39e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -40,6 +40,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index b59e08605cc..eeeac1bf39a 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -64,6 +64,7 @@
#include "utils/injection_point.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
+#include "utils/pgstat_internal.h"
#include "utils/portal.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -773,6 +774,8 @@ InitPostgres(const char *in_dbname, Oid dboid,
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT,
+ AnytimeStatsUpdateTimeoutHandler);
}
/*
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..84e698da214 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -96,6 +96,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fff7ecc2533..b340a680614 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -35,6 +35,9 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/* Minimum interval non-forced stats flushes */
+#define PGSTAT_MIN_INTERVAL 1000
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -533,8 +536,21 @@ extern void pgstat_initialize(void);
/* Functions called from backends */
extern long pgstat_report_stat(bool force);
+extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
+/*
+ * Schedule the next anytime stats update timeout.
+ *
+ * This should be called whenever accumulating statistics that support
+ * FLUSH_ANYTIME flushing mode.
+ */
+#define pgstat_schedule_anytime_update() \
+ do { \
+ if (IsUnderPostmaster && !get_timeout_active(ANYTIME_STATS_UPDATE_TIMEOUT)) \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ } while (0)
+
extern void pgstat_reset_counters(void);
extern void pgstat_reset(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_reset_of_kind(PgStat_Kind kind);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed..607f4255268 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,19 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Flush mode for statistics kinds.
+ *
+ * FLUSH_AT_TXN_BOUNDARY has to be the first because we want it to be the
+ * default value.
+ */
+typedef enum PgStat_FlushMode
+{
+ FLUSH_AT_TXN_BOUNDARY, /* All fields can only be flushed at
+ * transaction boundary */
+ FLUSH_ANYTIME, /* All fields can be flushed anytime,
+ * including within transactions */
+} PgStat_FlushMode;
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +264,16 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * The mode of when to flush stats. See PgStat_FlushMode for more details.
+ *
+ * This member only has meaning for statistics kinds that accumulate
+ * pending stats and use flush callbacks. For kinds that write directly to
+ * shared memory (e.g., archiver, bgwriter, checkpointer), this member has
+ * no effect.
+ */
+ PgStat_FlushMode flush_mode;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
@@ -297,8 +320,10 @@ typedef struct PgStat_KindInfo
* For variable-numbered stats: flush pending stats. Required if pending
* data is used. See flush_static_cb when dealing with stats data that
* that cannot use PgStat_EntryRef->pending.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -366,8 +391,10 @@ typedef struct PgStat_KindInfo
*
* "pgstat_report_fixed" needs to be set to trigger the flush of pending
* stats.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_static_cb) (bool nowait);
+ bool (*flush_static_cb) (bool nowait, bool anytime_only);
/*
* For fixed-numbered statistics: Reset All.
@@ -677,6 +704,7 @@ extern PgStat_EntryRef *pgstat_fetch_pending_entry(PgStat_Kind kind,
extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_snapshot_fixed(PgStat_Kind kind);
+extern void AnytimeStatsUpdateTimeoutHandler(void);
/*
@@ -696,8 +724,8 @@ extern void pgstat_archiver_snapshot_cb(void);
#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */
#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL)
-extern bool pgstat_flush_backend(bool nowait, bits32 flags);
-extern bool pgstat_backend_flush_cb(bool nowait);
+extern bool pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only);
+extern bool pgstat_backend_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_backend_reset_timestamp_cb(PgStatShared_Common *header,
TimestampTz ts);
@@ -729,7 +757,7 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -737,7 +765,7 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -745,9 +773,9 @@ extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_io.c
*/
-extern void pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait, bool anytime_only);
-extern bool pgstat_io_flush_cb(bool nowait);
+extern bool pgstat_io_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_io_init_shmem_cb(void *stats);
extern void pgstat_io_reset_all_cb(TimestampTz ts);
extern void pgstat_io_snapshot_cb(void);
@@ -762,7 +790,7 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -809,7 +837,7 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
* Functions in pgstat_slru.c
*/
-extern bool pgstat_slru_flush_cb(bool nowait);
+extern bool pgstat_slru_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_slru_init_shmem_cb(void *stats);
extern void pgstat_slru_reset_all_cb(TimestampTz ts);
extern void pgstat_slru_snapshot_cb(void);
@@ -820,7 +848,7 @@ extern void pgstat_slru_snapshot_cb(void);
*/
extern void pgstat_wal_init_backend_cb(void);
-extern bool pgstat_wal_flush_cb(bool nowait);
+extern bool pgstat_wal_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_wal_init_shmem_cb(void *stats);
extern void pgstat_wal_reset_all_cb(TimestampTz ts);
extern void pgstat_wal_snapshot_cb(void);
@@ -830,7 +858,7 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 0965b590b34..10723bb664c 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -35,6 +35,7 @@ typedef enum TimeoutId
IDLE_SESSION_TIMEOUT,
IDLE_STATS_UPDATE_TIMEOUT,
CLIENT_CONNECTION_CHECK_TIMEOUT,
+ ANYTIME_STATS_UPDATE_TIMEOUT,
STARTUP_PROGRESS_TIMEOUT,
/* First user-definable timeout reason */
USER_TIMEOUT,
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 64a8fe63cce..bc0b5d6e0eb 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -83,7 +83,7 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait);
+ bool nowait, bool anytime_only);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -150,7 +150,7 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..1dbc4b96f51 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2271,6 +2271,7 @@ PgStat_Counter
PgStat_EntryRef
PgStat_EntryRefHashEntry
PgStat_FetchConsistency
+PgStat_FlushMode
PgStat_FunctionCallUsage
PgStat_FunctionCounts
PgStat_HashKey
--
2.34.1
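
The effect of the new flush_mode member and the anytime_only argument can be sketched outside the tree. This is a minimal standalone model, not code from the patch: KindInfo, flush_kinds() and the pending/shared counters are hypothetical stand-ins, and the only behavior assumed is the one the patch comments describe, namely that an anytime flush leaves FLUSH_AT_TXN_BOUNDARY kinds pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum added by the patch; the first value is the zero default. */
typedef enum FlushMode
{
	FLUSH_AT_TXN_BOUNDARY,
	FLUSH_ANYTIME
} FlushMode;

/* Hypothetical, heavily simplified stand-in for PgStat_KindInfo. */
typedef struct KindInfo
{
	const char *name;
	FlushMode	flush_mode;
	int			pending;	/* accumulated locally, not yet visible */
	int			shared;		/* value other sessions would see */
} KindInfo;

/*
 * Flush every kind that the current call is allowed to flush.  With
 * anytime_only set, kinds left at FLUSH_AT_TXN_BOUNDARY are skipped and
 * keep their pending data.  Returns the number of kinds flushed.
 */
static int
flush_kinds(KindInfo *kinds, size_t nkinds, bool anytime_only)
{
	int			flushed = 0;

	for (size_t i = 0; i < nkinds; i++)
	{
		KindInfo   *k = &kinds[i];

		if (anytime_only && k->flush_mode != FLUSH_ANYTIME)
			continue;			/* must wait for the transaction boundary */
		if (k->pending == 0)
			continue;			/* nothing accumulated */

		k->shared += k->pending;
		k->pending = 0;
		flushed++;
	}
	return flushed;
}
```

In this model, an anytime flush in the middle of a transaction moves only the FLUSH_ANYTIME counters, and a later call with anytime_only = false picks up the rest.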
[text/x-diff] v8-0002-Add-anytime-flush-tests-for-custom-stats.patch (9.0K, 3-v8-0002-Add-anytime-flush-tests-for-custom-stats.patch)
From 3f426eb4b56382aacfb7e3ae8377d7c8fae3db91 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Thu, 5 Feb 2026 05:54:34 +0000
Subject: [PATCH v8 2/5] Add anytime flush tests for custom stats
---
.../test_custom_stats/t/001_custom_stats.pl | 41 +++++++++++++
.../test_custom_fixed_stats--1.0.sql | 5 ++
.../test_custom_fixed_stats.c | 57 +++++++++++++++++++
.../test_custom_var_stats--1.0.sql | 5 ++
.../test_custom_stats/test_custom_var_stats.c | 27 +++++++++
5 files changed, 135 insertions(+)
33.8% src/test/modules/test_custom_stats/t/
66.1% src/test/modules/test_custom_stats/
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 9e6a7a38577..7be1b281776 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -156,5 +156,46 @@ $result = $node->safe_psql('postgres',
);
is($result, "0", "report of fixed-sized after manual reset");
+# Test FLUSH_ANYTIME mechanism with custom fixed stats
+# This verifies that custom stats can be flushed during a transaction
+
+# Reset stats first
+$node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+my $anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select 'anytime:'||numcalls from test_custom_stats_fixed_report();
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^anytime:2/m,
+ "anytime fixed stats flushed during transaction");
+
+# Test FLUSH_ANYTIME mechanism with custom variable stats
+# This verifies that custom stats can be flushed during a transaction
+
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+$anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_var_anytime_update('entry2');
+ select test_custom_stats_var_anytime_update('entry2');
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select * from test_custom_stats_var_report('entry2');
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^entry2\|2\|/m,
+ "anytime var stats flushed during transaction");
+
# Test completed successfully
done_testing();
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
index 69a93b5241f..da3a798f289 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
@@ -18,3 +18,8 @@ CREATE FUNCTION test_custom_stats_fixed_reset()
RETURNS void
AS 'MODULE_PATHNAME', 'test_custom_stats_fixed_reset'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_fixed_anytime_update()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..30b0fbcbdc7 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -18,6 +18,7 @@
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_fixed_stats",
@@ -43,11 +44,13 @@ typedef struct PgStatShared_CustomFixedEntry
static void test_custom_stats_fixed_init_shmem_cb(void *stats);
static void test_custom_stats_fixed_reset_all_cb(TimestampTz ts);
static void test_custom_stats_fixed_snapshot_cb(void);
+static bool test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only);
static const PgStat_KindInfo custom_stats = {
.name = "test_custom_fixed_stats",
.fixed_amount = true, /* exactly one entry */
.write_to_file = true, /* persist to stats file */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.shared_size = sizeof(PgStat_StatCustomFixedEntry),
.shared_data_off = offsetof(PgStatShared_CustomFixedEntry, stats),
@@ -56,8 +59,12 @@ static const PgStat_KindInfo custom_stats = {
.init_shmem_cb = test_custom_stats_fixed_init_shmem_cb,
.reset_all_cb = test_custom_stats_fixed_reset_all_cb,
.snapshot_cb = test_custom_stats_fixed_snapshot_cb,
+ .flush_static_cb = test_custom_stats_fixed_flush_cb,
};
+/* Pending statistics */
+static PgStat_StatCustomFixedEntry PendingCustomStats = {0};
+
/*
* Kind ID for test_custom_fixed_stats.
*/
@@ -141,6 +148,38 @@ test_custom_stats_fixed_snapshot_cb(void)
#undef FIXED_COMP
}
+/*
+ * test_custom_stats_fixed_flush_cb
+ * Flush pending stats to shared memory
+ */
+static bool
+test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only)
+{
+ PgStatShared_CustomFixedEntry *stats_shmem;
+
+ /* Nothing to flush if no calls were made */
+ if (PendingCustomStats.numcalls == 0)
+ return false;
+
+ stats_shmem = pgstat_get_custom_shmem_data(PGSTAT_KIND_TEST_CUSTOM_FIXED_STATS);
+
+ if (!nowait)
+ LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+ else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
+ return true;
+
+ pgstat_begin_changecount_write(&stats_shmem->changecount);
+ stats_shmem->stats.numcalls += PendingCustomStats.numcalls;
+ pgstat_end_changecount_write(&stats_shmem->changecount);
+
+ LWLockRelease(&stats_shmem->lock);
+
+ /* Reset pending stats */
+ PendingCustomStats.numcalls = 0;
+
+ return false; /* successfully flushed */
+}
+
/*--------------------------------------------------------------------------
* SQL-callable functions
*--------------------------------------------------------------------------
@@ -222,3 +261,21 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
/* Return as tuple */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * test_custom_stats_fixed_anytime_update
+ * Increment call counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_fixed_anytime_update);
+Datum
+test_custom_stats_fixed_anytime_update(PG_FUNCTION_ARGS)
+{
+ /* Accumulate in pending stats */
+ PendingCustomStats.numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+ pgstat_report_fixed = true;
+
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index 5ed8cfc2dcf..ed66d38981e 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -24,3 +24,8 @@ CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_var_anytime_update(IN name TEXT)
+RETURNS void
+AS 'MODULE_PATHNAME', 'test_custom_stats_var_anytime_update'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index bc0b5d6e0eb..207e841911b 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -17,6 +17,7 @@
#include "storage/dsm_registry.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_var_stats",
@@ -107,6 +108,7 @@ static const PgStat_KindInfo custom_stats = {
.name = "test_custom_var_stats",
.fixed_amount = false, /* variable number of entries */
.write_to_file = true, /* persist across restarts */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.track_entry_count = true, /* count active entries */
.accessed_across_databases = true, /* global statistics */
.shared_size = sizeof(PgStatShared_CustomVarEntry),
@@ -689,3 +691,28 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * test_custom_stats_var_anytime_update
+ * Increment custom statistic counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_var_anytime_update);
+Datum
+test_custom_stats_var_anytime_update(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_EntryRef *entry_ref;
+ PgStat_StatCustomVarEntry *pending_entry;
+
+ /* Get pending entry in local memory */
+ entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
+ PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), NULL);
+
+ pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
+ pending_entry->numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+
+ PG_RETURN_VOID();
+}
--
2.34.1
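
The return convention used by test_custom_stats_fixed_flush_cb() (false when nothing is left pending, true when nowait is set and the lock could not be taken) can be modeled in isolation. The sketch below is an assumption-laden stand-in: FakeLock, FixedStats and flush_fixed() are hypothetical names, and the lock is a plain flag rather than an LWLock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LWLock: a "held by someone else" flag. */
typedef struct FakeLock
{
	bool		held_elsewhere;
} FakeLock;

/* Hypothetical stand-in for the shared fixed-stats entry. */
typedef struct FixedStats
{
	FakeLock	lock;
	long		numcalls;
} FixedStats;

/* Backend-local pending counter, as in PendingCustomStats.numcalls. */
static long pending_numcalls = 0;

/* Try to take the lock without waiting; fails if it is held elsewhere. */
static bool
fake_lock_conditional_acquire(FakeLock *lock)
{
	return !lock->held_elsewhere;
}

/*
 * Returns true when pending data could not be flushed (lock busy and
 * nowait set), false when nothing is left pending.
 */
static bool
flush_fixed(FixedStats *shared, bool nowait)
{
	if (pending_numcalls == 0)
		return false;			/* nothing to flush */

	if (!fake_lock_conditional_acquire(&shared->lock))
	{
		if (nowait)
			return true;		/* keep pending data, retry later */

		/* A real blocking acquire would wait; the model just takes it. */
		shared->lock.held_elsewhere = false;
	}

	shared->numcalls += pending_numcalls;
	pending_numcalls = 0;
	return false;
}
```

A caller that sees true keeps the pending counter untouched, so a later flush attempt still carries the full amount.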
[text/x-diff] v8-0003-Add-GUC-to-specify-non-transactional-statistics-f.patch (8.9K, 4-v8-0003-Add-GUC-to-specify-non-transactional-statistics-f.patch)
From 1be773551abc042e941b538dc472294869718b3a Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Wed, 28 Jan 2026 07:53:13 +0000
Subject: [PATCH v8 3/5] Add GUC to specify non-transactional statistics flush
interval
Add stats_flush_interval, a new GUC that sets the interval between flushes of
non-transactional statistics.
---
doc/src/sgml/config.sgml | 32 +++++++++++++++++++
src/backend/utils/activity/pgstat.c | 16 ++++++++++
src/backend/utils/misc/guc_parameters.dat | 10 ++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/pgstat.h | 6 ++--
src/include/utils/guc_hooks.h | 1 +
.../test_custom_stats/t/001_custom_stats.pl | 6 ++--
7 files changed, 66 insertions(+), 6 deletions(-)
51.8% doc/src/sgml/
13.3% src/backend/utils/activity/
13.6% src/backend/utils/misc/
11.3% src/include/
9.8% src/test/modules/test_custom_stats/t/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index faf0bdb62aa..03875b490b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8932,6 +8932,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-stats-flush-interval" xreflabel="stats_flush_interval">
+ <term><varname>stats_flush_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>stats_flush_interval</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the interval at which certain statistics, which can be updated while a
+ transaction is in progress, are made visible. These include WAL activity
+ and I/O operations.
+ Such statistics are refreshed at the specified interval and can be observed
+ during active transactions in monitoring views such as
+ <link linkend="monitoring-pg-stat-wal-view"><structname>pg_stat_wal</structname></link>
+ and
+ <link linkend="monitoring-pg-stat-io-view"><structname>pg_stat_io</structname></link>.
+ If the value is specified without a unit, milliseconds are assumed.
+ The default is 10 seconds (<literal>10s</literal>), which is generally
+ the smallest practical value for long-running transactions.
+ </para>
+ <note>
+ <para>
+ This parameter does not affect statistics that are only reported at
+ transaction end, such as the columns of <structname>pg_stat_all_tables</structname>
+ (for example, <structfield>n_tup_ins</structfield>, <structfield>n_tup_upd</structfield>,
+ and <structfield>n_tup_del</structfield>). These statistics are always
+ flushed at the end of a transaction.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index a4ff64dc5ce..dd85a27c52f 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -123,6 +123,8 @@
* ----------
*/
+/* minimum interval non-forced stats flushes.*/
+#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -203,6 +205,7 @@ static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
bool pgstat_track_counts = false;
int pgstat_fetch_consistency = PGSTAT_FETCH_CONSISTENCY_CACHE;
+int pgstat_flush_interval = 10000;
/* ----------
@@ -2164,6 +2167,19 @@ assign_stats_fetch_consistency(int newval, void *extra)
force_stats_snapshot_clear = true;
}
+/*
+ * GUC assign_hook for stats_flush_interval.
+ */
+void
+assign_stats_flush_interval(int newval, void *extra)
+{
+ if (get_timeout_active(ANYTIME_STATS_UPDATE_TIMEOUT))
+ {
+ disable_timeout(ANYTIME_STATS_UPDATE_TIMEOUT, false);
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, newval);
+ }
+}
+
/*
* Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
* stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 271c033952e..d2734caafea 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2801,6 +2801,16 @@
assign_hook => 'assign_stats_fetch_consistency',
},
+{ name => 'stats_flush_interval', type => 'int', context => 'PGC_USERSET', group => 'STATS_CUMULATIVE',
+ short_desc => 'Sets the interval between flushes of non-transactional statistics.',
+ flags => 'GUC_UNIT_MS',
+ variable => 'pgstat_flush_interval',
+ boot_val => '10000',
+ min => '1000',
+ max => 'INT_MAX',
+ assign_hook => 'assign_stats_flush_interval'
+},
+
{ name => 'subtransaction_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
short_desc => 'Sets the size of the dedicated buffer pool used for the subtransaction cache.',
long_desc => '0 means use a fraction of "shared_buffers".',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..8bd37a25b38 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -688,6 +688,7 @@
#track_wal_io_timing = off
#track_functions = none # none, pl, all
#stats_fetch_consistency = cache # cache, none, snapshot
+#stats_flush_interval = 10s # in milliseconds
# - Monitoring -
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index b340a680614..ef856dbf55b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -35,9 +35,6 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
-/* Minimum interval non-forced stats flushes */
-#define PGSTAT_MIN_INTERVAL 1000
-
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -548,7 +545,7 @@ extern void pgstat_force_next_flush(void);
#define pgstat_schedule_anytime_update() \
do { \
if (IsUnderPostmaster && !get_timeout_active(ANYTIME_STATS_UPDATE_TIMEOUT)) \
- enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, pgstat_flush_interval); \
} while (0)
extern void pgstat_reset_counters(void);
@@ -828,6 +825,7 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
extern PGDLLIMPORT int pgstat_fetch_consistency;
+extern PGDLLIMPORT int pgstat_flush_interval;
/*
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 9c90670d9b8..9b5d2a90387 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -132,6 +132,7 @@ extern bool check_session_authorization(char **newval, void **extra, GucSource s
extern void assign_session_authorization(const char *newval, void *extra);
extern void assign_session_replication_role(int newval, void *extra);
extern void assign_stats_fetch_consistency(int newval, void *extra);
+extern void assign_stats_flush_interval(int newval, void *extra);
extern bool check_ssl(bool *newval, void **extra, GucSource source);
extern bool check_stage_log_stats(bool *newval, void **extra, GucSource source);
extern bool check_standard_conforming_strings(bool *newval, void **extra,
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 7be1b281776..22e2a75dcb9 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -164,10 +164,11 @@ $node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
my $anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select 'anytime:'||numcalls from test_custom_stats_fixed_report();
@@ -183,11 +184,12 @@ like($result, qr/^anytime:2/m,
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
$anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_var_anytime_update('entry2');
select test_custom_stats_var_anytime_update('entry2');
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select * from test_custom_stats_var_report('entry2');
--
2.34.1
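
The interaction between pgstat_schedule_anytime_update() and assign_stats_flush_interval() comes down to "arm once, re-arm on change". A minimal standalone model follows, assuming a single timer and a manually advanced clock; Timer, now_ms and both functions are hypothetical, and unlike the real assign hook the model also stores the new value itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-timer model; all times are in milliseconds. */
typedef struct Timer
{
	bool		active;
	long		fire_at;
} Timer;

static long now_ms = 0;					/* manually advanced clock */
static int	flush_interval_ms = 10000;	/* mirrors the GUC boot_val */

/*
 * Arm the timer only if it is not already armed, so repeated stats
 * updates keep the first deadline instead of pushing it back.
 */
static void
schedule_anytime_update(Timer *t)
{
	if (!t->active)
	{
		t->active = true;
		t->fire_at = now_ms + flush_interval_ms;
	}
}

/*
 * On a setting change, restart a pending timer with the new interval.
 * In the model the function also records the new value; in PostgreSQL
 * the GUC machinery assigns the variable, not the hook.
 */
static void
assign_flush_interval(Timer *t, int newval)
{
	flush_interval_ms = newval;
	if (t->active)
		t->fire_at = now_ms + newval;
}
```

With these rules, lowering the interval while a timer is pending takes effect immediately rather than after the old deadline expires.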
[text/x-diff] v8-0004-Remove-useless-calls-to-flush-some-stats.patch (7.7K, 5-v8-0004-Remove-useless-calls-to-flush-some-stats.patch)
From 3938754d5ab37da226ad86278ad1a104f82e80ed Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 6 Jan 2026 11:06:31 +0000
Subject: [PATCH v8 4/5] Remove useless calls to flush some stats
Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to report/flush some stats. Those calls were in place because
before commit <XXXX> stats were flushed only at transaction boundaries.
Note that:
- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for checkpointer and bgwriter, for example, because they don't
have a flush callback to call
- it can't be done for auxiliary processes (walsummarizer for example) because they
currently do not register the new timeout handler
---
src/backend/replication/walreceiver.c | 10 ------
src/backend/replication/walsender.c | 36 ++------------------
src/backend/utils/activity/pgstat_relation.c | 13 -------
src/test/recovery/t/001_stream_rep.pl | 1 +
src/test/subscription/t/001_rep_changes.pl | 1 +
5 files changed, 4 insertions(+), 57 deletions(-)
69.4% src/backend/replication/
23.4% src/backend/utils/activity/
3.5% src/test/recovery/t/
3.6% src/test/subscription/t/
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 11b7c114d3b..953ba97ed00 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -571,16 +571,6 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
*/
bool requestReply = false;
- /*
- * Report pending statistics to the cumulative stats
- * system. This location is useful for the report as it
- * is not within a tight loop in the WAL receiver, to
- * avoid bloating pgstats with requests, while also making
- * sure that the reports happen each time a status update
- * is sent.
- */
- pgstat_report_wal(false);
-
/*
* Check if time since last receive from primary has
* reached the configured limit.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a7214d0dc6f..9a136e35b48 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -94,14 +94,10 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
-#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-/* Minimum interval used by walsender for stats flushes, in ms */
-#define WALSENDER_STATS_FLUSH_INTERVAL 1000
-
/*
* Maximum data payload in a WAL data message. Must be >= XLOG_BLCKSZ.
*
@@ -1846,7 +1842,6 @@ WalSndWaitForWal(XLogRecPtr loc)
int wakeEvents;
uint32 wait_event = 0;
static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
- TimestampTz last_flush = 0;
/*
* Fast path to avoid acquiring the spinlock in case we already know we
@@ -1867,7 +1862,6 @@ WalSndWaitForWal(XLogRecPtr loc)
{
bool wait_for_standby_at_stop = false;
long sleeptime;
- TimestampTz now;
/* Clear any already-pending wakeups */
ResetLatch(MyLatch);
@@ -1973,8 +1967,7 @@ WalSndWaitForWal(XLogRecPtr loc)
* new WAL to be generated. (But if we have nothing to send, we don't
* want to wake on socket-writable.)
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
wakeEvents = WL_SOCKET_READABLE;
@@ -1983,15 +1976,6 @@ WalSndWaitForWal(XLogRecPtr loc)
Assert(wait_event != 0);
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
WalSndWait(wakeEvents, sleeptime, wait_event);
}
@@ -2894,8 +2878,6 @@ WalSndCheckTimeOut(void)
static void
WalSndLoop(WalSndSendDataCallback send_data)
{
- TimestampTz last_flush = 0;
-
/*
* Initialize the last reply timestamp. That enables timeout processing
* from hereon.
@@ -2985,9 +2967,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
* WalSndWaitForWal() handle any other blocking; idle receivers need
* its additional actions. For physical replication, also block if
* caught up; its send_data does not block.
- *
- * The IO statistics are reported in WalSndWaitForWal() for the
- * logical WAL senders.
*/
if ((WalSndCaughtUp && send_data != XLogSendLogical &&
!streamingDoneSending) ||
@@ -2995,7 +2974,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
{
long sleeptime;
int wakeEvents;
- TimestampTz now;
if (!streamingDoneReceiving)
wakeEvents = WL_SOCKET_READABLE;
@@ -3006,21 +2984,11 @@ WalSndLoop(WalSndSendDataCallback send_data)
* Use fresh timestamp, not last_processing, to reduce the chance
* of reaching wal_sender_timeout before sending a keepalive.
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
if (pq_is_send_pending())
wakeEvents |= WL_SOCKET_WRITEABLE;
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
/* Sleep until something happens or we time out */
WalSndWait(wakeEvents, sleeptime, WAIT_EVENT_WAL_SENDER_MAIN);
}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 04d21483d93..ae2952cae89 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -260,15 +260,6 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
}
pgstat_unlock_entry(entry_ref);
-
- /*
- * Flush IO statistics now. pgstat_report_stat() will flush IO stats,
- * however this will not be called until after an entire autovacuum cycle
- * is done -- which will likely vacuum many relations -- or until the
- * VACUUM command has processed all tables and committed.
- */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -360,10 +351,6 @@ pgstat_report_analyze(Relation rel,
}
pgstat_unlock_entry(entry_ref);
-
- /* see pgstat_report_vacuum() */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index e9ac67813c7..cfa095ff0a8 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -15,6 +15,7 @@ my $node_primary = PostgreSQL::Test::Cluster->new('primary');
$node_primary->init(
allows_streaming => 1,
auth_extra => [ '--create-role' => 'repl_role' ]);
+$node_primary->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_primary->start;
my $backup_name = 'my_backup';
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 7d41715ed81..29bae5e1121 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -11,6 +11,7 @@ use Test::More;
# Initialize publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_publisher->start;
# Create subscriber node
--
2.34.1
[text/x-diff] v8-0005-Change-RELATION-and-DATABASE-stats-to-anytime-flu.patch (34.2K, 6-v8-0005-Change-RELATION-and-DATABASE-stats-to-anytime-flu.patch)
From b3d0b50b0c3a258adf61db81464a2910af0f68c4 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 19 Jan 2026 06:27:55 +0000
Subject: [PATCH v8 5/5] Change RELATION and DATABASE stats to anytime flush
This commit allows mixing fields with different transaction behavior within
the same RELATION or DATABASE statistics kind: some fields are transactional
(e.g., tuple inserts/updates/deletes) while others are non-transactional
(e.g., sequential scans, blocks read).
It modifies the relation flush callback to handle the anytime_only parameter
introduced in commit <nnnn>.
Implementation details:
- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Change DATABASE from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Add an is_partial parameter to flush_pending_cb() so that
pgstat_flush_pending_entries() can distinguish partial flushes and keep the
pending entry after one
- Modify pgstat_relation_flush_cb() to handle the anytime_only parameter: when
true, flush only the non-transactional stats; when false, flush all the
stats. When true, it also clears the flushed fields from the pending stats to
prevent double-counting at the transaction boundary
DATABASE stats inherit the anytime flush behavior so that relation-derived
stats (tuples_returned, tuples_fetched, blocks_fetched, blocks_hit) are
visible while transactions are in progress.
Tests are added to verify the anytime flush behavior for mixed fields.
---
doc/src/sgml/monitoring.sgml | 37 ++++++-
src/backend/utils/activity/pgstat.c | 15 +--
src/backend/utils/activity/pgstat_database.c | 6 +-
src/backend/utils/activity/pgstat_function.c | 6 +-
src/backend/utils/activity/pgstat_relation.c | 92 ++++++++++++----
.../utils/activity/pgstat_subscription.c | 6 +-
src/include/pgstat.h | 27 ++++-
src/include/utils/pgstat_internal.h | 16 ++-
src/test/isolation/expected/stats.out | 102 ++++++++++++++++++
src/test/isolation/expected/stats_1.out | 102 ++++++++++++++++++
src/test/isolation/specs/stats.spec | 27 ++++-
.../test_custom_stats/test_custom_var_stats.c | 9 +-
12 files changed, 404 insertions(+), 41 deletions(-)
11.7% doc/src/sgml/
26.8% src/backend/utils/activity/
4.2% src/include/utils/
5.4% src/include/
45.1% src/test/isolation/expected/
4.7% src/test/isolation/specs/
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..f2321b631b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3767,6 +3767,19 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ Some statistics are updated while a transaction is in progress (for example,
+ <structfield>blks_read</structfield>, <structfield>blks_hit</structfield>,
+ <structfield>tup_returned</structfield> and <structfield>tup_fetched</structfield>).
+ Statistics that either do not depend on transactions or require transactional
+ consistency are updated only when the transaction ends. Statistics that require
+ transactional consistency include <structfield>xact_commit</structfield>,
+ <structfield>xact_rollback</structfield>, <structfield>tup_inserted</structfield>,
+ <structfield>tup_updated</structfield> and <structfield>tup_deleted</structfield>.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-database-conflicts-view">
@@ -3956,8 +3969,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_seq_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last sequential scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last sequential scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -3984,8 +3997,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_idx_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last index scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last index scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -4223,6 +4236,15 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ The <structfield>seq_scan</structfield>, <structfield>last_seq_scan</structfield>,
+ <structfield>seq_tup_read</structfield>, <structfield>idx_scan</structfield>,
+ <structfield>last_idx_scan</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-all-indexes-view">
@@ -4404,6 +4426,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
tuples (see <xref linkend="indexes-multicolumn"/>).
</para>
</note>
+ <note>
+ <para>
+ The <structfield>idx_scan</structfield>, <structfield>last_idx_scan</structfield>,
+ <structfield>idx_tup_read</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
<tip>
<para>
<command>EXPLAIN ANALYZE</command> outputs the total number of index
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index dd85a27c52f..a20a87709c6 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -291,7 +291,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -309,7 +309,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -1347,7 +1347,8 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
*
- * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries. For entries
+ * that support it, the callback may flush only non-transactional fields.
* This is safe to call inside transactions.
*
* If anytime_only is false, flushes all entries.
@@ -1378,6 +1379,7 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
PgStat_Kind kind = key.kind;
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
bool did_flush;
+ bool is_partial_flush = false;
dlist_node *next;
Assert(!kind_info->fixed_amount);
@@ -1398,7 +1400,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
}
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait,
+ anytime_only, &is_partial_flush);
Assert(did_flush || nowait);
@@ -1408,8 +1411,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
else
next = NULL;
- /* if successfully flushed, remove entry */
- if (did_flush)
+ /* if successful non-partial flush, remove entry */
+ if (did_flush && !is_partial_flush)
pgstat_delete_pending_entry(entry_ref);
else
have_pending = true;
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 8e86df60461..59dd0790fd7 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,8 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
@@ -443,6 +444,9 @@ pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
pendingent = (PgStat_StatDBEntry *) entry_ref->pending;
sharedent = (PgStatShared_Database *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index 5ba4958382f..44193c93fc7 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,7 +190,8 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
@@ -200,6 +201,9 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index ae2952cae89..62363dacfe1 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,7 +47,19 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry,
+ PgStat_TableCounts *counts, bool anytime_only);
+/*
+ * Update database statistics with non-transactional stats.
+ */
+#define UPDATE_DATABASE_ANYTIME_STATS(dbentry, counts) \
+ do { \
+ (dbentry)->tuples_returned += (counts)->tuples_returned; \
+ (dbentry)->tuples_fetched += (counts)->tuples_fetched; \
+ (dbentry)->blocks_fetched += (counts)->blocks_fetched; \
+ (dbentry)->blocks_hit += (counts)->blocks_hit; \
+ } while (0)
/*
* Copy stats between relations. This is used for things like REINDEX
@@ -789,6 +801,29 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
rec->tuples_inserted + rec->tuples_updated;
}
+/*
+ * Helper function to flush non-transactional statistics.
+ */
+static void
+flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry, PgStat_TableCounts *counts,
+ bool anytime_only)
+{
+ TimestampTz t;
+
+ tabentry->numscans += counts->numscans;
+ if (counts->numscans)
+ {
+ t = anytime_only ? GetCurrentTimestamp() : GetCurrentTransactionStopTimestamp();
+ if (t > tabentry->lastscan)
+ tabentry->lastscan = t;
+ }
+
+ tabentry->tuples_returned += counts->tuples_returned;
+ tabentry->tuples_fetched += counts->tuples_fetched;
+ tabentry->blocks_fetched += counts->blocks_fetched;
+ tabentry->blocks_hit += counts->blocks_hit;
+}
+
/*
* Flush out pending stats for the entry
*
@@ -797,9 +832,17 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
*
* Some of the stats are copied to the corresponding pending database stats
* entry when successfully flushing.
+ *
+ * If anytime_only is true, only non-transactional fields are flushed
+ * (numscans, tuples_returned, tuples_fetched, blocks_fetched, blocks_hit).
+ * Transactional fields remain pending until transaction boundary.
+ *
+ * Some of the stats are copied to the corresponding pending database stats
+ * entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -807,12 +850,13 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
- Assert(!anytime_only);
-
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+ /* this is a partial flush if in anytime only mode */
+ *is_partial = anytime_only;
+
/*
* Ignore entries that didn't accumulate any actual counts, such as
* indexes that were opened by the planner but not used.
@@ -824,19 +868,36 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
- /* add the values to the shared entry. */
tabentry = &shtabstats->stats;
- tabentry->numscans += lstats->counts.numscans;
- if (lstats->counts.numscans)
+ if (anytime_only)
{
- TimestampTz t = GetCurrentTransactionStopTimestamp();
- if (t > tabentry->lastscan)
- tabentry->lastscan = t;
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, true);
+
+ pgstat_unlock_entry(entry_ref);
+
+ /* Also update the corresponding fields in database stats */
+ dbentry = pgstat_prep_database_pending(dboid);
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
+ /*
+ * Clear the flushed fields from pending stats to prevent
+ * double-counting when we flush all fields at transaction boundary.
+ */
+ lstats->counts.numscans = 0;
+ lstats->counts.tuples_returned = 0;
+ lstats->counts.tuples_fetched = 0;
+ lstats->counts.blocks_fetched = 0;
+ lstats->counts.blocks_hit = 0;
+
+ return true;
}
- tabentry->tuples_returned += lstats->counts.tuples_returned;
- tabentry->tuples_fetched += lstats->counts.tuples_fetched;
+
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, false);
+
tabentry->tuples_inserted += lstats->counts.tuples_inserted;
tabentry->tuples_updated += lstats->counts.tuples_updated;
tabentry->tuples_deleted += lstats->counts.tuples_deleted;
@@ -866,9 +927,6 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
*/
tabentry->ins_since_vacuum += lstats->counts.tuples_inserted;
- tabentry->blocks_fetched += lstats->counts.blocks_fetched;
- tabentry->blocks_hit += lstats->counts.blocks_hit;
-
/* Clamp live_tuples in case of negative delta_live_tuples */
tabentry->live_tuples = Max(tabentry->live_tuples, 0);
/* Likewise for dead_tuples */
@@ -878,13 +936,11 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
/* The entry was successfully flushed, add the same to database stats */
dbentry = pgstat_prep_database_pending(dboid);
- dbentry->tuples_returned += lstats->counts.tuples_returned;
- dbentry->tuples_fetched += lstats->counts.tuples_fetched;
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
dbentry->tuples_inserted += lstats->counts.tuples_inserted;
dbentry->tuples_updated += lstats->counts.tuples_updated;
dbentry->tuples_deleted += lstats->counts.tuples_deleted;
- dbentry->blocks_fetched += lstats->counts.blocks_fetched;
- dbentry->blocks_hit += lstats->counts.blocks_hit;
return true;
}
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index c4614817966..43fec86c635 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -116,7 +116,8 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
@@ -126,6 +127,9 @@ pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anyti
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ef856dbf55b..06639198f28 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -21,6 +21,7 @@
#include "utils/backend_status.h" /* for backward compatibility */ /* IWYU pragma: export */
#include "utils/pgstat_kind.h"
#include "utils/relcache.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h" /* for backward compatibility */ /* IWYU pragma: export */
@@ -537,10 +538,11 @@ extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
/*
- * Schedule the next anytime stats update timeout.
+ * Schedule the next anytime stats update timeout and mark that we have
+ * mixed anytime stats pending.
*
* This should be called whenever accumulating statistics that support
- * FLUSH_ANYTIME flushing mode.
+ * FLUSH_ANYTIME or FLUSH_MIXED flushing modes.
*/
#define pgstat_schedule_anytime_update() \
do { \
@@ -703,37 +705,58 @@ extern void pgstat_report_analyze(Relation rel,
#define pgstat_count_heap_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_getnext(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_fetch(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_tuples(rel, n) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned += (n); \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_read(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_hit(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_hit++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 607f4255268..1a2114aad8a 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -322,8 +322,10 @@ typedef struct PgStat_KindInfo
* that cannot use PgStat_EntryRef->pending.
*
* The anytime_only parameter indicates whether this is an anytime flush.
+ * The is_partial parameter indicates whether this is a partial flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait,
+ bool anytime_only, bool *is_partial);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -757,7 +759,8 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -765,7 +768,8 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -790,7 +794,8 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -858,7 +863,8 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/test/isolation/expected/stats.out b/src/test/isolation/expected/stats.out
index cfad309ccf3..11e3e57806d 100644
--- a/src/test/isolation/expected/stats.out
+++ b/src/test/isolation/expected/stats.out
@@ -2245,6 +2245,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/expected/stats_1.out b/src/test/isolation/expected/stats_1.out
index e1d937784cb..aef582e7582 100644
--- a/src/test/isolation/expected/stats_1.out
+++ b/src/test/isolation/expected/stats_1.out
@@ -2253,6 +2253,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/specs/stats.spec b/src/test/isolation/specs/stats.spec
index da16710da0f..47414eb6009 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -50,6 +50,8 @@ step s1_rollback { ROLLBACK; }
step s1_prepare_a { PREPARE TRANSACTION 'a'; }
step s1_commit_prepared_a { COMMIT PREPARED 'a'; }
step s1_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
+# Has to be greater than session 2 stats_flush_interval
+step s1_sleep { SELECT pg_sleep(1.5); }
# Function stats steps
step s1_ff { SELECT pg_stat_force_next_flush(); }
@@ -132,12 +134,16 @@ step s1_slru_check_stats {
session s2
-setup { SET stats_fetch_consistency = 'none'; }
+setup {
+ SET stats_fetch_consistency = 'none';
+ SET stats_flush_interval = '1s';
+}
step s2_begin { BEGIN; }
step s2_commit { COMMIT; }
step s2_commit_prepared_a { COMMIT PREPARED 'a'; }
step s2_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
step s2_ff { SELECT pg_stat_force_next_flush(); }
+step s2_table_drop { DROP TABLE test_stat_tab; }
# Function stats steps
step s2_track_funcs_all { SET track_functions = 'all'; }
@@ -156,6 +162,8 @@ step s2_func_stats {
}
# Relation stats steps
+step s2_track_counts_on { SET track_counts = on; }
+step s2_track_counts_off { SET track_counts = off; }
step s2_table_select { SELECT * FROM test_stat_tab ORDER BY key, value; }
step s2_table_update_k1 { UPDATE test_stat_tab SET value = value + 1 WHERE key = 'k1';}
@@ -435,6 +443,23 @@ permutation
s1_table_drop
s1_table_stats
+### Check that some stats are updated (seq_scan and seq_tup_read)
+### while the transaction is still running
+permutation
+ s2_begin
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_off
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_on
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_table_drop
+ s2_commit
### Check that we don't count changes with track counts off, but allow access
### to prior stats
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 207e841911b..ffcda7b6c7a 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -84,7 +84,8 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait, bool anytime_only);
+ bool nowait, bool anytime_only,
+ bool *is_partial);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -152,7 +153,8 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
@@ -160,6 +162,9 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
--
2.34.1
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-18 11:37 ` Jakub Wartak <[email protected]>
2026-02-19 07:06 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
1 sibling, 1 reply; 22+ messages in thread
From: Jakub Wartak @ 2026-02-18 11:37 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Wed, Feb 18, 2026 at 6:41 AM Bertrand Drouvot
<[email protected]> wrote:
>
> Hi,
>
> On Tue, Feb 17, 2026 at 01:18:35PM -0600, Sami Imseih wrote:
> > >
> > > > I do not have any further comments on this patchset.
> > >
> > > Thanks for the review!
> >
> > I flipped this CF entry to Ready-for-committer
>
> Thanks!
>
> PFA a mandatory rebase (nothing that needs review) due to a92b809f9da1.
Hi Bertrand!
Thanks for working on this. I took a quick look at this patchset:
v8-0005: you start using pgstat_schedule_anytime_update() from really
hot macros like pgstat_count_buffer_hit() / pgstat_count_buffer_read()
or pgstat_count_heap_getnext(), e.g.:
#define pgstat_count_buffer_hit(rel)
do {
if (pgstat_should_count_relation(rel))
+ {
(rel)->pgstat_info->counts.blocks_hit++;
+ pgstat_schedule_anytime_update();
+ }
} while (0)
where #define pgstat_schedule_anytime_update() is
do {
if (IsUnderPostmaster &&
!get_timeout_active(ANYTIME_STATS_UPDATE_TIMEOUT))
enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT,
pgstat_flush_interval);
} while (0)
However, that function (get_timeout_active()) is not static inline, so I'm
wondering whether there would be a major performance impact. Those buffer
macros seem to be pretty heavy hitters, e.g. they are often called even per
single buffer in PinBufferForBlock():
pgstat_count_buffer_read(rel);
if (*foundPtr)
pgstat_count_buffer_hit(rel);
so it seems to be:
- often unnecessary double work (and, as get_timeout_active() is a function
call, the compiler probably won't optimize it away?)
- but the main question is: why do we need to recheck and re-enable the timer
so often from such hot places?
v8-0001: this patch modifies ProcessInterrupts() which checks for
AnytimeStatsUpdateTimeoutPending and it may happen that it takes LWLocks
(pgstat_report_anytime_stat()->pgstat_flush_pending_entries()->e.g.
pgstat_*_flush_cb()-> pgstat_lock_entry() -> LWLock)
(This is more a question than a finding): isn't it too risky to take that
LWLock from potentially random spots, as CHECK_FOR_INTERRUPTS() is
literally everywhere (~318 places)? Wouldn't it be safer to flush
from a couple of chosen places?
-J.
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-18 11:37 ` Re: Flush some statistics within running transactions Jakub Wartak <[email protected]>
@ 2026-02-19 07:06 ` Bertrand Drouvot <[email protected]>
0 siblings, 0 replies; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-19 07:06 UTC (permalink / raw)
To: Jakub Wartak <[email protected]>; +Cc: Sami Imseih <[email protected]>; Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi Jakub!
On Wed, Feb 18, 2026 at 12:37:01PM +0100, Jakub Wartak wrote:
> On Wed, Feb 18, 2026 at 6:41 AM Bertrand Drouvot
> <[email protected]> wrote:
> >
>
> Thanks for working on this. I took a quick look at this patchset:
Thanks!
> However, that function (get_timeout_active()) is not static inline, so I'm
> wondering whether there would be a major performance impact.
The get_timeout_active() function is very simple, but you are right to be
concerned about the extra function call in hot paths.
So, with a simple pgbench test such as:
pgbench -i -s 1
pgbench -n -c1 -j1 -f <(echo "SELECT count(*) FROM pgbench_accounts;") -T 60
1/ with the patch
The average tps over 5 runs is about 123.8, and perf report shows:
0.44% postgres postgres [.] get_timeout_active
0.43% <blabla>;ExecAgg;fetch_input_tuple;IndexOnlyNext;get_timeout_active
To compare with other solutions that do not call get_timeout_active(), let's
also check the IndexOnlyNext profile (it is the caller of get_timeout_active()
in the stack above):
12.05% postgres postgres [.] IndexOnlyNext
2/ without the patch
The average tps over 5 runs is about 124.9, and perf report shows:
11.59% postgres postgres [.] IndexOnlyNext
3/ with the patch and get_timeout_active() inlined
The average tps over 5 runs is about 124, and perf report shows:
10.89% postgres postgres [.] IndexOnlyNext
4/ with the patch and using a boolean instead of get_timeout_active()
The average tps over 5 runs is about 129, and perf report shows:
11.88% postgres postgres [.] IndexOnlyNext
I'm not 100% sure what to conclude here.
What I can see is that get_timeout_active() accounted for about 0.45% of the
profile. That's not a lot, but the call is simple enough to remove that I think
we should just do it.
> - but the main question is: why do we need to recheck and re-enable the timer
> so often from such hot places?
We need to ensure that an anytime flush gets scheduled whenever we update an
anytime stat. It's much the same idea as the existing "pgstat_report_fixed".
> v8-0001: this patch modifies ProcessInterrupts() which checks for
> AnytimeStatsUpdateTimeoutPending and it may happen that it takes LWLocks
> (pgstat_report_anytime_stat()->pgstat_flush_pending_entries()->e.g.
> pgstat_*_flush_cb()-> pgstat_lock_entry() -> LWLock)
>
> (This is more a question than a finding): isn't it too risky to take that
> LWLock from potentially random spots, as CHECK_FOR_INTERRUPTS() is
> literally everywhere (~318 places)? Wouldn't it be safer to flush
> from a couple of chosen places?
That's a good question. pgstat_report_stat() is already called in ProcessInterrupts(),
and pgstat_report_anytime_stat() does not do more than that function does, so I'm
tempted to say that we are fine here.
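For illustration, the pattern under discussion (the timeout handler only sets
flags, and the flush itself runs later from the interrupt-check point) can be
sketched as below. This is a minimal standalone sketch, not the patch's code:
every name is a hypothetical stand-in, raise() simulates the timer expiry, and
the "flush" is reduced to moving a counter.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical stand-ins for AnytimeStatsUpdateTimeoutPending / InterruptPending. */
static volatile sig_atomic_t anytime_flush_pending = 0;
static volatile sig_atomic_t interrupt_pending = 0;

static int pending_reads = 0;	/* backend-local counter, not yet visible */
static int shared_reads = 0;	/* stand-in for the shared stats entry */
static int flush_calls = 0;

/* Signal handler: only sets flags, never touches locks or shared state. */
static void
timeout_handler(int signo)
{
	(void) signo;
	anytime_flush_pending = 1;
	interrupt_pending = 1;
}

/* Stand-in for the flush; the real one may need to take a lock. */
static void
flush_anytime_stats(void)
{
	shared_reads += pending_reads;
	pending_reads = 0;
	flush_calls++;
}

/* Stand-in for the interrupt-check point: the flush happens here. */
static void
check_for_interrupts(void)
{
	if (!interrupt_pending)
		return;
	interrupt_pending = 0;

	if (anytime_flush_pending)
	{
		anytime_flush_pending = 0;
		flush_anytime_stats();
	}
}

/* Walks through one timer expiry; returns the number of flushes done. */
static int
demo_deferred_flush(void)
{
	signal(SIGALRM, timeout_handler);

	pending_reads = 3;
	check_for_interrupts();		/* timer has not fired: nothing happens */
	assert(shared_reads == 0);

	raise(SIGALRM);				/* simulate the timer expiring */
	assert(shared_reads == 0);	/* the handler only set the flags */

	check_for_interrupts();		/* the flush runs at the check point */
	return flush_calls;
}
```

Calling demo_deferred_flush() from a main() performs a single flush. The point
of the split is that lock acquisition never happens in signal-handler context,
only at a place where the code has explicitly said an interrupt may be served.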
PFA a new version that:
- Implements the boolean (i.e. "pgstat_pending_anytime") usage instead of the
get_timeout_active() call.
- Fixes a race in assign_stats_flush_interval() reported by Sami off thread.
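
As a rough sketch of the first item (assumptions: the names below are
hypothetical stand-ins, and the real macro in the attached patch may be shaped
differently), the hot path only tests a plain boolean and arms the timer on the
first update after a flush:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pgstat_pending_anytime. */
static bool pending_anytime = false;

static int	timer_arm_calls = 0;	/* how often the timer was armed */
static long blocks_hit = 0;			/* pending counter */
static long flushed_hits = 0;		/* stand-in for the shared counter */

/* Stand-in for arming the flush timeout (the comparatively costly call). */
static void
arm_flush_timer(void)
{
	timer_arm_calls++;
}

/* Hot-path macro: a boolean test, plus one timer call per flush cycle. */
#define schedule_anytime_update() \
	do { \
		if (!pending_anytime) \
		{ \
			arm_flush_timer(); \
			pending_anytime = true; \
		} \
	} while (0)

/* Stand-in for a hot counting macro such as a buffer-hit counter. */
static void
count_buffer_hit(void)
{
	blocks_hit++;
	schedule_anytime_update();
}

/* Stand-in for the periodic flush: clears the flag so the next update re-arms. */
static void
report_anytime_stat(void)
{
	flushed_hits += blocks_hit;
	blocks_hit = 0;
	pending_anytime = false;
}

/* Returns how many times the timer was armed over two flush cycles. */
static int
demo_hot_path(void)
{
	for (int i = 0; i < 1000; i++)
		count_buffer_hit();
	assert(timer_arm_calls == 1);	/* armed once, not 1000 times */

	report_anytime_stat();

	for (int i = 0; i < 10; i++)
		count_buffer_hit();
	return timer_arm_calls;
}
```

In this sketch 1010 counter updates result in two timer calls in total, which
is the property the boolean is meant to provide in the hot macros.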
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-diff] v9-0001-Add-pgstat_report_anytime_stat-for-periodic-stats.patch (42.8K, 2-v9-0001-Add-pgstat_report_anytime_stat-for-periodic-stats.patch)
From 1314c8096213e95807f45f1c54185f9329cb7dd3 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 5 Jan 2026 09:41:39 +0000
Subject: [PATCH v9 1/5] Add pgstat_report_anytime_stat() for periodic stats
flushing
Long running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal and produces
spikes when flushed.
This commit introduces pgstat_report_anytime_stat(), which flushes
non-transactional statistics even inside active transactions. A new timeout
handler fires every second (if enabled while adding pending stats) to call this
function, ensuring timely stats visibility without waiting for transaction completion.
Implementation details:
- Add PgStat_FlushMode enum to classify stats kinds:
* FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
* FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries
- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats()
to accept a boolean anytime_only parameter:
* When false: flushes all stats (existing behavior)
* When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY stats
- The flush_pending_cb and flush_static_cb callbacks now receive an anytime_only
boolean parameter. Most of the time it's not used (except for assertions), but it's
preparatory work for moving the relation stats to anytime (without introducing
a new callback).
- Add pgstat_schedule_anytime_update() macro to schedule the next anytime flush,
relying on PGSTAT_MIN_INTERVAL
The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate
flushing.
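
The anytime_only filtering described above can be sketched as follows. This is
a simplified standalone illustration, not the code in this patch: the StatKind
struct, the kinds[] table and the function names are hypothetical, and only the
FLUSH_ANYTIME / FLUSH_AT_TXN_BOUNDARY values mirror the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
	FLUSH_ANYTIME,				/* can be flushed inside a transaction */
	FLUSH_AT_TXN_BOUNDARY		/* must wait for the transaction boundary */
} FlushMode;

/* Hypothetical, heavily simplified description of a stats kind. */
typedef struct
{
	const char *name;
	FlushMode	flush_mode;
	int			pending;		/* backend-local, not yet visible */
	int			shared;			/* stand-in for the shared entry */
} StatKind;

static StatKind kinds[] = {
	{"wal", FLUSH_ANYTIME, 0, 0},
	{"io", FLUSH_ANYTIME, 0, 0},
	{"relation", FLUSH_AT_TXN_BOUNDARY, 0, 0},
};

#define NKINDS (sizeof(kinds) / sizeof(kinds[0]))

/*
 * Flush pending counters. With anytime_only = true, kinds that need a
 * transaction boundary are skipped and stay pending.
 * Returns true if something is still pending afterwards.
 */
static bool
flush_stats(bool anytime_only)
{
	bool		have_pending = false;

	for (size_t i = 0; i < NKINDS; i++)
	{
		StatKind   *k = &kinds[i];

		if (k->pending == 0)
			continue;

		if (anytime_only && k->flush_mode == FLUSH_AT_TXN_BOUNDARY)
		{
			have_pending = true;
			continue;
		}

		k->shared += k->pending;
		k->pending = 0;
	}

	return have_pending;
}

/* Mid-transaction flush, then end-of-transaction flush. */
static int
demo_flush_modes(void)
{
	kinds[0].pending = 5;
	kinds[1].pending = 7;
	kinds[2].pending = 2;

	/* inside the transaction: only wal and io become visible */
	assert(flush_stats(true));
	assert(kinds[0].shared == 5 && kinds[1].shared == 7);
	assert(kinds[2].shared == 0 && kinds[2].pending == 2);

	/* at the transaction boundary: everything is flushed */
	assert(!flush_stats(false));
	return kinds[2].shared;
}
```

In this sketch the relation counters only become visible on the second call,
which corresponds to the existing behavior when anytime_only is false.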
---
src/backend/access/transam/xlog.c | 6 +
src/backend/postmaster/bgwriter.c | 9 +-
src/backend/postmaster/checkpointer.c | 10 +-
src/backend/postmaster/startup.c | 2 +
src/backend/postmaster/walsummarizer.c | 9 +-
src/backend/postmaster/walwriter.c | 9 +-
src/backend/replication/walreceiver.c | 9 +-
src/backend/replication/walsender.c | 8 +-
src/backend/tcop/postgres.c | 12 ++
src/backend/utils/activity/pgstat.c | 120 +++++++++++++++---
src/backend/utils/activity/pgstat_backend.c | 13 +-
src/backend/utils/activity/pgstat_bgwriter.c | 2 +-
.../utils/activity/pgstat_checkpointer.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 2 +-
src/backend/utils/activity/pgstat_function.c | 4 +-
src/backend/utils/activity/pgstat_io.c | 10 +-
src/backend/utils/activity/pgstat_relation.c | 12 +-
src/backend/utils/activity/pgstat_slru.c | 6 +-
.../utils/activity/pgstat_subscription.c | 4 +-
src/backend/utils/activity/pgstat_wal.c | 10 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/init/postinit.c | 3 +
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 22 ++++
src/include/utils/pgstat_internal.h | 52 ++++++--
src/include/utils/timeout.h | 1 +
.../test_custom_stats/test_custom_var_stats.c | 4 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 278 insertions(+), 66 deletions(-)
10.5% src/backend/postmaster/
5.8% src/backend/replication/
50.9% src/backend/utils/activity/
5.9% src/backend/
18.8% src/include/utils/
6.6% src/include/
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13ec6225b85..d01b11c7470 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1085,6 +1085,9 @@ XLogInsertRecord(XLogRecData *rdata,
pgWalUsage.wal_fpi += num_fpi;
pgWalUsage.wal_fpi_bytes += fpi_bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/* Required for the flush of pending stats WAL data */
pgstat_report_fixed = true;
}
@@ -2066,6 +2069,9 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic)
pgWalUsage.wal_buffers_full++;
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/*
* Required for the flush of pending stats WAL data, per
* update of pgWalUsage.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0956bd39a85..059c601c3b8 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -49,7 +49,9 @@
#include "storage/smgr.h"
#include "storage/standby.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
/*
@@ -103,7 +105,7 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -113,6 +115,11 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* We just started, assume there has been either a shutdown or
* end-of-recovery snapshot.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index e03c19123bc..e11c4b099c8 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -66,8 +66,9 @@
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
-
+#include "utils/timeout.h"
/*----------
* Shared memory area for communication between checkpointer and backends
@@ -215,7 +216,7 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, ReqShutdownXLOG);
pqsignal(SIGTERM, SIG_IGN); /* ignore SIGTERM */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);
@@ -225,6 +226,11 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Initialize so that first time-driven event happens at the correct time.
*/
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index cdbe53dd262..4954fe425b7 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -32,6 +32,7 @@
#include "storage/standby.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/timeout.h"
@@ -245,6 +246,7 @@ StartupProcessMain(const void *startup_data, size_t startup_data_len)
RegisterTimeout(STANDBY_DEADLOCK_TIMEOUT, StandbyDeadLockHandler);
RegisterTimeout(STANDBY_TIMEOUT, StandbyTimeoutHandler);
RegisterTimeout(STANDBY_LOCK_TIMEOUT, StandbyLockTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
/*
* Unblock signals (they were blocked when the postmaster forked us)
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 742137edad6..f1bae9d23d6 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -48,6 +48,8 @@
#include "storage/shmem.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h"
/*
@@ -246,7 +248,7 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -268,6 +270,11 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* If an exception is encountered, processing resumes here.
*/
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 7c0e2809c17..bcf59227a00 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -61,7 +61,9 @@
#include "storage/smgr.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
/*
@@ -103,7 +105,7 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -113,6 +115,11 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Create a memory context that we will do all our work in. We do this so
* that we can reset the context during error recovery and thereby avoid
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 10e64a7d1f4..11b7c114d3b 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -77,7 +77,9 @@
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -252,7 +254,7 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, die); /* request shutdown */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -260,6 +262,11 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
/* Reset some signals that are accepted by postmaster but not here */
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/* Load the libpq-specific functions */
load_file("libpqwalreceiver", false);
if (WalReceiverFunctions == NULL)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..a7214d0dc6f 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1987,8 +1987,8 @@ WalSndWaitForWal(XLogRecPtr loc)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
@@ -3016,8 +3016,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 21de158adbb..2089de782d5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3564,6 +3564,18 @@ ProcessInterrupts(void)
pgstat_report_stat(true);
}
+ /*
+ * Flush stats outside of transaction boundary if the timeout fired.
+ * Unlike transactional stats, these can be flushed even inside a running
+ * transaction.
+ */
+ if (AnytimeStatsUpdateTimeoutPending)
+ {
+ AnytimeStatsUpdateTimeoutPending = false;
+
+ pgstat_report_anytime_stat(false);
+ }
+
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5a..419dc512d9b 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -112,6 +112,7 @@
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -122,8 +123,6 @@
* ----------
*/
-/* minimum interval non-forced stats flushes.*/
-#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -187,7 +186,8 @@ static void pgstat_init_snapshot_fixed(void);
static void pgstat_reset_after_failure(void);
-static bool pgstat_flush_pending_entries(bool nowait);
+static bool pgstat_flush_pending_entries(bool nowait, bool anytime_only);
+static bool pgstat_flush_fixed_stats(bool nowait, bool anytime_only);
static void pgstat_prep_snapshot(void);
static void pgstat_build_snapshot(void);
@@ -218,6 +218,12 @@ PgStat_LocalState pgStatLocal;
*/
bool pgstat_report_fixed = false;
+/*
+ * Track whether an anytime flush is pending, to avoid relying on
+ * get_timeout_active() in hot paths.
+ */
+bool pgstat_pending_anytime = false;
+
/* ----------
* Local data
*
@@ -288,6 +294,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -305,6 +312,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -321,6 +329,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
@@ -336,6 +345,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.accessed_across_databases = true,
@@ -353,6 +363,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_subscription_stats entries can be seen in all databases */
.accessed_across_databases = true,
@@ -370,6 +381,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = false,
+ .flush_mode = FLUSH_ANYTIME,
.accessed_across_databases = true,
@@ -436,6 +448,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -453,6 +466,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, slru),
.shared_ctl_off = offsetof(PgStat_ShmemControl, slru),
@@ -470,6 +484,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -775,23 +790,11 @@ pgstat_report_stat(bool force)
partial_flush = false;
/* flush of variable-numbered stats tracked in pending entries list */
- partial_flush |= pgstat_flush_pending_entries(nowait);
+ partial_flush |= pgstat_flush_pending_entries(nowait, false);
/* flush of other stats kinds */
if (pgstat_report_fixed)
- {
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-
- if (!kind_info)
- continue;
- if (!kind_info->flush_static_cb)
- continue;
-
- partial_flush |= kind_info->flush_static_cb(nowait);
- }
- }
+ partial_flush |= pgstat_flush_fixed_stats(nowait, false);
last_flush = now;
@@ -1293,7 +1296,8 @@ pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *creat
if (entry_ref->pending == NULL)
{
- size_t entrysize = pgstat_get_kind_info(kind)->pending_size;
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ size_t entrysize = kind_info->pending_size;
Assert(entrysize != (size_t) -1);
@@ -1345,9 +1349,14 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * This is safe to call inside transactions.
+ *
+ * If anytime_only is false, flushes all entries.
*/
static bool
-pgstat_flush_pending_entries(bool nowait)
+pgstat_flush_pending_entries(bool nowait, bool anytime_only)
{
bool have_pending = false;
dlist_node *cur = NULL;
@@ -1377,8 +1386,22 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
Assert(did_flush || nowait);
@@ -1402,6 +1425,33 @@ pgstat_flush_pending_entries(bool nowait)
return have_pending;
}
+/*
+ * Flush fixed-amount stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME stats (safe inside transactions).
+ * If anytime_only is false, flushes all stats with flush_static_cb.
+ */
+static bool
+pgstat_flush_fixed_stats(bool nowait, bool anytime_only)
+{
+ bool partial_flush = false;
+
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+ if (!kind_info || !kind_info->flush_static_cb)
+ continue;
+
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ continue;
+
+ partial_flush |= kind_info->flush_static_cb(nowait, anytime_only);
+ }
+
+ return partial_flush;
+}
/* ------------------------------------------------------------
* Helper / infrastructure functions
@@ -2119,3 +2169,33 @@ assign_stats_fetch_consistency(int newval, void *extra)
if (pgstat_fetch_consistency != newval)
force_stats_snapshot_clear = true;
}
+
+/*
+ * Flushes only FLUSH_ANYTIME stats, without waiting for locks unless force is
+ * true. Transactional stats (FLUSH_AT_TXN_BOUNDARY) remain pending until the
+ * transaction boundary. Safe to call inside transactions.
+ */
+void
+pgstat_report_anytime_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+
+ pgstat_pending_anytime = false;
+}
+
+/*
+ * Timeout handler for flushing anytime stats.
+ */
+void
+AnytimeStatsUpdateTimeoutHandler(void)
+{
+ AnytimeStatsUpdateTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index f2f8d3ff75f..b09316d3ab3 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -31,6 +31,7 @@
#include "storage/procarray.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
/*
* Backend statistics counts waiting to be flushed out. These counters may be
@@ -66,6 +67,9 @@ pgstat_count_backend_io_op_time(IOObject io_object, IOContext io_context,
INSTR_TIME_ADD(PendingBackendStats.pending_io.pending_times[io_object][io_context][io_op],
io_time);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -82,6 +86,9 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -268,7 +275,7 @@ pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref)
* if some statistics could not be flushed due to lock contention.
*/
bool
-pgstat_flush_backend(bool nowait, bits32 flags)
+pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only)
{
PgStat_EntryRef *entry_ref;
bool has_pending_data = false;
@@ -311,9 +318,9 @@ pgstat_flush_backend(bool nowait, bits32 flags)
* If some stats could not be flushed due to lock contention, return true.
*/
bool
-pgstat_backend_flush_cb(bool nowait)
+pgstat_backend_flush_cb(bool nowait, bool anytime_only)
{
- return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL);
+ return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL, anytime_only);
}
/*
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index ed2fd801189..1c5f0c3ec40 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -61,7 +61,7 @@ pgstat_report_bgwriter(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index 1f70194b7a7..2d89a082464 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -68,7 +68,7 @@ pgstat_report_checkpointer(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..8e86df60461 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,7 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..5ba4958382f 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,11 +190,13 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
+ Assert(!anytime_only);
+
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 28de24538dc..7cd32900236 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -19,6 +19,7 @@
#include "executor/instrument.h"
#include "storage/bufmgr.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
static PgStat_PendingIO PendingIOStats;
static bool have_iostats = false;
@@ -79,6 +80,9 @@ pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op,
/* Add the per-backend counts */
pgstat_count_backend_io_op(io_object, io_context, io_op, cnt, bytes);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_iostats = true;
pgstat_report_fixed = true;
}
@@ -172,9 +176,9 @@ pgstat_fetch_stat_io(void)
* Simpler wrapper of pgstat_io_flush_cb()
*/
void
-pgstat_flush_io(bool nowait)
+pgstat_flush_io(bool nowait, bool anytime_only)
{
- (void) pgstat_io_flush_cb(nowait);
+ (void) pgstat_io_flush_cb(nowait, anytime_only);
}
/*
@@ -186,7 +190,7 @@ pgstat_flush_io(bool nowait)
* acquired. Otherwise, return false.
*/
bool
-pgstat_io_flush_cb(bool nowait)
+pgstat_io_flush_cb(bool nowait, bool anytime_only)
{
LWLock *bktype_lock;
PgStat_BktypeIO *bktype_shstats;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..04d21483d93 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -267,8 +267,8 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
* is done -- which will likely vacuum many relations -- or until the
* VACUUM command has processed all tables and committed.
*/
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -362,8 +362,8 @@ pgstat_report_analyze(Relation rel,
pgstat_unlock_entry(entry_ref);
/* see pgstat_report_vacuum() */
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -812,7 +812,7 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
* entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -820,6 +820,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
+ Assert(!anytime_only);
+
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..bf8a4d58673 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -19,6 +19,7 @@
#include "utils/pgstat_internal.h"
#include "utils/timestamp.h"
+#include "utils/timeout.h"
static inline PgStat_SLRUStats *get_slru_entry(int slru_idx);
@@ -139,7 +140,7 @@ pgstat_get_slru_index(const char *name)
* acquired. Otherwise return false.
*/
bool
-pgstat_slru_flush_cb(bool nowait)
+pgstat_slru_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_SLRU *stats_shmem = &pgStatLocal.shmem->slru;
int i;
@@ -223,6 +224,9 @@ get_slru_entry(int slru_idx)
Assert((slru_idx >= 0) && (slru_idx < SLRU_NUM_ELEMENTS));
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_slrustats = true;
pgstat_report_fixed = true;
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 500b1899188..c4614817966 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -116,11 +116,13 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
+ Assert(!anytime_only);
+
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 183e0a7a97b..2c2f3f10e10 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -51,12 +51,12 @@ pgstat_report_wal(bool force)
nowait = !force;
/* flush wal stats */
- (void) pgstat_wal_flush_cb(nowait);
- pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL);
+ (void) pgstat_wal_flush_cb(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL, true);
/* flush IO stats */
- pgstat_flush_io(nowait);
- (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -88,7 +88,7 @@ pgstat_wal_have_pending(void)
* acquired. Otherwise return false.
*/
bool
-pgstat_wal_flush_cb(bool nowait)
+pgstat_wal_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
WalUsage wal_usage_diff = {0};
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..ad44826c39e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -40,6 +40,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index b59e08605cc..eeeac1bf39a 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -64,6 +64,7 @@
#include "utils/injection_point.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
+#include "utils/pgstat_internal.h"
#include "utils/portal.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -773,6 +774,8 @@ InitPostgres(const char *in_dbname, Oid dboid,
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT,
+ AnytimeStatsUpdateTimeoutHandler);
}
/*
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..84e698da214 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -96,6 +96,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index fff7ecc2533..f0f546d419a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -35,6 +35,9 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/* Minimum interval between non-forced stats flushes, in milliseconds */
+#define PGSTAT_MIN_INTERVAL 1000
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -533,8 +536,24 @@ extern void pgstat_initialize(void);
/* Functions called from backends */
extern long pgstat_report_stat(bool force);
+extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
+/*
+ * Schedule the next anytime stats update timeout.
+ *
+ * This should be called whenever accumulating statistics that support
+ * FLUSH_ANYTIME flushing mode.
+ */
+#define pgstat_schedule_anytime_update() \
+ do { \
+ if (IsUnderPostmaster && !pgstat_pending_anytime) \
+ { \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ pgstat_pending_anytime = true; \
+ } \
+ } while (0)
+
extern void pgstat_reset_counters(void);
extern void pgstat_reset(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_reset_of_kind(PgStat_Kind kind);
@@ -808,6 +827,8 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
* Variables in pgstat.c
*/
+extern PGDLLIMPORT bool pgstat_pending_anytime;
+
/* GUC parameters */
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
@@ -851,4 +872,5 @@ extern PGDLLIMPORT PgStat_Counter pgStatTransactionIdleTime;
/* updated by the traffic cop and in errfinish() */
extern PGDLLIMPORT SessionEndType pgStatSessionEndCause;
+
#endif /* PGSTAT_H */
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed..607f4255268 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,19 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Flush mode for statistics kinds.
+ *
+ * FLUSH_AT_TXN_BOUNDARY has to be the first because we want it to be the
+ * default value.
+ */
+typedef enum PgStat_FlushMode
+{
+ FLUSH_AT_TXN_BOUNDARY, /* All fields can only be flushed at
+ * transaction boundary */
+ FLUSH_ANYTIME, /* All fields can be flushed anytime,
+ * including within transactions */
+} PgStat_FlushMode;
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +264,16 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * When stats of this kind may be flushed. See PgStat_FlushMode for details.
+ *
+ * This member only has meaning for statistics kinds that accumulate
+ * pending stats and use flush callbacks. For kinds that write directly to
+ * shared memory (e.g., archiver, bgwriter, checkpointer), this member has
+ * no effect.
+ */
+ PgStat_FlushMode flush_mode;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
@@ -297,8 +320,10 @@ typedef struct PgStat_KindInfo
* For variable-numbered stats: flush pending stats. Required if pending
* data is used. See flush_static_cb when dealing with stats data that
* that cannot use PgStat_EntryRef->pending.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -366,8 +391,10 @@ typedef struct PgStat_KindInfo
*
* "pgstat_report_fixed" needs to be set to trigger the flush of pending
* stats.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_static_cb) (bool nowait);
+ bool (*flush_static_cb) (bool nowait, bool anytime_only);
/*
* For fixed-numbered statistics: Reset All.
@@ -677,6 +704,7 @@ extern PgStat_EntryRef *pgstat_fetch_pending_entry(PgStat_Kind kind,
extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_snapshot_fixed(PgStat_Kind kind);
+extern void AnytimeStatsUpdateTimeoutHandler(void);
/*
@@ -696,8 +724,8 @@ extern void pgstat_archiver_snapshot_cb(void);
#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */
#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL)
-extern bool pgstat_flush_backend(bool nowait, bits32 flags);
-extern bool pgstat_backend_flush_cb(bool nowait);
+extern bool pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only);
+extern bool pgstat_backend_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_backend_reset_timestamp_cb(PgStatShared_Common *header,
TimestampTz ts);
@@ -729,7 +757,7 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -737,7 +765,7 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -745,9 +773,9 @@ extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_io.c
*/
-extern void pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait, bool anytime_only);
-extern bool pgstat_io_flush_cb(bool nowait);
+extern bool pgstat_io_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_io_init_shmem_cb(void *stats);
extern void pgstat_io_reset_all_cb(TimestampTz ts);
extern void pgstat_io_snapshot_cb(void);
@@ -762,7 +790,7 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -809,7 +837,7 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
* Functions in pgstat_slru.c
*/
-extern bool pgstat_slru_flush_cb(bool nowait);
+extern bool pgstat_slru_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_slru_init_shmem_cb(void *stats);
extern void pgstat_slru_reset_all_cb(TimestampTz ts);
extern void pgstat_slru_snapshot_cb(void);
@@ -820,7 +848,7 @@ extern void pgstat_slru_snapshot_cb(void);
*/
extern void pgstat_wal_init_backend_cb(void);
-extern bool pgstat_wal_flush_cb(bool nowait);
+extern bool pgstat_wal_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_wal_init_shmem_cb(void *stats);
extern void pgstat_wal_reset_all_cb(TimestampTz ts);
extern void pgstat_wal_snapshot_cb(void);
@@ -830,7 +858,7 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 0965b590b34..10723bb664c 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -35,6 +35,7 @@ typedef enum TimeoutId
IDLE_SESSION_TIMEOUT,
IDLE_STATS_UPDATE_TIMEOUT,
CLIENT_CONNECTION_CHECK_TIMEOUT,
+ ANYTIME_STATS_UPDATE_TIMEOUT,
STARTUP_PROGRESS_TIMEOUT,
/* First user-definable timeout reason */
USER_TIMEOUT,
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 64a8fe63cce..bc0b5d6e0eb 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -83,7 +83,7 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait);
+ bool nowait, bool anytime_only);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -150,7 +150,7 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..1dbc4b96f51 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2271,6 +2271,7 @@ PgStat_Counter
PgStat_EntryRef
PgStat_EntryRefHashEntry
PgStat_FetchConsistency
+PgStat_FlushMode
PgStat_FunctionCallUsage
PgStat_FunctionCounts
PgStat_HashKey
--
2.34.1
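
The filtering rule that 0001 adds to pgstat_flush_fixed_stats() can be modeled outside the tree. What follows is a minimal standalone sketch, not code from the patch: the FlushMode enum mirrors PgStat_FlushMode, while KindInfo, kinds[], flush_fixed_stats() and the two callbacks are hypothetical stand-ins introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors PgStat_FlushMode: the zero value is the transactional default. */
typedef enum FlushMode
{
    FLUSH_AT_TXN_BOUNDARY,
    FLUSH_ANYTIME
} FlushMode;

/* Hypothetical, cut-down stand-in for PgStat_KindInfo. */
typedef struct KindInfo
{
    FlushMode   flush_mode;
    bool        (*flush_static_cb) (bool nowait, bool anytime_only);
} KindInfo;

static int  io_flushes = 0;
static int  txn_flushes = 0;

/* Callbacks return true if something could NOT be flushed. */
static bool
io_flush_cb(bool nowait, bool anytime_only)
{
    (void) nowait;
    (void) anytime_only;
    io_flushes++;
    return false;
}

static bool
txn_flush_cb(bool nowait, bool anytime_only)
{
    (void) nowait;
    /* A transactional kind must never be reached by an anytime flush. */
    assert(!anytime_only);
    txn_flushes++;
    return false;
}

static const KindInfo kinds[] = {
    {FLUSH_ANYTIME, io_flush_cb},
    {FLUSH_AT_TXN_BOUNDARY, txn_flush_cb},
    {FLUSH_AT_TXN_BOUNDARY, NULL},  /* kind without a flush callback */
};

/* Same shape as the loop in the patch: skip kinds that cannot be flushed now. */
static bool
flush_fixed_stats(bool nowait, bool anytime_only)
{
    bool        partial_flush = false;

    for (size_t i = 0; i < sizeof(kinds) / sizeof(kinds[0]); i++)
    {
        const KindInfo *kind_info = &kinds[i];

        if (kind_info->flush_static_cb == NULL)
            continue;

        if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
            continue;

        partial_flush |= kind_info->flush_static_cb(nowait, anytime_only);
    }

    return partial_flush;
}
```

In this model, flush_fixed_stats(true, true) reaches only the FLUSH_ANYTIME callback, while flush_fixed_stats(false, false) reaches every kind that has a callback. Because FLUSH_AT_TXN_BOUNDARY is the zero value, a kind that never sets flush_mode stays transactional.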
[text/x-diff] v9-0002-Add-anytime-flush-tests-for-custom-stats.patch (9.0K, 3-v9-0002-Add-anytime-flush-tests-for-custom-stats.patch)
From 0199255c67c2ab8b63418c1c5ccef5b6aa1396f2 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Thu, 5 Feb 2026 05:54:34 +0000
Subject: [PATCH v9 2/5] Add anytime flush tests for custom stats
---
.../test_custom_stats/t/001_custom_stats.pl | 41 +++++++++++++
.../test_custom_fixed_stats--1.0.sql | 5 ++
.../test_custom_fixed_stats.c | 57 +++++++++++++++++++
.../test_custom_var_stats--1.0.sql | 5 ++
.../test_custom_stats/test_custom_var_stats.c | 27 +++++++++
5 files changed, 135 insertions(+)
33.8% src/test/modules/test_custom_stats/t/
66.1% src/test/modules/test_custom_stats/
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 9e6a7a38577..7be1b281776 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -156,5 +156,46 @@ $result = $node->safe_psql('postgres',
);
is($result, "0", "report of fixed-sized after manual reset");
+# Test FLUSH_ANYTIME mechanism with custom fixed stats
+# This verifies that custom stats can be flushed during a transaction
+
+# Reset stats first
+$node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+my $anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select 'anytime:'||numcalls from test_custom_stats_fixed_report();
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^anytime:2/m,
+ "anytime fixed stats flushed during transaction");
+
+# Test FLUSH_ANYTIME mechanism with custom variable stats
+# This verifies that custom stats can be flushed during a transaction
+
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+$anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_var_anytime_update('entry2');
+ select test_custom_stats_var_anytime_update('entry2');
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select * from test_custom_stats_var_report('entry2');
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^entry2\|2\|/m,
+ "anytime var stats flushed during transaction");
+
# Test completed successfully
done_testing();
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
index 69a93b5241f..da3a798f289 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
@@ -18,3 +18,8 @@ CREATE FUNCTION test_custom_stats_fixed_reset()
RETURNS void
AS 'MODULE_PATHNAME', 'test_custom_stats_fixed_reset'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_fixed_anytime_update()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..30b0fbcbdc7 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -18,6 +18,7 @@
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_fixed_stats",
@@ -43,11 +44,13 @@ typedef struct PgStatShared_CustomFixedEntry
static void test_custom_stats_fixed_init_shmem_cb(void *stats);
static void test_custom_stats_fixed_reset_all_cb(TimestampTz ts);
static void test_custom_stats_fixed_snapshot_cb(void);
+static bool test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only);
static const PgStat_KindInfo custom_stats = {
.name = "test_custom_fixed_stats",
.fixed_amount = true, /* exactly one entry */
.write_to_file = true, /* persist to stats file */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.shared_size = sizeof(PgStat_StatCustomFixedEntry),
.shared_data_off = offsetof(PgStatShared_CustomFixedEntry, stats),
@@ -56,8 +59,12 @@ static const PgStat_KindInfo custom_stats = {
.init_shmem_cb = test_custom_stats_fixed_init_shmem_cb,
.reset_all_cb = test_custom_stats_fixed_reset_all_cb,
.snapshot_cb = test_custom_stats_fixed_snapshot_cb,
+ .flush_static_cb = test_custom_stats_fixed_flush_cb,
};
+/* Pending statistics */
+static PgStat_StatCustomFixedEntry PendingCustomStats = {0};
+
/*
* Kind ID for test_custom_fixed_stats.
*/
@@ -141,6 +148,38 @@ test_custom_stats_fixed_snapshot_cb(void)
#undef FIXED_COMP
}
+/*
+ * test_custom_stats_fixed_flush_cb
+ * Flush pending stats to shared memory
+ */
+static bool
+test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only)
+{
+ PgStatShared_CustomFixedEntry *stats_shmem;
+
+ /* Nothing to flush if no calls were made */
+ if (PendingCustomStats.numcalls == 0)
+ return false;
+
+ stats_shmem = pgstat_get_custom_shmem_data(PGSTAT_KIND_TEST_CUSTOM_FIXED_STATS);
+
+ if (!nowait)
+ LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+ else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
+ return true;
+
+ pgstat_begin_changecount_write(&stats_shmem->changecount);
+ stats_shmem->stats.numcalls += PendingCustomStats.numcalls;
+ pgstat_end_changecount_write(&stats_shmem->changecount);
+
+ LWLockRelease(&stats_shmem->lock);
+
+ /* Reset pending stats */
+ PendingCustomStats.numcalls = 0;
+
+ return false; /* successfully flushed */
+}
+
/*--------------------------------------------------------------------------
* SQL-callable functions
*--------------------------------------------------------------------------
@@ -222,3 +261,21 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
/* Return as tuple */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * test_custom_stats_fixed_anytime_update
+ * Increment call counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_fixed_anytime_update);
+Datum
+test_custom_stats_fixed_anytime_update(PG_FUNCTION_ARGS)
+{
+ /* Accumulate in pending stats */
+ PendingCustomStats.numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+ pgstat_report_fixed = true;
+
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index 5ed8cfc2dcf..ed66d38981e 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -24,3 +24,8 @@ CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_var_anytime_update(IN name TEXT)
+RETURNS void
+AS 'MODULE_PATHNAME', 'test_custom_stats_var_anytime_update'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index bc0b5d6e0eb..207e841911b 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -17,6 +17,7 @@
#include "storage/dsm_registry.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_var_stats",
@@ -107,6 +108,7 @@ static const PgStat_KindInfo custom_stats = {
.name = "test_custom_var_stats",
.fixed_amount = false, /* variable number of entries */
.write_to_file = true, /* persist across restarts */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.track_entry_count = true, /* count active entries */
.accessed_across_databases = true, /* global statistics */
.shared_size = sizeof(PgStatShared_CustomVarEntry),
@@ -689,3 +691,28 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * test_custom_stats_var_anytime_update
+ * Increment custom statistic counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_var_anytime_update);
+Datum
+test_custom_stats_var_anytime_update(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_EntryRef *entry_ref;
+ PgStat_StatCustomVarEntry *pending_entry;
+
+ /* Get pending entry in local memory */
+ entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
+ PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), NULL);
+
+ pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
+ pending_entry->numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+
+ PG_RETURN_VOID();
+}
--
2.34.1
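
The test functions in 0002 rely on two behaviors from 0001: pgstat_schedule_anytime_update() arms the timeout only once until the next anytime flush, and a flush callback reports pending data when it cannot take the lock in nowait mode. The sketch below models both with plain variables. It is an illustration under stated assumptions, not PostgreSQL code: timers_armed, lock_busy, count_call(), flush_cb() and report_anytime_stat() are hypothetical names, and lock contention is simulated with a flag.

```c
#include <assert.h>
#include <stdbool.h>

static bool pending_anytime = false;    /* models pgstat_pending_anytime */
static int  timers_armed = 0;           /* counts mock enable_timeout_after() calls */
static long pending_numcalls = 0;       /* backend-local pending counter */
static long shared_numcalls = 0;        /* models the shared-memory counter */
static bool lock_busy = false;          /* mock: lock currently held elsewhere */

/* Models pgstat_schedule_anytime_update(): arm the timer at most once. */
static void
schedule_anytime_update(void)
{
    if (!pending_anytime)
    {
        timers_armed++;
        pending_anytime = true;
    }
}

/* Models a stats-accumulating call site. */
static void
count_call(void)
{
    pending_numcalls++;
    schedule_anytime_update();
}

/* Models a flush callback: returns true if pending data remains. */
static bool
flush_cb(bool nowait)
{
    if (pending_numcalls == 0)
        return false;

    if (lock_busy)
    {
        /* A blocking flush would wait here; this mock cannot model that. */
        assert(nowait);
        return true;
    }

    shared_numcalls += pending_numcalls;
    pending_numcalls = 0;
    return false;
}

/* Models pgstat_report_anytime_stat(): flush, then clear the pending flag. */
static bool
report_anytime_stat(bool force)
{
    bool        partial = flush_cb(!force);

    pending_anytime = false;
    return partial;
}
```

As in pgstat_report_anytime_stat() as posted, the model clears the flag even when the flush was skipped because of lock contention, so the skipped data is only picked up by a later flush, for example once the next counted event re-arms the timer.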
[text/x-diff] v9-0003-Add-GUC-to-specify-non-transactional-statistics-f.patch (9.8K, 4-v9-0003-Add-GUC-to-specify-non-transactional-statistics-f.patch)
From 5784c8acff4d51ccd2509be0cd72b760c5f45fa7 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Wed, 28 Jan 2026 07:53:13 +0000
Subject: [PATCH v9 3/5] Add GUC to specify non-transactional statistics flush
interval
Add stats_flush_interval, a new GUC that sets the interval between flushes of
non-transactional statistics.
---
doc/src/sgml/config.sgml | 32 +++++++++++++++++++
src/backend/utils/activity/pgstat.c | 13 ++++++++
src/backend/utils/misc/guc_parameters.dat | 10 ++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/backend/utils/misc/timeout.c | 6 ++++
src/include/pgstat.h | 6 ++--
src/include/utils/guc_hooks.h | 1 +
src/include/utils/timeout.h | 1 +
.../test_custom_stats/t/001_custom_stats.pl | 6 ++--
9 files changed, 70 insertions(+), 6 deletions(-)
51.0% doc/src/sgml/
10.6% src/backend/utils/activity/
15.9% src/backend/utils/misc/
3.6% src/include/utils/
9.0% src/include/
9.6% src/test/modules/test_custom_stats/t/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index faf0bdb62aa..03875b490b7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8932,6 +8932,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-stats-flush-interval" xreflabel="stats_flush_interval">
+ <term><varname>stats_flush_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>stats_flush_interval</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the interval at which certain statistics, which can be updated while a
+ transaction is in progress, are made visible. These include WAL activity
+ and I/O operations.
+ Such statistics are refreshed at the specified interval and can be observed
+ during active transactions in monitoring views such as
+ <link linkend="monitoring-pg-stat-wal-view"><structname>pg_stat_wal</structname></link>
+ and
+ <link linkend="monitoring-pg-stat-io-view"><structname>pg_stat_io</structname></link>.
+ If the value is specified without a unit, milliseconds are assumed.
+ The default is 10 seconds (<literal>10s</literal>), which is generally
+ the smallest practical value for long-running transactions.
+ </para>
+ <note>
+ <para>
+ This parameter does not affect statistics that are only reported at
+ transaction end, such as the columns of <structname>pg_stat_all_tables</structname>
+ (for example, <structfield>n_tup_ins</structfield>, <structfield>n_tup_upd</structfield>,
+ and <structfield>n_tup_del</structfield>). These statistics are always
+ flushed at the end of a transaction.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 419dc512d9b..578c575dbfb 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -123,6 +123,8 @@
* ----------
*/
+/* minimum interval between non-forced stats flushes */
+#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -203,6 +205,7 @@ static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
bool pgstat_track_counts = false;
int pgstat_fetch_consistency = PGSTAT_FETCH_CONSISTENCY_CACHE;
+int pgstat_flush_interval = 10000;
/* ----------
@@ -2170,6 +2173,16 @@ assign_stats_fetch_consistency(int newval, void *extra)
force_stats_snapshot_clear = true;
}
+/*
+ * GUC assign_hook for stats_flush_interval.
+ */
+void
+assign_stats_flush_interval(int newval, void *extra)
+{
+ if (get_all_timeouts_initialized())
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, newval);
+}
+
/*
* Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
* stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 271c033952e..d2734caafea 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2801,6 +2801,16 @@
assign_hook => 'assign_stats_fetch_consistency',
},
+{ name => 'stats_flush_interval', type => 'int', context => 'PGC_USERSET', group => 'STATS_CUMULATIVE',
+ short_desc => 'Sets the interval between flushes of non-transactional statistics.',
+ flags => 'GUC_UNIT_MS',
+ variable => 'pgstat_flush_interval',
+ boot_val => '10000',
+ min => '1000',
+ max => 'INT_MAX',
+ assign_hook => 'assign_stats_flush_interval'
+},
+
{ name => 'subtransaction_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
short_desc => 'Sets the size of the dedicated buffer pool used for the subtransaction cache.',
long_desc => '0 means use a fraction of "shared_buffers".',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..8bd37a25b38 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -688,6 +688,7 @@
#track_wal_io_timing = off
#track_functions = none # none, pl, all
#stats_fetch_consistency = cache # cache, none, snapshot
+#stats_flush_interval = 10s # in milliseconds
# - Monitoring -
diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c
index ddba5dc607c..85c4260d1db 100644
--- a/src/backend/utils/misc/timeout.c
+++ b/src/backend/utils/misc/timeout.c
@@ -828,3 +828,9 @@ get_timeout_finish_time(TimeoutId id)
{
return all_timeouts[id].fin_time;
}
+
+bool
+get_all_timeouts_initialized(void)
+{
+ return all_timeouts_initialized;
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index f0f546d419a..7829c563316 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -35,9 +35,6 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
-/* Minimum interval non-forced stats flushes */
-#define PGSTAT_MIN_INTERVAL 1000
-
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -549,7 +546,7 @@ extern void pgstat_force_next_flush(void);
do { \
if (IsUnderPostmaster && !pgstat_pending_anytime) \
{ \
- enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, pgstat_flush_interval); \
pgstat_pending_anytime = true; \
} \
} while (0)
@@ -833,6 +830,7 @@ extern PGDLLIMPORT bool pgstat_pending_anytime;
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
extern PGDLLIMPORT int pgstat_fetch_consistency;
+extern PGDLLIMPORT int pgstat_flush_interval;
/*
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 9c90670d9b8..9b5d2a90387 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -132,6 +132,7 @@ extern bool check_session_authorization(char **newval, void **extra, GucSource s
extern void assign_session_authorization(const char *newval, void *extra);
extern void assign_session_replication_role(int newval, void *extra);
extern void assign_stats_fetch_consistency(int newval, void *extra);
+extern void assign_stats_flush_interval(int newval, void *extra);
extern bool check_ssl(bool *newval, void **extra, GucSource source);
extern bool check_stage_log_stats(bool *newval, void **extra, GucSource source);
extern bool check_standard_conforming_strings(bool *newval, void **extra,
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 10723bb664c..fe7327de209 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -93,5 +93,6 @@ extern bool get_timeout_active(TimeoutId id);
extern bool get_timeout_indicator(TimeoutId id, bool reset_indicator);
extern TimestampTz get_timeout_start_time(TimeoutId id);
extern TimestampTz get_timeout_finish_time(TimeoutId id);
+extern bool get_all_timeouts_initialized(void);
#endif /* TIMEOUT_H */
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 7be1b281776..22e2a75dcb9 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -164,10 +164,11 @@ $node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
my $anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select 'anytime:'||numcalls from test_custom_stats_fixed_report();
@@ -183,11 +184,12 @@ like($result, qr/^anytime:2/m,
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
$anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_var_anytime_update('entry2');
select test_custom_stats_var_anytime_update('entry2');
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select * from test_custom_stats_var_report('entry2');
--
2.34.1
[text/x-diff] v9-0004-Remove-useless-calls-to-flush-some-stats.patch (7.7K, 5-v9-0004-Remove-useless-calls-to-flush-some-stats.patch)
From 4dd666f49b3b970e2520fe82b06d701ea37ef84b Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 6 Jan 2026 11:06:31 +0000
Subject: [PATCH v9 4/5] Remove useless calls to flush some stats
Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to report/flush some stats. Those calls were in place because
before commit <XXXX> stats were flushed only at transaction boundaries.
Note that:
- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for the checkpointer and bgworkers, for example, because they
don't have a flush callback to call
- it can't be done for auxiliary processes (the walsummarizer, for example)
because they currently do not register the new timeout handler
---
src/backend/replication/walreceiver.c | 10 ------
src/backend/replication/walsender.c | 36 ++------------------
src/backend/utils/activity/pgstat_relation.c | 13 -------
src/test/recovery/t/001_stream_rep.pl | 1 +
src/test/subscription/t/001_rep_changes.pl | 1 +
5 files changed, 4 insertions(+), 57 deletions(-)
69.4% src/backend/replication/
23.4% src/backend/utils/activity/
3.5% src/test/recovery/t/
3.6% src/test/subscription/t/
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 11b7c114d3b..953ba97ed00 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -571,16 +571,6 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
*/
bool requestReply = false;
- /*
- * Report pending statistics to the cumulative stats
- * system. This location is useful for the report as it
- * is not within a tight loop in the WAL receiver, to
- * avoid bloating pgstats with requests, while also making
- * sure that the reports happen each time a status update
- * is sent.
- */
- pgstat_report_wal(false);
-
/*
* Check if time since last receive from primary has
* reached the configured limit.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a7214d0dc6f..9a136e35b48 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -94,14 +94,10 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
-#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-/* Minimum interval used by walsender for stats flushes, in ms */
-#define WALSENDER_STATS_FLUSH_INTERVAL 1000
-
/*
* Maximum data payload in a WAL data message. Must be >= XLOG_BLCKSZ.
*
@@ -1846,7 +1842,6 @@ WalSndWaitForWal(XLogRecPtr loc)
int wakeEvents;
uint32 wait_event = 0;
static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
- TimestampTz last_flush = 0;
/*
* Fast path to avoid acquiring the spinlock in case we already know we
@@ -1867,7 +1862,6 @@ WalSndWaitForWal(XLogRecPtr loc)
{
bool wait_for_standby_at_stop = false;
long sleeptime;
- TimestampTz now;
/* Clear any already-pending wakeups */
ResetLatch(MyLatch);
@@ -1973,8 +1967,7 @@ WalSndWaitForWal(XLogRecPtr loc)
* new WAL to be generated. (But if we have nothing to send, we don't
* want to wake on socket-writable.)
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
wakeEvents = WL_SOCKET_READABLE;
@@ -1983,15 +1976,6 @@ WalSndWaitForWal(XLogRecPtr loc)
Assert(wait_event != 0);
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
WalSndWait(wakeEvents, sleeptime, wait_event);
}
@@ -2894,8 +2878,6 @@ WalSndCheckTimeOut(void)
static void
WalSndLoop(WalSndSendDataCallback send_data)
{
- TimestampTz last_flush = 0;
-
/*
* Initialize the last reply timestamp. That enables timeout processing
* from hereon.
@@ -2985,9 +2967,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
* WalSndWaitForWal() handle any other blocking; idle receivers need
* its additional actions. For physical replication, also block if
* caught up; its send_data does not block.
- *
- * The IO statistics are reported in WalSndWaitForWal() for the
- * logical WAL senders.
*/
if ((WalSndCaughtUp && send_data != XLogSendLogical &&
!streamingDoneSending) ||
@@ -2995,7 +2974,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
{
long sleeptime;
int wakeEvents;
- TimestampTz now;
if (!streamingDoneReceiving)
wakeEvents = WL_SOCKET_READABLE;
@@ -3006,21 +2984,11 @@ WalSndLoop(WalSndSendDataCallback send_data)
* Use fresh timestamp, not last_processing, to reduce the chance
* of reaching wal_sender_timeout before sending a keepalive.
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
if (pq_is_send_pending())
wakeEvents |= WL_SOCKET_WRITEABLE;
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
/* Sleep until something happens or we time out */
WalSndWait(wakeEvents, sleeptime, WAIT_EVENT_WAL_SENDER_MAIN);
}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 04d21483d93..ae2952cae89 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -260,15 +260,6 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
}
pgstat_unlock_entry(entry_ref);
-
- /*
- * Flush IO statistics now. pgstat_report_stat() will flush IO stats,
- * however this will not be called until after an entire autovacuum cycle
- * is done -- which will likely vacuum many relations -- or until the
- * VACUUM command has processed all tables and committed.
- */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -360,10 +351,6 @@ pgstat_report_analyze(Relation rel,
}
pgstat_unlock_entry(entry_ref);
-
- /* see pgstat_report_vacuum() */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index e9ac67813c7..cfa095ff0a8 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -15,6 +15,7 @@ my $node_primary = PostgreSQL::Test::Cluster->new('primary');
$node_primary->init(
allows_streaming => 1,
auth_extra => [ '--create-role' => 'repl_role' ]);
+$node_primary->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_primary->start;
my $backup_name = 'my_backup';
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 7d41715ed81..29bae5e1121 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -11,6 +11,7 @@ use Test::More;
# Initialize publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_publisher->start;
# Create subscriber node
--
2.34.1
[text/x-diff] v9-0005-Change-RELATION-and-DATABASE-stats-to-anytime-flu.patch (34.2K, 6-v9-0005-Change-RELATION-and-DATABASE-stats-to-anytime-flu.patch)
From c7a578378f8b70603a12e58d83cc3c6259006245 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 19 Jan 2026 06:27:55 +0000
Subject: [PATCH v9 5/5] Change RELATION and DATABASE stats to anytime flush
This commit allows mixing fields with different transaction behavior within
the same RELATION or DATABASE statistics kind: some fields are transactional
(e.g., tuple inserts/updates/deletes) while others are non-transactional
(e.g., sequential scans, blocks read).
It modifies the relation flush callback to handle the anytime_only parameter
introduced in commit <nnnn>.
Implementation details:
- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Change DATABASE from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Add an is_partial parameter to flush_pending_cb() so that
pgstat_flush_pending_entries() can distinguish partial flushes
- Modify pgstat_relation_flush_cb() to handle the anytime_only parameter: when
true, flush only the non-transactional stats; when false, flush all the
stats. When true, it also clears the flushed fields from the pending stats to
prevent double-counting at transaction boundary
DATABASE stats inherit the anytime flush behavior so that relation-derived
stats (tuples_returned, tuples_fetched, blocks_fetched, blocks_hit) are
visible while transactions are in progress.
Tests are added to verify the anytime flush behavior for mixed fields.
---
doc/src/sgml/monitoring.sgml | 37 ++++++-
src/backend/utils/activity/pgstat.c | 15 +--
src/backend/utils/activity/pgstat_database.c | 6 +-
src/backend/utils/activity/pgstat_function.c | 6 +-
src/backend/utils/activity/pgstat_relation.c | 92 ++++++++++++----
.../utils/activity/pgstat_subscription.c | 6 +-
src/include/pgstat.h | 27 ++++-
src/include/utils/pgstat_internal.h | 16 ++-
src/test/isolation/expected/stats.out | 102 ++++++++++++++++++
src/test/isolation/expected/stats_1.out | 102 ++++++++++++++++++
src/test/isolation/specs/stats.spec | 27 ++++-
.../test_custom_stats/test_custom_var_stats.c | 9 +-
12 files changed, 404 insertions(+), 41 deletions(-)
11.7% doc/src/sgml/
26.8% src/backend/utils/activity/
4.2% src/include/utils/
5.4% src/include/
45.1% src/test/isolation/expected/
4.7% src/test/isolation/specs/
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..f2321b631b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3767,6 +3767,19 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ Some statistics are updated while a transaction is in progress (for example,
+ <structfield>blks_read</structfield>, <structfield>blks_hit</structfield>,
+ <structfield>tup_returned</structfield> and <structfield>tup_fetched</structfield>).
+ Statistics that either do not depend on transactions or require transactional
+ consistency are updated only when the transaction ends. Statistics that require
+ transactional consistency include <structfield>xact_commit</structfield>,
+ <structfield>xact_rollback</structfield>, <structfield>tup_inserted</structfield>,
+ <structfield>tup_updated</structfield> and <structfield>tup_deleted</structfield>.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-database-conflicts-view">
@@ -3956,8 +3969,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_seq_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last sequential scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last sequential scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -3984,8 +3997,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_idx_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last index scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last index scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -4223,6 +4236,15 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ The <structfield>seq_scan</structfield>, <structfield>last_seq_scan</structfield>,
+ <structfield>seq_tup_read</structfield>, <structfield>idx_scan</structfield>,
+ <structfield>last_idx_scan</structfield> and <structfield>idx_tup_fetch</structfield>
+ columns are updated while transactions are in progress.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-all-indexes-view">
@@ -4404,6 +4426,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
tuples (see <xref linkend="indexes-multicolumn"/>).
</para>
</note>
+ <note>
+ <para>
+ The <structfield>idx_scan</structfield>, <structfield>last_idx_scan</structfield>,
+ <structfield>idx_tup_read</structfield> and <structfield>idx_tup_fetch</structfield>
+ columns are updated while transactions are in progress.
+ </para>
+ </note>
<tip>
<para>
<command>EXPLAIN ANALYZE</command> outputs the total number of index
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 578c575dbfb..273258a965a 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -297,7 +297,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -315,7 +315,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -1353,7 +1353,8 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
*
- * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries. For entries
+ * that support it, the callback may flush only non-transactional fields.
* This is safe to call inside transactions.
*
* If anytime_only is false, flushes all entries.
@@ -1384,6 +1385,7 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
PgStat_Kind kind = key.kind;
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
bool did_flush;
+ bool is_partial_flush = false;
dlist_node *next;
Assert(!kind_info->fixed_amount);
@@ -1404,7 +1406,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
}
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait,
+ anytime_only, &is_partial_flush);
Assert(did_flush || nowait);
@@ -1414,8 +1417,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
else
next = NULL;
- /* if successfully flushed, remove entry */
- if (did_flush)
+ /* if successful non-partial flush, remove entry */
+ if (did_flush && !is_partial_flush)
pgstat_delete_pending_entry(entry_ref);
else
have_pending = true;
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 8e86df60461..59dd0790fd7 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,8 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
@@ -443,6 +444,9 @@ pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
pendingent = (PgStat_StatDBEntry *) entry_ref->pending;
sharedent = (PgStatShared_Database *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index 5ba4958382f..44193c93fc7 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,7 +190,8 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
@@ -200,6 +201,9 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index ae2952cae89..62363dacfe1 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,7 +47,19 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry,
+ PgStat_TableCounts *counts, bool anytime_only);
+/*
+ * Update database statistics with non-transactional stats.
+ */
+#define UPDATE_DATABASE_ANYTIME_STATS(dbentry, counts) \
+ do { \
+ (dbentry)->tuples_returned += (counts)->tuples_returned; \
+ (dbentry)->tuples_fetched += (counts)->tuples_fetched; \
+ (dbentry)->blocks_fetched += (counts)->blocks_fetched; \
+ (dbentry)->blocks_hit += (counts)->blocks_hit; \
+ } while (0)
/*
* Copy stats between relations. This is used for things like REINDEX
@@ -789,6 +801,29 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
rec->tuples_inserted + rec->tuples_updated;
}
+/*
+ * Helper function to flush non-transactional statistics.
+ */
+static void
+flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry, PgStat_TableCounts *counts,
+ bool anytime_only)
+{
+ TimestampTz t;
+
+ tabentry->numscans += counts->numscans;
+ if (counts->numscans)
+ {
+ t = anytime_only ? GetCurrentTimestamp() : GetCurrentTransactionStopTimestamp();
+ if (t > tabentry->lastscan)
+ tabentry->lastscan = t;
+ }
+
+ tabentry->tuples_returned += counts->tuples_returned;
+ tabentry->tuples_fetched += counts->tuples_fetched;
+ tabentry->blocks_fetched += counts->blocks_fetched;
+ tabentry->blocks_hit += counts->blocks_hit;
+}
+
/*
* Flush out pending stats for the entry
*
@@ -797,9 +832,17 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
*
* Some of the stats are copied to the corresponding pending database stats
* entry when successfully flushing.
+ *
+ * If anytime_only is true, only non-transactional fields are flushed
+ * (numscans, tuples_returned, tuples_fetched, blocks_fetched, blocks_hit).
+ * Transactional fields remain pending until transaction boundary.
+ *
+ * Some of the stats are copied to the corresponding pending database stats
+ * entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -807,12 +850,13 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
- Assert(!anytime_only);
-
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+ /* this is a partial flush if in anytime only mode */
+ *is_partial = anytime_only;
+
/*
* Ignore entries that didn't accumulate any actual counts, such as
* indexes that were opened by the planner but not used.
@@ -824,19 +868,36 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
- /* add the values to the shared entry. */
tabentry = &shtabstats->stats;
- tabentry->numscans += lstats->counts.numscans;
- if (lstats->counts.numscans)
+ if (anytime_only)
{
- TimestampTz t = GetCurrentTransactionStopTimestamp();
- if (t > tabentry->lastscan)
- tabentry->lastscan = t;
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, true);
+
+ pgstat_unlock_entry(entry_ref);
+
+ /* Also update the corresponding fields in database stats */
+ dbentry = pgstat_prep_database_pending(dboid);
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
+ /*
+ * Clear the flushed fields from pending stats to prevent
+ * double-counting when we flush all fields at transaction boundary.
+ */
+ lstats->counts.numscans = 0;
+ lstats->counts.tuples_returned = 0;
+ lstats->counts.tuples_fetched = 0;
+ lstats->counts.blocks_fetched = 0;
+ lstats->counts.blocks_hit = 0;
+
+ return true;
}
- tabentry->tuples_returned += lstats->counts.tuples_returned;
- tabentry->tuples_fetched += lstats->counts.tuples_fetched;
+
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, false);
+
tabentry->tuples_inserted += lstats->counts.tuples_inserted;
tabentry->tuples_updated += lstats->counts.tuples_updated;
tabentry->tuples_deleted += lstats->counts.tuples_deleted;
@@ -866,9 +927,6 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
*/
tabentry->ins_since_vacuum += lstats->counts.tuples_inserted;
- tabentry->blocks_fetched += lstats->counts.blocks_fetched;
- tabentry->blocks_hit += lstats->counts.blocks_hit;
-
/* Clamp live_tuples in case of negative delta_live_tuples */
tabentry->live_tuples = Max(tabentry->live_tuples, 0);
/* Likewise for dead_tuples */
@@ -878,13 +936,11 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
/* The entry was successfully flushed, add the same to database stats */
dbentry = pgstat_prep_database_pending(dboid);
- dbentry->tuples_returned += lstats->counts.tuples_returned;
- dbentry->tuples_fetched += lstats->counts.tuples_fetched;
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
dbentry->tuples_inserted += lstats->counts.tuples_inserted;
dbentry->tuples_updated += lstats->counts.tuples_updated;
dbentry->tuples_deleted += lstats->counts.tuples_deleted;
- dbentry->blocks_fetched += lstats->counts.blocks_fetched;
- dbentry->blocks_hit += lstats->counts.blocks_hit;
return true;
}
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index c4614817966..43fec86c635 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -116,7 +116,8 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
@@ -126,6 +127,9 @@ pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anyti
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 7829c563316..87cbb539d7c 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -21,6 +21,7 @@
#include "utils/backend_status.h" /* for backward compatibility */ /* IWYU pragma: export */
#include "utils/pgstat_kind.h"
#include "utils/relcache.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h" /* for backward compatibility */ /* IWYU pragma: export */
@@ -537,10 +538,11 @@ extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
/*
- * Schedule the next anytime stats update timeout.
+ * Schedule the next anytime stats update timeout and mark that we have
+ * mixed anytime stats pending.
*
* This should be called whenever accumulating statistics that support
- * FLUSH_ANYTIME flushing mode.
+ * FLUSH_ANYTIME or FLUSH_MIXED flushing modes.
*/
#define pgstat_schedule_anytime_update() \
do { \
@@ -706,37 +708,58 @@ extern void pgstat_report_analyze(Relation rel,
#define pgstat_count_heap_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_getnext(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_fetch(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_tuples(rel, n) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned += (n); \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_read(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_hit(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_hit++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 607f4255268..1a2114aad8a 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -322,8 +322,10 @@ typedef struct PgStat_KindInfo
* that cannot use PgStat_EntryRef->pending.
*
* The anytime_only parameter indicates whether this is an anytime flush.
+ * The is_partial parameter indicates whether this is a partial flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait,
+ bool anytime_only, bool *is_partial);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -757,7 +759,8 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -765,7 +768,8 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -790,7 +794,8 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -858,7 +863,8 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/test/isolation/expected/stats.out b/src/test/isolation/expected/stats.out
index cfad309ccf3..11e3e57806d 100644
--- a/src/test/isolation/expected/stats.out
+++ b/src/test/isolation/expected/stats.out
@@ -2245,6 +2245,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/expected/stats_1.out b/src/test/isolation/expected/stats_1.out
index e1d937784cb..aef582e7582 100644
--- a/src/test/isolation/expected/stats_1.out
+++ b/src/test/isolation/expected/stats_1.out
@@ -2253,6 +2253,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/specs/stats.spec b/src/test/isolation/specs/stats.spec
index da16710da0f..47414eb6009 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -50,6 +50,8 @@ step s1_rollback { ROLLBACK; }
step s1_prepare_a { PREPARE TRANSACTION 'a'; }
step s1_commit_prepared_a { COMMIT PREPARED 'a'; }
step s1_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
+# Has to be greater than session 2 stats_flush_interval
+step s1_sleep { SELECT pg_sleep(1.5); }
# Function stats steps
step s1_ff { SELECT pg_stat_force_next_flush(); }
@@ -132,12 +134,16 @@ step s1_slru_check_stats {
session s2
-setup { SET stats_fetch_consistency = 'none'; }
+setup {
+ SET stats_fetch_consistency = 'none';
+ SET stats_flush_interval = '1s';
+}
step s2_begin { BEGIN; }
step s2_commit { COMMIT; }
step s2_commit_prepared_a { COMMIT PREPARED 'a'; }
step s2_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
step s2_ff { SELECT pg_stat_force_next_flush(); }
+step s2_table_drop { DROP TABLE test_stat_tab; }
# Function stats steps
step s2_track_funcs_all { SET track_functions = 'all'; }
@@ -156,6 +162,8 @@ step s2_func_stats {
}
# Relation stats steps
+step s2_track_counts_on { SET track_counts = on; }
+step s2_track_counts_off { SET track_counts = off; }
step s2_table_select { SELECT * FROM test_stat_tab ORDER BY key, value; }
step s2_table_update_k1 { UPDATE test_stat_tab SET value = value + 1 WHERE key = 'k1';}
@@ -435,6 +443,23 @@ permutation
s1_table_drop
s1_table_stats
+### Check that some stats are updated (seq_scan and seq_tup_read)
+### while the transaction is still running
+permutation
+ s2_begin
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_off
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_on
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_table_drop
+ s2_commit
### Check that we don't count changes with track counts off, but allow access
### to prior stats
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 207e841911b..ffcda7b6c7a 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -84,7 +84,8 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait, bool anytime_only);
+ bool nowait, bool anytime_only,
+ bool *is_partial);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -152,7 +153,8 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
@@ -160,6 +162,9 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
--
2.34.1
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-19 03:58 ` Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
1 sibling, 1 reply; 22+ messages in thread
From: Michael Paquier @ 2026-02-19 03:58 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Wed, Feb 18, 2026 at 05:40:46AM +0000, Bertrand Drouvot wrote:
> PFA a mandatory rebase (nothing that needs review) due to a92b809f9da1.
I don't find the design of this patch appealing, and my mind points
towards two pieces of it:
1) The new requirement related to pgstat_schedule_anytime_update()
that a stats kind needs to call to enable a timeout. This partially
doubles with pgstat_report_fixed. And I suspect that this extra set
of requirements, introducing a new level of complexity for in-core
stats kinds as well as extension developers, would be the source of
more bugs.
2) The timeout requirement itself, relying on a timeout threshold
controlled by a backend-side configuration.
With that in mind, wouldn't it be simpler if we introduced an API that
could be used from client applications instead, in a model similar to
what we do for procsignal.c/h? One such example is
LOG_MEMORY_CONTEXT, where we have a SQL function that is able to tell
a backend that it needs to do something. I could see various
benefits to this approach, because it gives more flexibility with the
timing of the stats flushes, which may not be a backend-side only
policy:
- Use a cron bgworker in the backend, that scans pg_stat_activity, for
example for long-running transactions based on a threshold.
- Do the same periodic scan of pg_stat_activity, but from a client
application.
The PROCSIG would need to set a flag in a new SIGUSR1 handler that
would trigger the flush for the stats kinds that have the
out-of-transaction property set once we go through
ProcessInterrupts(). We already have a pgstats report call there,
hence it is a matter of removing the timeout requirements as presented
in the patch, and letting client applications decide when this should
happen.
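As a rough illustration of the flag-in-handler pattern described above, here is a standalone sketch. It is not PostgreSQL code: flush_requested, handle_flush_request(), process_interrupts() and flush_anytime_stats() are all hypothetical names, and a plain SIGUSR1 handler stands in for the procsignal machinery. The handler only records the request; the flush itself runs later from the main loop, where it is safe to do real work.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t flush_requested = 0;

/* Counts flushes so that the effect is observable in this sketch. */
static int	flushes_done = 0;

/* Signal handler: async-signal-safe, it only records the request. */
static void
handle_flush_request(int signo)
{
	(void) signo;
	flush_requested = 1;
}

/* Stand-in for flushing the stats kinds that allow it. */
static void
flush_anytime_stats(void)
{
	flushes_done++;
}

/*
 * Stand-in for an interrupt-processing routine called at safe points.
 * Returns true if a pending flush request was serviced.
 */
static bool
process_interrupts(void)
{
	if (!flush_requested)
		return false;

	flush_requested = 0;
	flush_anytime_stats();
	return true;
}

/* Install the handler; a client "request" is then a SIGUSR1. */
static void
install_flush_handler(void)
{
	void		(*prev) (int) = signal(SIGUSR1, handle_flush_request);

	assert(prev != SIG_ERR);
	(void) prev;
}
```

In this model, whoever decides the timing (a bgworker, a client polling pg_stat_activity) only has to deliver the signal; no timeout has to be armed on the receiving side.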
The property tracking which stats kinds can be flushed is surely
important: Sami reminded us a couple of hours ago that there are some
stats that we should not flush even if we get an async request. Another
thing that I have doubts about is whether using the same async flush
threshold makes sense for everything. Long-running transactions, for
example, mostly would not care much even if we use an interval less
aggressive than what a WAL sender sees.
Not a fan of the hardcoded sleeps in the tests, either. On fast
machines, these tend to waste runtime because a process stands idle
doing nothing. On slow machines, tests could be unstable if the
environment takes longer to react to a condition of the test than the
sleep lasts.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
@ 2026-02-19 08:01 ` Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 23:48 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
0 siblings, 2 replies; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-19 08:01 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Thu, Feb 19, 2026 at 12:58:12PM +0900, Michael Paquier wrote:
> On Wed, Feb 18, 2026 at 05:40:46AM +0000, Bertrand Drouvot wrote:
> > PFA a mandatory rebase (nothing that needs review) due to a92b809f9da1.
>
> I don't find the design of this patch appealing, and my mind points
> towards two pieces of it:
> 1) The new requirement related to pgstat_schedule_anytime_update()
> that a stats kind needs to call to enable a timeout. This partially
> doubles with pgstat_report_fixed. And I suspect that this extra set
> of requirements, introducing a new level of complexity for in-core
> stats kinds as well as extension developers, would be the source of
> more bugs.
Yeah, maybe we should rethink the way we report that we have something to flush,
but I think that's more of a general discussion that should also cover
pgstat_report_fixed.
> 2) The timeout requirement itself, relying on a timeout threshold
> controlled by a backend-side configuration.
What are your concerns with this?
> With that in mind, wouldn't it be simpler if we introduced an API that
> could be used from client applications instead, in a model similar to
> what we do for procsignal.c/h?
That's another angle to look at it from, but I think that giving this
responsibility to the clients would not solve the concerns we had in [1]
(that led to 039549d70f6 and to this thread). It seems to me that a
solution/design that does not allow us to "revert" 039549d70f6 does not
suit our needs. Thoughts?
> Not a fan of the hardcoded sleeps in the tests, either.
Yeah, after our off-list discussion yesterday, I tried to implement the same
trick that f1e251be80a has done with injection points (nice trick by the way!),
but that led to:
TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "mcxt.c", Line: 1237
postgres: main: walwriter (ExceptionalCondition+0x9e)[0xc4ce4d]
postgres: main: walwriter (MemoryContextAlloc+0x8c)[0xc8f0ec]
postgres: main: walwriter (MemoryContextStrdup+0x37)[0xc8fea1]
postgres: main: walwriter (pstrdup+0x22)[0xc8fee4]
postgres: main: walwriter (substitute_path_macro+0x65)[0xc56068]
postgres: main: walwriter [0xc55e90]
postgres: main: walwriter (load_external_function+0x59)[0xc553db]
postgres: main: walwriter [0xc7b858]
postgres: main: walwriter [0xc7c125]
postgres: main: walwriter (IsInjectionPointAttached+0x18)[0xc7c20c]
postgres: main: walwriter (pgstat_count_backend_io_op+0x12f)[0xa9a116]
postgres: main: walwriter (pgstat_count_io_op+0x169)[0xa9cb57]
postgres: main: walwriter (pgstat_count_io_op_time+0x1cc)[0xa9cda7]
So, I did not spend that much time on it. I could if we strongly think that those
sleeps have to be discarded though.
[1]: https://postgr.es/m/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4pw7
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-19 22:08 ` Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
1 sibling, 1 reply; 22+ messages in thread
From: Sami Imseih @ 2026-02-19 22:08 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
> > Not a fan of the hardcoded sleeps in the tests, either.
>
> Yeah, after our off-list discussion yesterday, I tried to implement the same
> trick that f1e251be80a has done with injection points (nice trick by the way!),
> but that led to:
>
> TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "mcxt.c", Line: 1237
> postgres: main: walwriter (ExceptionalCondition+0x9e)[0xc4ce4d]
> postgres: main: walwriter (MemoryContextAlloc+0x8c)[0xc8f0ec]
> postgres: main: walwriter (MemoryContextStrdup+0x37)[0xc8fea1]
> postgres: main: walwriter (pstrdup+0x22)[0xc8fee4]
> postgres: main: walwriter (substitute_path_macro+0x65)[0xc56068]
> postgres: main: walwriter [0xc55e90]
> postgres: main: walwriter (load_external_function+0x59)[0xc553db]
> postgres: main: walwriter [0xc7b858]
> postgres: main: walwriter [0xc7c125]
> postgres: main: walwriter (IsInjectionPointAttached+0x18)[0xc7c20c]
> postgres: main: walwriter (pgstat_count_backend_io_op+0x12f)[0xa9a116]
> postgres: main: walwriter (pgstat_count_io_op+0x169)[0xa9cb57]
> postgres: main: walwriter (pgstat_count_io_op_time+0x1cc)[0xa9cda7]
>
> So, I did not spend that much time on it. I could if we strongly think that those
> sleeps have to be discarded though.
I took a look at this today out of interest, based on what you mentioned to me
offline.
There is this in injection_points.c
```
/*
* Load an injection point into the local cache.
*
* This is useful to be able to load an injection point before running it,
* especially if the injection point is called in a code path where memory
* allocations cannot happen, like critical sections.
*/
void
InjectionPointLoad(const char *name)
{
#ifdef USE_INJECTION_POINTS
InjectionPointCacheRefresh(name);
#else
elog(ERROR, "Injection points are not supported by this build");
#endif
}
```
so, instead of calling IS_INJECTION_POINT_ATTACHED macro which is
called on demand (and, in the case of I/O stats, inside a critical
section), you can just call INJECTION_POINT_LOAD during pgstat_initialize(),
like this:
```
INJECTION_POINT_LOAD("anytime-update-reduce-timeout");
```
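To make the reasoning concrete, here is a toy model of "load early, check later". It is not the injection_points.c implementation: crit_section_count, checked_strdup(), point_load() and point_is_loaded() are hypothetical names, and the one-entry cache is an assumption made for brevity. The point is only that the allocating step happens ahead of time, so the check that runs inside the critical section touches nothing but the cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Greater than zero while inside a (simulated) critical section. */
static int	crit_section_count = 0;

/* One-entry local cache holding the name of the loaded point. */
static char *cached_name = NULL;

/* Copy a string; allocating is forbidden in a critical section. */
static char *
checked_strdup(const char *s)
{
	size_t		len = strlen(s) + 1;
	char	   *copy;

	assert(crit_section_count == 0);
	copy = malloc(len);
	assert(copy != NULL);
	memcpy(copy, s, len);
	return copy;
}

/* Preload: fill the cache ahead of time, outside critical sections. */
static void
point_load(const char *name)
{
	if (cached_name == NULL || strcmp(cached_name, name) != 0)
	{
		free(cached_name);
		cached_name = checked_strdup(name);
	}
}

/* Cached check: never allocates, so it is safe in a critical section. */
static bool
point_is_loaded(const char *name)
{
	return cached_name != NULL && strcmp(cached_name, name) == 0;
}
```

With point_load() called once at initialization time, a later point_is_loaded() call made while crit_section_count is non-zero does not reach the allocation assertion.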
Sami Imseih
Amazon Web Services (AWS)
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-02-20 15:55 ` Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-20 15:55 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Thu, Feb 19, 2026 at 04:08:59PM -0600, Sami Imseih wrote:
> > > Not a fan of the hardcoded sleeps in the tests, either.
> >
> > Yeah, after our off-list discussion yesterday, I tried to implement the same
> > trick that f1e251be80a has done with injection points (nice trick by the way!),
PFA, a tiny ("include") rebase due to 9842e8aca09 (the v9 patch series was applying
but not compiling).
> I took a look at this today out of interest,
Thanks!
> so, instead of calling IS_INJECTION_POINT_ATTACHED macro which is
I think that not calling IS_INJECTION_POINT_ATTACHED() but only relying on
"ifdef USE_INJECTION_POINTS" would apply the tiny timeout value to the entire
test suite.
I'm waiting for Michael's feedback about the current design ([1]) to see if it's
still worth spending time improving the tests.
[1]: https://postgr.es/m/aZbDYMrOkeCyIubO%40ip-10-97-1-34.eu-west-3.compute.internal
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-diff] v10-0001-Add-pgstat_report_anytime_stat-for-periodic-stat.patch (42.9K, 2-v10-0001-Add-pgstat_report_anytime_stat-for-periodic-stat.patch)
download | inline diff:
From 653d6a56427e9789ffa07d03f131563cd75270fc Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 5 Jan 2026 09:41:39 +0000
Subject: [PATCH v10 1/5] Add pgstat_report_anytime_stat() for periodic stats
flushing
Long running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal and produces
spikes when flushed.
This commit introduces pgstat_report_anytime_stat(), which flushes
non transactional statistics even inside active transactions. A new timeout
handler fires every second (if enabled while adding pending stats) to call this
function, ensuring timely stats visibility without waiting for transaction completion.
Implementation details:
- Add PgStat_FlushMode enum to classify stats kinds:
* FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
* FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries
- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats()
to accept a boolean anytime_only parameter:
* When false: flushes all stats (existing behavior)
* When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY stats
- The flush_pending_cb and flush_static_cb callbacks now receive an anytime_only
boolean parameter. Most of the time it's not used (except for assertions), but it's
preparatory work for moving the relations stats to anytime (without introducing
a new callback).
- Add pgstat_schedule_anytime_update() macro to schedule the next anytime flush,
relying on PGSTAT_MIN_INTERVAL
The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate
flushing.
---
src/backend/access/transam/xlog.c | 6 +
src/backend/postmaster/bgwriter.c | 9 +-
src/backend/postmaster/checkpointer.c | 10 +-
src/backend/postmaster/startup.c | 2 +
src/backend/postmaster/walsummarizer.c | 9 +-
src/backend/postmaster/walwriter.c | 9 +-
src/backend/replication/walreceiver.c | 9 +-
src/backend/replication/walsender.c | 8 +-
src/backend/tcop/postgres.c | 12 ++
src/backend/utils/activity/pgstat.c | 121 +++++++++++++++---
src/backend/utils/activity/pgstat_backend.c | 13 +-
src/backend/utils/activity/pgstat_bgwriter.c | 2 +-
.../utils/activity/pgstat_checkpointer.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 2 +-
src/backend/utils/activity/pgstat_function.c | 4 +-
src/backend/utils/activity/pgstat_io.c | 10 +-
src/backend/utils/activity/pgstat_relation.c | 12 +-
src/backend/utils/activity/pgstat_slru.c | 6 +-
.../utils/activity/pgstat_subscription.c | 4 +-
src/backend/utils/activity/pgstat_wal.c | 10 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/init/postinit.c | 3 +
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 22 ++++
src/include/utils/pgstat_internal.h | 52 ++++++--
src/include/utils/timeout.h | 1 +
.../test_custom_stats/test_custom_var_stats.c | 4 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 279 insertions(+), 66 deletions(-)
10.5% src/backend/postmaster/
5.8% src/backend/replication/
51.0% src/backend/utils/activity/
5.8% src/backend/
18.7% src/include/utils/
6.6% src/include/
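For illustration only (not part of the commit): the scheduling pattern behind
pgstat_schedule_anytime_update() appears to be "arm once, clear on flush". The
sketch below models that with a plain flag. arm_timer(), timers_armed and the
function names are hypothetical stand-ins (the real macro calls
enable_timeout_after() and also checks IsUnderPostmaster, which is omitted
here).

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_INTERVAL_MS 1000	/* same value as PGSTAT_MIN_INTERVAL */

static bool pending_anytime = false;	/* is a flush already scheduled? */
static int	timers_armed = 0;			/* how many times we armed the timer */

/* Hypothetical stand-in for enable_timeout_after(). */
static void
arm_timer(int delay_ms)
{
	assert(delay_ms > 0);
	timers_armed++;
}

/*
 * Called from hot paths whenever an anytime-flushable counter changes.
 * The flag check keeps repeated calls cheap: the timer is armed at most
 * once per flush cycle.
 */
static void
schedule_anytime_update(void)
{
	if (!pending_anytime)
	{
		arm_timer(MIN_INTERVAL_MS);
		pending_anytime = true;
	}
}

/* Called once the timeout has fired; the actual flush is elided. */
static void
report_anytime_stat(void)
{
	pending_anytime = false;
}
```

The flag avoids querying the timeout machinery on every counter update, at the
cost of one extra global that the flush path has to reset.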
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13cce9b49f1..cf29fc91f70 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1085,6 +1085,9 @@ XLogInsertRecord(XLogRecData *rdata,
pgWalUsage.wal_fpi += num_fpi;
pgWalUsage.wal_fpi_bytes += fpi_bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/* Required for the flush of pending stats WAL data */
pgstat_report_fixed = true;
}
@@ -2066,6 +2069,9 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic)
pgWalUsage.wal_buffers_full++;
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/*
* Required for the flush of pending stats WAL data, per
* update of pgWalUsage.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0956bd39a85..059c601c3b8 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -49,7 +49,9 @@
#include "storage/smgr.h"
#include "storage/standby.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
/*
@@ -103,7 +105,7 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -113,6 +115,11 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* We just started, assume there has been either a shutdown or
* end-of-recovery snapshot.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index e03c19123bc..e11c4b099c8 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -66,8 +66,9 @@
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
-
+#include "utils/timeout.h"
/*----------
* Shared memory area for communication between checkpointer and backends
@@ -215,7 +216,7 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, ReqShutdownXLOG);
pqsignal(SIGTERM, SIG_IGN); /* ignore SIGTERM */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);
@@ -225,6 +226,11 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Initialize so that first time-driven event happens at the correct time.
*/
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index cdbe53dd262..4954fe425b7 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -32,6 +32,7 @@
#include "storage/standby.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/timeout.h"
@@ -245,6 +246,7 @@ StartupProcessMain(const void *startup_data, size_t startup_data_len)
RegisterTimeout(STANDBY_DEADLOCK_TIMEOUT, StandbyDeadLockHandler);
RegisterTimeout(STANDBY_TIMEOUT, StandbyTimeoutHandler);
RegisterTimeout(STANDBY_LOCK_TIMEOUT, StandbyLockTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
/*
* Unblock signals (they were blocked when the postmaster forked us)
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 742137edad6..f1bae9d23d6 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -48,6 +48,8 @@
#include "storage/shmem.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h"
/*
@@ -246,7 +248,7 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -268,6 +270,11 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* If an exception is encountered, processing resumes here.
*/
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 7c0e2809c17..bcf59227a00 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -61,7 +61,9 @@
#include "storage/smgr.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
/*
@@ -103,7 +105,7 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -113,6 +115,11 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Create a memory context that we will do all our work in. We do this so
* that we can reset the context during error recovery and thereby avoid
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 7c1b8757d7d..aecc7a127e6 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -77,7 +77,9 @@
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -252,7 +254,7 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, die); /* request shutdown */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -260,6 +262,11 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
/* Reset some signals that are accepted by postmaster but not here */
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/* Load the libpq-specific functions */
load_file("libpqwalreceiver", false);
if (WalReceiverFunctions == NULL)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..a7214d0dc6f 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1987,8 +1987,8 @@ WalSndWaitForWal(XLogRecPtr loc)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
@@ -3016,8 +3016,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..8c30efa2443 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3564,6 +3564,18 @@ ProcessInterrupts(void)
pgstat_report_stat(true);
}
+ /*
+ * Flush stats outside of transaction boundary if the timeout fired.
+ * Unlike transactional stats, these can be flushed even inside a running
+ * transaction.
+ */
+ if (AnytimeStatsUpdateTimeoutPending)
+ {
+ AnytimeStatsUpdateTimeoutPending = false;
+
+ pgstat_report_anytime_stat(false);
+ }
+
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5a..ddd331e2c81 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -108,10 +108,12 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/ipc.h"
+#include "storage/latch.h"
#include "storage/lwlock.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -122,8 +124,6 @@
* ----------
*/
-/* minimum interval non-forced stats flushes.*/
-#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -187,7 +187,8 @@ static void pgstat_init_snapshot_fixed(void);
static void pgstat_reset_after_failure(void);
-static bool pgstat_flush_pending_entries(bool nowait);
+static bool pgstat_flush_pending_entries(bool nowait, bool anytime_only);
+static bool pgstat_flush_fixed_stats(bool nowait, bool anytime_only);
static void pgstat_prep_snapshot(void);
static void pgstat_build_snapshot(void);
@@ -218,6 +219,12 @@ PgStat_LocalState pgStatLocal;
*/
bool pgstat_report_fixed = false;
+/*
+ * Track when there is a pending anytime flush to avoid relying on
+ * get_timeout_active() in hot paths.
+ */
+bool pgstat_pending_anytime = false;
+
/* ----------
* Local data
*
@@ -288,6 +295,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -305,6 +313,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -321,6 +330,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
@@ -336,6 +346,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.accessed_across_databases = true,
@@ -353,6 +364,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_subscription_stats entries can be seen in all databases */
.accessed_across_databases = true,
@@ -370,6 +382,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = false,
+ .flush_mode = FLUSH_ANYTIME,
.accessed_across_databases = true,
@@ -436,6 +449,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -453,6 +467,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, slru),
.shared_ctl_off = offsetof(PgStat_ShmemControl, slru),
@@ -470,6 +485,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -775,23 +791,11 @@ pgstat_report_stat(bool force)
partial_flush = false;
/* flush of variable-numbered stats tracked in pending entries list */
- partial_flush |= pgstat_flush_pending_entries(nowait);
+ partial_flush |= pgstat_flush_pending_entries(nowait, false);
/* flush of other stats kinds */
if (pgstat_report_fixed)
- {
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-
- if (!kind_info)
- continue;
- if (!kind_info->flush_static_cb)
- continue;
-
- partial_flush |= kind_info->flush_static_cb(nowait);
- }
- }
+ partial_flush |= pgstat_flush_fixed_stats(nowait, false);
last_flush = now;
@@ -1293,7 +1297,8 @@ pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *creat
if (entry_ref->pending == NULL)
{
- size_t entrysize = pgstat_get_kind_info(kind)->pending_size;
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ size_t entrysize = kind_info->pending_size;
Assert(entrysize != (size_t) -1);
@@ -1345,9 +1350,14 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * This is safe to call inside transactions.
+ *
+ * If anytime_only is false, flushes all entries.
*/
static bool
-pgstat_flush_pending_entries(bool nowait)
+pgstat_flush_pending_entries(bool nowait, bool anytime_only)
{
bool have_pending = false;
dlist_node *cur = NULL;
@@ -1377,8 +1387,22 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
Assert(did_flush || nowait);
@@ -1402,6 +1426,33 @@ pgstat_flush_pending_entries(bool nowait)
return have_pending;
}
+/*
+ * Flush fixed-amount stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME stats (safe inside transactions).
+ * If anytime_only is false, flushes all stats with flush_static_cb.
+ */
+static bool
+pgstat_flush_fixed_stats(bool nowait, bool anytime_only)
+{
+ bool partial_flush = false;
+
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+ if (!kind_info || !kind_info->flush_static_cb)
+ continue;
+
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ continue;
+
+ partial_flush |= kind_info->flush_static_cb(nowait, anytime_only);
+ }
+
+ return partial_flush;
+}
/* ------------------------------------------------------------
* Helper / infrastructure functions
@@ -2119,3 +2170,33 @@ assign_stats_fetch_consistency(int newval, void *extra)
if (pgstat_fetch_consistency != newval)
force_stats_snapshot_clear = true;
}
+
+/*
+ * Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
+ * stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
+ * Safe to call inside transactions.
+ */
+void
+pgstat_report_anytime_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+
+ pgstat_pending_anytime = false;
+}
+
+/*
+ * Timeout handler for flushing anytime stats.
+ */
+void
+AnytimeStatsUpdateTimeoutHandler(void)
+{
+ AnytimeStatsUpdateTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index f2f8d3ff75f..b09316d3ab3 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -31,6 +31,7 @@
#include "storage/procarray.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
/*
* Backend statistics counts waiting to be flushed out. These counters may be
@@ -66,6 +67,9 @@ pgstat_count_backend_io_op_time(IOObject io_object, IOContext io_context,
INSTR_TIME_ADD(PendingBackendStats.pending_io.pending_times[io_object][io_context][io_op],
io_time);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -82,6 +86,9 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -268,7 +275,7 @@ pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref)
* if some statistics could not be flushed due to lock contention.
*/
bool
-pgstat_flush_backend(bool nowait, bits32 flags)
+pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only)
{
PgStat_EntryRef *entry_ref;
bool has_pending_data = false;
@@ -311,9 +318,9 @@ pgstat_flush_backend(bool nowait, bits32 flags)
* If some stats could not be flushed due to lock contention, return true.
*/
bool
-pgstat_backend_flush_cb(bool nowait)
+pgstat_backend_flush_cb(bool nowait, bool anytime_only)
{
- return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL);
+ return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL, anytime_only);
}
/*
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index ed2fd801189..1c5f0c3ec40 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -61,7 +61,7 @@ pgstat_report_bgwriter(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index 1f70194b7a7..2d89a082464 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -68,7 +68,7 @@ pgstat_report_checkpointer(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..8e86df60461 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,7 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..5ba4958382f 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,11 +190,13 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
+ Assert(!anytime_only);
+
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 28de24538dc..7cd32900236 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -19,6 +19,7 @@
#include "executor/instrument.h"
#include "storage/bufmgr.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
static PgStat_PendingIO PendingIOStats;
static bool have_iostats = false;
@@ -79,6 +80,9 @@ pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op,
/* Add the per-backend counts */
pgstat_count_backend_io_op(io_object, io_context, io_op, cnt, bytes);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_iostats = true;
pgstat_report_fixed = true;
}
@@ -172,9 +176,9 @@ pgstat_fetch_stat_io(void)
* Simpler wrapper of pgstat_io_flush_cb()
*/
void
-pgstat_flush_io(bool nowait)
+pgstat_flush_io(bool nowait, bool anytime_only)
{
- (void) pgstat_io_flush_cb(nowait);
+ (void) pgstat_io_flush_cb(nowait, anytime_only);
}
/*
@@ -186,7 +190,7 @@ pgstat_flush_io(bool nowait)
* acquired. Otherwise, return false.
*/
bool
-pgstat_io_flush_cb(bool nowait)
+pgstat_io_flush_cb(bool nowait, bool anytime_only)
{
LWLock *bktype_lock;
PgStat_BktypeIO *bktype_shstats;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..04d21483d93 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -267,8 +267,8 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
* is done -- which will likely vacuum many relations -- or until the
* VACUUM command has processed all tables and committed.
*/
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -362,8 +362,8 @@ pgstat_report_analyze(Relation rel,
pgstat_unlock_entry(entry_ref);
/* see pgstat_report_vacuum() */
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -812,7 +812,7 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
* entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -820,6 +820,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
+ Assert(!anytime_only);
+
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..bf8a4d58673 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -19,6 +19,7 @@
#include "utils/pgstat_internal.h"
#include "utils/timestamp.h"
+#include "utils/timeout.h"
static inline PgStat_SLRUStats *get_slru_entry(int slru_idx);
@@ -139,7 +140,7 @@ pgstat_get_slru_index(const char *name)
* acquired. Otherwise return false.
*/
bool
-pgstat_slru_flush_cb(bool nowait)
+pgstat_slru_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_SLRU *stats_shmem = &pgStatLocal.shmem->slru;
int i;
@@ -223,6 +224,9 @@ get_slru_entry(int slru_idx)
Assert((slru_idx >= 0) && (slru_idx < SLRU_NUM_ELEMENTS));
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_slrustats = true;
pgstat_report_fixed = true;
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 3277cf88a4e..6b6eec7578d 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -117,11 +117,13 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
+ Assert(!anytime_only);
+
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 183e0a7a97b..2c2f3f10e10 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -51,12 +51,12 @@ pgstat_report_wal(bool force)
nowait = !force;
/* flush wal stats */
- (void) pgstat_wal_flush_cb(nowait);
- pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL);
+ (void) pgstat_wal_flush_cb(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL, true);
/* flush IO stats */
- pgstat_flush_io(nowait);
- (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -88,7 +88,7 @@ pgstat_wal_have_pending(void)
* acquired. Otherwise return false.
*/
bool
-pgstat_wal_flush_cb(bool nowait)
+pgstat_wal_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
WalUsage wal_usage_diff = {0};
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..ad44826c39e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -40,6 +40,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index b59e08605cc..eeeac1bf39a 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -64,6 +64,7 @@
#include "utils/injection_point.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
+#include "utils/pgstat_internal.h"
#include "utils/portal.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -773,6 +774,8 @@ InitPostgres(const char *in_dbname, Oid dboid,
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT,
+ AnytimeStatsUpdateTimeoutHandler);
}
/*
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..84e698da214 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -96,6 +96,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9bb777c3d5a..b011a315679 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -34,6 +34,9 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/* Minimum interval between non-forced stats flushes */
+#define PGSTAT_MIN_INTERVAL 1000
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -532,8 +535,24 @@ extern void pgstat_initialize(void);
/* Functions called from backends */
extern long pgstat_report_stat(bool force);
+extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
+/*
+ * Schedule the next anytime stats update timeout.
+ *
+ * This should be called whenever accumulating statistics that support
+ * FLUSH_ANYTIME flushing mode.
+ */
+#define pgstat_schedule_anytime_update() \
+ do { \
+ if (IsUnderPostmaster && !pgstat_pending_anytime) \
+ { \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ pgstat_pending_anytime = true; \
+ } \
+ } while (0)
+
extern void pgstat_reset_counters(void);
extern void pgstat_reset(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_reset_of_kind(PgStat_Kind kind);
@@ -806,6 +825,8 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
* Variables in pgstat.c
*/
+extern PGDLLIMPORT bool pgstat_pending_anytime;
+
/* GUC parameters */
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
@@ -849,4 +870,5 @@ extern PGDLLIMPORT PgStat_Counter pgStatTransactionIdleTime;
/* updated by the traffic cop and in errfinish() */
extern PGDLLIMPORT SessionEndType pgStatSessionEndCause;
+
#endif /* PGSTAT_H */
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed..607f4255268 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,19 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Flush mode for statistics kinds.
+ *
+ * FLUSH_AT_TXN_BOUNDARY has to be the first because we want it to be the
+ * default value.
+ */
+typedef enum PgStat_FlushMode
+{
+ FLUSH_AT_TXN_BOUNDARY, /* All fields can only be flushed at
+ * transaction boundary */
+ FLUSH_ANYTIME, /* All fields can be flushed anytime,
+ * including within transactions */
+} PgStat_FlushMode;
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +264,16 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * The mode of when to flush stats. See PgStat_FlushMode for more details.
+ *
+ * This member only has meaning for statistics kinds that accumulate
+ * pending stats and use flush callbacks. For kinds that write directly to
+ * shared memory (e.g., archiver, bgwriter, checkpointer), this member has
+ * no effect.
+ */
+ PgStat_FlushMode flush_mode;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
@@ -297,8 +320,10 @@ typedef struct PgStat_KindInfo
* For variable-numbered stats: flush pending stats. Required if pending
* data is used. See flush_static_cb when dealing with stats data that
* that cannot use PgStat_EntryRef->pending.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -366,8 +391,10 @@ typedef struct PgStat_KindInfo
*
* "pgstat_report_fixed" needs to be set to trigger the flush of pending
* stats.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_static_cb) (bool nowait);
+ bool (*flush_static_cb) (bool nowait, bool anytime_only);
/*
* For fixed-numbered statistics: Reset All.
@@ -677,6 +704,7 @@ extern PgStat_EntryRef *pgstat_fetch_pending_entry(PgStat_Kind kind,
extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_snapshot_fixed(PgStat_Kind kind);
+extern void AnytimeStatsUpdateTimeoutHandler(void);
/*
@@ -696,8 +724,8 @@ extern void pgstat_archiver_snapshot_cb(void);
#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */
#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL)
-extern bool pgstat_flush_backend(bool nowait, bits32 flags);
-extern bool pgstat_backend_flush_cb(bool nowait);
+extern bool pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only);
+extern bool pgstat_backend_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_backend_reset_timestamp_cb(PgStatShared_Common *header,
TimestampTz ts);
@@ -729,7 +757,7 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -737,7 +765,7 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -745,9 +773,9 @@ extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_io.c
*/
-extern void pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait, bool anytime_only);
-extern bool pgstat_io_flush_cb(bool nowait);
+extern bool pgstat_io_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_io_init_shmem_cb(void *stats);
extern void pgstat_io_reset_all_cb(TimestampTz ts);
extern void pgstat_io_snapshot_cb(void);
@@ -762,7 +790,7 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -809,7 +837,7 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
* Functions in pgstat_slru.c
*/
-extern bool pgstat_slru_flush_cb(bool nowait);
+extern bool pgstat_slru_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_slru_init_shmem_cb(void *stats);
extern void pgstat_slru_reset_all_cb(TimestampTz ts);
extern void pgstat_slru_snapshot_cb(void);
@@ -820,7 +848,7 @@ extern void pgstat_slru_snapshot_cb(void);
*/
extern void pgstat_wal_init_backend_cb(void);
-extern bool pgstat_wal_flush_cb(bool nowait);
+extern bool pgstat_wal_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_wal_init_shmem_cb(void *stats);
extern void pgstat_wal_reset_all_cb(TimestampTz ts);
extern void pgstat_wal_snapshot_cb(void);
@@ -830,7 +858,7 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 0965b590b34..10723bb664c 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -35,6 +35,7 @@ typedef enum TimeoutId
IDLE_SESSION_TIMEOUT,
IDLE_STATS_UPDATE_TIMEOUT,
CLIENT_CONNECTION_CHECK_TIMEOUT,
+ ANYTIME_STATS_UPDATE_TIMEOUT,
STARTUP_PROGRESS_TIMEOUT,
/* First user-definable timeout reason */
USER_TIMEOUT,
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index da28afbd929..4c207611236 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -84,7 +84,7 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait);
+ bool nowait, bool anytime_only);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -151,7 +151,7 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..1dbc4b96f51 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2271,6 +2271,7 @@ PgStat_Counter
PgStat_EntryRef
PgStat_EntryRefHashEntry
PgStat_FetchConsistency
+PgStat_FlushMode
PgStat_FunctionCallUsage
PgStat_FunctionCounts
PgStat_HashKey
--
2.34.1
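As a reading aid for 0001, here is a minimal standalone sketch of why FLUSH_AT_TXN_BOUNDARY has to be the first enum member, and of how an anytime flush can skip transactional kinds. It is not code from the patch: KindInfo, kinds[] and flush_kinds() are hypothetical names, and the real PgStat_KindInfo carries far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the patch: the transactional mode comes first, so its value is 0. */
typedef enum FlushMode
{
	FLUSH_AT_TXN_BOUNDARY,
	FLUSH_ANYTIME
} FlushMode;

_Static_assert(FLUSH_AT_TXN_BOUNDARY == 0, "default flush mode must be zero");

/* Hypothetical, trimmed-down stand-in for PgStat_KindInfo. */
typedef struct KindInfo
{
	const char *name;
	FlushMode	flush_mode;
	int			pending;	/* counts not yet flushed */
	int			shared;		/* counts visible to readers */
} KindInfo;

/* "relation" omits flush_mode, so it is zero-initialized to the default. */
static KindInfo kinds[] = {
	{.name = "relation", .pending = 3},
	{.name = "io", .flush_mode = FLUSH_ANYTIME, .pending = 5},
};

#define NUM_KINDS (sizeof(kinds) / sizeof(kinds[0]))

/*
 * Flush pending counts.  With anytime_only set, kinds that must wait for a
 * transaction boundary are skipped.  Returns the number of kinds flushed.
 */
static int
flush_kinds(bool anytime_only)
{
	int			flushed = 0;

	for (size_t i = 0; i < NUM_KINDS; i++)
	{
		KindInfo   *kind = &kinds[i];

		assert(kind->pending >= 0);

		if (anytime_only && kind->flush_mode != FLUSH_ANYTIME)
			continue;
		if (kind->pending == 0)
			continue;

		kind->shared += kind->pending;
		kind->pending = 0;
		flushed++;
	}
	return flushed;
}
```

Calling flush_kinds(true) in the middle of a transaction would move only the "io" counts. The "relation" counts stay pending until flush_kinds(false) runs at the transaction boundary. Existing kinds that never mention flush_mode keep their current behavior for the same reason.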
[text/x-diff] v10-0002-Add-anytime-flush-tests-for-custom-stats.patch (9.0K, 3-v10-0002-Add-anytime-flush-tests-for-custom-stats.patch)
From cf2dffc645aefb1db5d5e28de2d923e7ffa2f346 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Thu, 5 Feb 2026 05:54:34 +0000
Subject: [PATCH v10 2/5] Add anytime flush tests for custom stats
---
.../test_custom_stats/t/001_custom_stats.pl | 41 +++++++++++++
.../test_custom_fixed_stats--1.0.sql | 5 ++
.../test_custom_fixed_stats.c | 57 +++++++++++++++++++
.../test_custom_var_stats--1.0.sql | 5 ++
.../test_custom_stats/test_custom_var_stats.c | 27 +++++++++
5 files changed, 135 insertions(+)
33.8% src/test/modules/test_custom_stats/t/
66.1% src/test/modules/test_custom_stats/
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 9e6a7a38577..7be1b281776 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -156,5 +156,46 @@ $result = $node->safe_psql('postgres',
);
is($result, "0", "report of fixed-sized after manual reset");
+# Test FLUSH_ANYTIME mechanism with custom fixed stats
+# This verifies that custom stats can be flushed during a transaction
+
+# Reset stats first
+$node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+my $anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select 'anytime:'||numcalls from test_custom_stats_fixed_report();
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^anytime:2/m,
+ "anytime fixed stats flushed during transaction");
+
+# Test FLUSH_ANYTIME mechanism with custom variable stats
+# This verifies that custom stats can be flushed during a transaction
+
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+$anytime_test = q[
+ BEGIN;
+ -- Accumulate stats
+ select test_custom_stats_var_anytime_update('entry2');
+ select test_custom_stats_var_anytime_update('entry2');
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select * from test_custom_stats_var_report('entry2');
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^entry2\|2\|/m,
+ "anytime var stats flushed during transaction");
+
# Test completed successfully
done_testing();
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
index 69a93b5241f..da3a798f289 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
@@ -18,3 +18,8 @@ CREATE FUNCTION test_custom_stats_fixed_reset()
RETURNS void
AS 'MODULE_PATHNAME', 'test_custom_stats_fixed_reset'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_fixed_anytime_update()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..30b0fbcbdc7 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -18,6 +18,7 @@
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_fixed_stats",
@@ -43,11 +44,13 @@ typedef struct PgStatShared_CustomFixedEntry
static void test_custom_stats_fixed_init_shmem_cb(void *stats);
static void test_custom_stats_fixed_reset_all_cb(TimestampTz ts);
static void test_custom_stats_fixed_snapshot_cb(void);
+static bool test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only);
static const PgStat_KindInfo custom_stats = {
.name = "test_custom_fixed_stats",
.fixed_amount = true, /* exactly one entry */
.write_to_file = true, /* persist to stats file */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.shared_size = sizeof(PgStat_StatCustomFixedEntry),
.shared_data_off = offsetof(PgStatShared_CustomFixedEntry, stats),
@@ -56,8 +59,12 @@ static const PgStat_KindInfo custom_stats = {
.init_shmem_cb = test_custom_stats_fixed_init_shmem_cb,
.reset_all_cb = test_custom_stats_fixed_reset_all_cb,
.snapshot_cb = test_custom_stats_fixed_snapshot_cb,
+ .flush_static_cb = test_custom_stats_fixed_flush_cb,
};
+/* Pending statistics */
+static PgStat_StatCustomFixedEntry PendingCustomStats = {0};
+
/*
* Kind ID for test_custom_fixed_stats.
*/
@@ -141,6 +148,38 @@ test_custom_stats_fixed_snapshot_cb(void)
#undef FIXED_COMP
}
+/*
+ * test_custom_stats_fixed_flush_cb
+ * Flush pending stats to shared memory
+ */
+static bool
+test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only)
+{
+ PgStatShared_CustomFixedEntry *stats_shmem;
+
+ /* Nothing to flush if no calls were made */
+ if (PendingCustomStats.numcalls == 0)
+ return false;
+
+ stats_shmem = pgstat_get_custom_shmem_data(PGSTAT_KIND_TEST_CUSTOM_FIXED_STATS);
+
+ if (!nowait)
+ LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+ else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
+ return true;
+
+ pgstat_begin_changecount_write(&stats_shmem->changecount);
+ stats_shmem->stats.numcalls += PendingCustomStats.numcalls;
+ pgstat_end_changecount_write(&stats_shmem->changecount);
+
+ LWLockRelease(&stats_shmem->lock);
+
+ /* Reset pending stats */
+ PendingCustomStats.numcalls = 0;
+
+ return false; /* successfully flushed */
+}
+
/*--------------------------------------------------------------------------
* SQL-callable functions
*--------------------------------------------------------------------------
@@ -222,3 +261,21 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
/* Return as tuple */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * test_custom_stats_fixed_anytime_update
+ * Increment call counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_fixed_anytime_update);
+Datum
+test_custom_stats_fixed_anytime_update(PG_FUNCTION_ARGS)
+{
+ /* Accumulate in pending stats */
+ PendingCustomStats.numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+ pgstat_report_fixed = true;
+
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index 5ed8cfc2dcf..ed66d38981e 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -24,3 +24,8 @@ CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_var_anytime_update(IN name TEXT)
+RETURNS void
+AS 'MODULE_PATHNAME', 'test_custom_stats_var_anytime_update'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 4c207611236..e9f1bda6b32 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -18,6 +18,7 @@
#include "storage/dsm_registry.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_var_stats",
@@ -108,6 +109,7 @@ static const PgStat_KindInfo custom_stats = {
.name = "test_custom_var_stats",
.fixed_amount = false, /* variable number of entries */
.write_to_file = true, /* persist across restarts */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.track_entry_count = true, /* count active entries */
.accessed_across_databases = true, /* global statistics */
.shared_size = sizeof(PgStatShared_CustomVarEntry),
@@ -690,3 +692,28 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * test_custom_stats_var_anytime_update
+ * Increment custom statistic counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_var_anytime_update);
+Datum
+test_custom_stats_var_anytime_update(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_EntryRef *entry_ref;
+ PgStat_StatCustomVarEntry *pending_entry;
+
+ /* Get pending entry in local memory */
+ entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
+ PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), NULL);
+
+ pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
+ pending_entry->numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+
+ PG_RETURN_VOID();
+}
--
2.34.1
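The fixed-stats flush callback added in 0002 returns true when it could not take the lock in nowait mode and false otherwise. A small standalone sketch of that convention follows. The lock is simulated with a plain flag, and try_lock(), flush_fixed() and the two counters are hypothetical stand-ins, not pgstat APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an LWLock: true while another party holds it. */
static bool lock_held_elsewhere = false;

static long pending_numcalls = 0;	/* backend-local, not yet visible */
static long shared_numcalls = 0;	/* what a report function would read */

/* Try to take the lock without waiting; returns false if it is busy. */
static bool
try_lock(void)
{
	return !lock_held_elsewhere;
}

/*
 * Same return convention as the fixed-stats callback in the patch: true
 * means the flush could not be done and pending data remains, false means
 * nothing is left to flush.
 */
static bool
flush_fixed(bool nowait)
{
	if (pending_numcalls == 0)
		return false;

	if (!try_lock())
	{
		/* A blocking caller would wait here; this sketch cannot block. */
		assert(nowait);
		return true;
	}

	shared_numcalls += pending_numcalls;
	pending_numcalls = 0;
	return false;
}
```

The point of the sketch is that a failed nowait attempt must leave the pending counter untouched, so that a later flush still accounts for every call. Note that the variable-numbered callback in test_custom_var_stats.c documents the opposite return convention (false only when the lock could not be acquired).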
[text/x-diff] v10-0003-Add-GUC-to-specify-non-transactional-statistics-.patch (9.8K, 4-v10-0003-Add-GUC-to-specify-non-transactional-statistics-.patch)
From ca50ef477a3eb1bf0384f49126c9d95f28438927 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Wed, 28 Jan 2026 07:53:13 +0000
Subject: [PATCH v10 3/5] Add GUC to specify non-transactional statistics flush
interval
Add stats_flush_interval, a new GUC that sets the interval between flushes of
non-transactional statistics. It is backed by the pgstat_flush_interval variable.
---
doc/src/sgml/config.sgml | 32 +++++++++++++++++++
src/backend/utils/activity/pgstat.c | 13 ++++++++
src/backend/utils/misc/guc_parameters.dat | 10 ++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/backend/utils/misc/timeout.c | 6 ++++
src/include/pgstat.h | 6 ++--
src/include/utils/guc_hooks.h | 1 +
src/include/utils/timeout.h | 1 +
.../test_custom_stats/t/001_custom_stats.pl | 6 ++--
9 files changed, 70 insertions(+), 6 deletions(-)
51.0% doc/src/sgml/
10.6% src/backend/utils/activity/
15.9% src/backend/utils/misc/
3.6% src/include/utils/
9.0% src/include/
9.6% src/test/modules/test_custom_stats/t/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20dbcaeb3ee..1eed71007a7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8929,6 +8929,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-stats-flush-interval" xreflabel="stats_flush_interval">
+ <term><varname>stats_flush_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>stats_flush_interval</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the interval at which certain statistics, which can be updated while a
+ transaction is in progress, are made visible. These include WAL activity
+ and I/O operations.
+ Such statistics are refreshed at the specified interval and can be observed
+ during active transactions in monitoring views such as
+ <link linkend="monitoring-pg-stat-wal-view"><structname>pg_stat_wal</structname></link>
+ and
+ <link linkend="monitoring-pg-stat-io-view"><structname>pg_stat_io</structname></link>.
+ If the value is specified without a unit, milliseconds are assumed.
+ The default is 10 seconds (<literal>10s</literal>), which is generally
+ the smallest practical value for long-running transactions.
+ </para>
+ <note>
+ <para>
+ This parameter does not affect statistics that are only reported at
+ transaction end, such as the columns of <structname>pg_stat_all_tables</structname>
+ (for example, <structfield>n_tup_ins</structfield>, <structfield>n_tup_upd</structfield>,
+ and <structfield>n_tup_del</structfield>). These statistics are always
+ flushed at the end of a transaction.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index ddd331e2c81..fd6ab0db16f 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -124,6 +124,8 @@
* ----------
*/
+/* minimum interval non-forced stats flushes.*/
+#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -204,6 +206,7 @@ static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
bool pgstat_track_counts = false;
int pgstat_fetch_consistency = PGSTAT_FETCH_CONSISTENCY_CACHE;
+int pgstat_flush_interval = 10000;
/* ----------
@@ -2171,6 +2174,16 @@ assign_stats_fetch_consistency(int newval, void *extra)
force_stats_snapshot_clear = true;
}
+/*
+ * GUC assign_hook for stats_flush_interval.
+ */
+void
+assign_stats_flush_interval(int newval, void *extra)
+{
+ if (get_all_timeouts_initialized())
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, newval);
+}
+
/*
* Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
* stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 9507778415d..073e08c7892 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2801,6 +2801,16 @@
assign_hook => 'assign_stats_fetch_consistency',
},
+{ name => 'stats_flush_interval', type => 'int', context => 'PGC_USERSET', group => 'STATS_CUMULATIVE',
+ short_desc => 'Sets the interval between flushes of non-transactional statistics.',
+ flags => 'GUC_UNIT_MS',
+ variable => 'pgstat_flush_interval',
+ boot_val => '10000',
+ min => '1000',
+ max => 'INT_MAX',
+ assign_hook => 'assign_stats_flush_interval'
+},
+
{ name => 'subtransaction_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
short_desc => 'Sets the size of the dedicated buffer pool used for the subtransaction cache.',
long_desc => '0 means use a fraction of "shared_buffers".',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..8bd37a25b38 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -688,6 +688,7 @@
#track_wal_io_timing = off
#track_functions = none # none, pl, all
#stats_fetch_consistency = cache # cache, none, snapshot
+#stats_flush_interval = 10s # in milliseconds
# - Monitoring -
diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c
index ddba5dc607c..85c4260d1db 100644
--- a/src/backend/utils/misc/timeout.c
+++ b/src/backend/utils/misc/timeout.c
@@ -828,3 +828,9 @@ get_timeout_finish_time(TimeoutId id)
{
return all_timeouts[id].fin_time;
}
+
+bool
+get_all_timeouts_initialized(void)
+{
+ return all_timeouts_initialized;
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index b011a315679..90237c70829 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -34,9 +34,6 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
-/* Minimum interval non-forced stats flushes */
-#define PGSTAT_MIN_INTERVAL 1000
-
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -548,7 +545,7 @@ extern void pgstat_force_next_flush(void);
do { \
if (IsUnderPostmaster && !pgstat_pending_anytime) \
{ \
- enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, pgstat_flush_interval); \
pgstat_pending_anytime = true; \
} \
} while (0)
@@ -831,6 +828,7 @@ extern PGDLLIMPORT bool pgstat_pending_anytime;
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
extern PGDLLIMPORT int pgstat_fetch_consistency;
+extern PGDLLIMPORT int pgstat_flush_interval;
/*
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 9c90670d9b8..9b5d2a90387 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -132,6 +132,7 @@ extern bool check_session_authorization(char **newval, void **extra, GucSource s
extern void assign_session_authorization(const char *newval, void *extra);
extern void assign_session_replication_role(int newval, void *extra);
extern void assign_stats_fetch_consistency(int newval, void *extra);
+extern void assign_stats_flush_interval(int newval, void *extra);
extern bool check_ssl(bool *newval, void **extra, GucSource source);
extern bool check_stage_log_stats(bool *newval, void **extra, GucSource source);
extern bool check_standard_conforming_strings(bool *newval, void **extra,
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 10723bb664c..fe7327de209 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -93,5 +93,6 @@ extern bool get_timeout_active(TimeoutId id);
extern bool get_timeout_indicator(TimeoutId id, bool reset_indicator);
extern TimestampTz get_timeout_start_time(TimeoutId id);
extern TimestampTz get_timeout_finish_time(TimeoutId id);
+extern bool get_all_timeouts_initialized(void);
#endif /* TIMEOUT_H */
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 7be1b281776..22e2a75dcb9 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -164,10 +164,11 @@ $node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
my $anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select 'anytime:'||numcalls from test_custom_stats_fixed_report();
@@ -183,11 +184,12 @@ like($result, qr/^anytime:2/m,
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
$anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
-- Accumulate stats
select test_custom_stats_var_anytime_update('entry2');
select test_custom_stats_var_anytime_update('entry2');
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select * from test_custom_stats_var_report('entry2');
--
2.34.1
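To illustrate how 0003 ties the scheduling macro to the new GUC, here is a standalone sketch of the "arm once per pending period" pattern together with an assign-hook-like function. The timer is a stub. arm_timer(), on_timer_fired() and assign_flush_interval() are hypothetical names, and the assumption that the pending flag is cleared when the timeout fires is mine: that code is not part of the hunks quoted here. In the real GUC machinery the variable is set by the GUC code, not by the hook.

```c
#include <assert.h>
#include <stdbool.h>

static int	flush_interval_ms = 10000;	/* mirrors the GUC default of 10s */
static bool pending_anytime = false;	/* a flush is already scheduled */
static bool timeouts_initialized = false;

/* Bookkeeping for the hypothetical timer stub below. */
static int	arm_count = 0;
static int	last_delay_ms = 0;

/* Hypothetical stand-in for enable_timeout_after(). */
static void
arm_timer(int delay_ms)
{
	assert(delay_ms > 0);
	arm_count++;
	last_delay_ms = delay_ms;
}

/* Arm the timer only for the first update since the last flush. */
#define schedule_anytime_update() \
	do { \
		if (!pending_anytime) \
		{ \
			arm_timer(flush_interval_ms); \
			pending_anytime = true; \
		} \
	} while (0)

/* Assumed behavior when the timer fires: flush, then allow rescheduling. */
static void
on_timer_fired(void)
{
	pending_anytime = false;
}

/* Shaped like the assign hook: re-arm only once timers are usable. */
static void
assign_flush_interval(int newval)
{
	flush_interval_ms = newval;
	if (timeouts_initialized)
		arm_timer(newval);
}
```

In this sketch, any number of counter updates between two flushes costs a single timer arm, and changing the interval before timeouts are initialized (for example while the configuration file is first processed) does not touch the timer at all.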
[text/x-diff] v10-0004-Remove-useless-calls-to-flush-some-stats.patch (7.7K, 5-v10-0004-Remove-useless-calls-to-flush-some-stats.patch)
From d27bf3b7ddb430bc145764db842eac8e41c9b9e8 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 6 Jan 2026 11:06:31 +0000
Subject: [PATCH v10 4/5] Remove useless calls to flush some stats
Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to report/flush some stats. Those calls were in place because
before commit <XXXX> stats were flushed only at transaction boundaries.
Note that:
- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for the checkpointer and the bgwriter, for example, because
they don't have a flush callback to call
- it can't be done for auxiliary processes (walsummarizer for example) because
they currently do not register the new timeout handler
---
src/backend/replication/walreceiver.c | 10 ------
src/backend/replication/walsender.c | 36 ++------------------
src/backend/utils/activity/pgstat_relation.c | 13 -------
src/test/recovery/t/001_stream_rep.pl | 1 +
src/test/subscription/t/001_rep_changes.pl | 1 +
5 files changed, 4 insertions(+), 57 deletions(-)
69.4% src/backend/replication/
23.4% src/backend/utils/activity/
3.5% src/test/recovery/t/
3.6% src/test/subscription/t/
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index aecc7a127e6..edf5ac65660 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -571,16 +571,6 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
*/
bool requestReply = false;
- /*
- * Report pending statistics to the cumulative stats
- * system. This location is useful for the report as it
- * is not within a tight loop in the WAL receiver, to
- * avoid bloating pgstats with requests, while also making
- * sure that the reports happen each time a status update
- * is sent.
- */
- pgstat_report_wal(false);
-
/*
* Check if time since last receive from primary has
* reached the configured limit.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a7214d0dc6f..9a136e35b48 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -94,14 +94,10 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
-#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-/* Minimum interval used by walsender for stats flushes, in ms */
-#define WALSENDER_STATS_FLUSH_INTERVAL 1000
-
/*
* Maximum data payload in a WAL data message. Must be >= XLOG_BLCKSZ.
*
@@ -1846,7 +1842,6 @@ WalSndWaitForWal(XLogRecPtr loc)
int wakeEvents;
uint32 wait_event = 0;
static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
- TimestampTz last_flush = 0;
/*
* Fast path to avoid acquiring the spinlock in case we already know we
@@ -1867,7 +1862,6 @@ WalSndWaitForWal(XLogRecPtr loc)
{
bool wait_for_standby_at_stop = false;
long sleeptime;
- TimestampTz now;
/* Clear any already-pending wakeups */
ResetLatch(MyLatch);
@@ -1973,8 +1967,7 @@ WalSndWaitForWal(XLogRecPtr loc)
* new WAL to be generated. (But if we have nothing to send, we don't
* want to wake on socket-writable.)
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
wakeEvents = WL_SOCKET_READABLE;
@@ -1983,15 +1976,6 @@ WalSndWaitForWal(XLogRecPtr loc)
Assert(wait_event != 0);
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
WalSndWait(wakeEvents, sleeptime, wait_event);
}
@@ -2894,8 +2878,6 @@ WalSndCheckTimeOut(void)
static void
WalSndLoop(WalSndSendDataCallback send_data)
{
- TimestampTz last_flush = 0;
-
/*
* Initialize the last reply timestamp. That enables timeout processing
* from hereon.
@@ -2985,9 +2967,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
* WalSndWaitForWal() handle any other blocking; idle receivers need
* its additional actions. For physical replication, also block if
* caught up; its send_data does not block.
- *
- * The IO statistics are reported in WalSndWaitForWal() for the
- * logical WAL senders.
*/
if ((WalSndCaughtUp && send_data != XLogSendLogical &&
!streamingDoneSending) ||
@@ -2995,7 +2974,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
{
long sleeptime;
int wakeEvents;
- TimestampTz now;
if (!streamingDoneReceiving)
wakeEvents = WL_SOCKET_READABLE;
@@ -3006,21 +2984,11 @@ WalSndLoop(WalSndSendDataCallback send_data)
* Use fresh timestamp, not last_processing, to reduce the chance
* of reaching wal_sender_timeout before sending a keepalive.
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
if (pq_is_send_pending())
wakeEvents |= WL_SOCKET_WRITEABLE;
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
/* Sleep until something happens or we time out */
WalSndWait(wakeEvents, sleeptime, WAIT_EVENT_WAL_SENDER_MAIN);
}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 04d21483d93..ae2952cae89 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -260,15 +260,6 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
}
pgstat_unlock_entry(entry_ref);
-
- /*
- * Flush IO statistics now. pgstat_report_stat() will flush IO stats,
- * however this will not be called until after an entire autovacuum cycle
- * is done -- which will likely vacuum many relations -- or until the
- * VACUUM command has processed all tables and committed.
- */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -360,10 +351,6 @@ pgstat_report_analyze(Relation rel,
}
pgstat_unlock_entry(entry_ref);
-
- /* see pgstat_report_vacuum() */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index e9ac67813c7..cfa095ff0a8 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -15,6 +15,7 @@ my $node_primary = PostgreSQL::Test::Cluster->new('primary');
$node_primary->init(
allows_streaming => 1,
auth_extra => [ '--create-role' => 'repl_role' ]);
+$node_primary->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_primary->start;
my $backup_name = 'my_backup';
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 7d41715ed81..29bae5e1121 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -11,6 +11,7 @@ use Test::More;
# Initialize publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_publisher->start;
# Create subscriber node
--
2.34.1
[text/x-diff] v10-0005-Change-RELATION-and-DATABASE-stats-to-anytime-fl.patch (34.2K, 6-v10-0005-Change-RELATION-and-DATABASE-stats-to-anytime-fl.patch)
From a9dc58fbd4474adf36fd64de54386cc2311b2484 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 19 Jan 2026 06:27:55 +0000
Subject: [PATCH v10 5/5] Change RELATION and DATABASE stats to anytime flush
This commit allows mixing fields with different transaction behavior within
the same RELATION or DATABASE statistics kind: some fields are transactional
(e.g., tuple inserts/updates/deletes) while others are non-transactional
(e.g., sequential scans, blocks read).
It modifies the relation flush callback to handle the anytime_only parameter
introduced in commit <nnnn>.
Implementation details:
- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Change DATABASE from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Add an is_partial parameter to flush_pending_cb() to be able to distinguish
partial flushes in pgstat_flush_pending_entries()
- Modify pgstat_relation_flush_cb() to handle the anytime_only parameter: when
true, flush only the non-transactional stats; when false, flush all the
stats. In the former case, clear the flushed fields from the pending stats to
prevent double-counting at the transaction boundary
DATABASE stats inherit the anytime flush behavior so that relation-derived
stats (tuples_returned, tuples_fetched, blocks_fetched, blocks_hit) are
visible while transactions are in progress.
Tests are added to verify the anytime flush behavior for mixed fields.
---
doc/src/sgml/monitoring.sgml | 37 ++++++-
src/backend/utils/activity/pgstat.c | 15 +--
src/backend/utils/activity/pgstat_database.c | 6 +-
src/backend/utils/activity/pgstat_function.c | 6 +-
src/backend/utils/activity/pgstat_relation.c | 92 ++++++++++++----
.../utils/activity/pgstat_subscription.c | 6 +-
src/include/pgstat.h | 27 ++++-
src/include/utils/pgstat_internal.h | 16 ++-
src/test/isolation/expected/stats.out | 102 ++++++++++++++++++
src/test/isolation/expected/stats_1.out | 102 ++++++++++++++++++
src/test/isolation/specs/stats.spec | 27 ++++-
.../test_custom_stats/test_custom_var_stats.c | 9 +-
12 files changed, 404 insertions(+), 41 deletions(-)
11.7% doc/src/sgml/
26.8% src/backend/utils/activity/
4.2% src/include/utils/
5.4% src/include/
45.1% src/test/isolation/expected/
4.7% src/test/isolation/specs/
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..f2321b631b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3767,6 +3767,19 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ Some statistics are updated while a transaction is in progress (for example,
+ <structfield>blks_read</structfield>, <structfield>blks_hit</structfield>,
+ <structfield>tup_returned</structfield> and <structfield>tup_fetched</structfield>).
+ Statistics that either do not depend on transactions or require transactional
+ consistency are updated only when the transaction ends. Statistics that require
+ transactional consistency include <structfield>xact_commit</structfield>,
+ <structfield>xact_rollback</structfield>, <structfield>tup_inserted</structfield>,
+ <structfield>tup_updated</structfield> and <structfield>tup_deleted</structfield>.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-database-conflicts-view">
@@ -3956,8 +3969,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_seq_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last sequential scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last sequential scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -3984,8 +3997,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_idx_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last index scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last index scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -4223,6 +4236,15 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ The <structfield>seq_scan</structfield>, <structfield>last_seq_scan</structfield>,
+ <structfield>seq_tup_read</structfield>, <structfield>idx_scan</structfield>,
+ <structfield>last_idx_scan</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-all-indexes-view">
@@ -4404,6 +4426,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
tuples (see <xref linkend="indexes-multicolumn"/>).
</para>
</note>
+ <note>
+ <para>
+ The <structfield>idx_scan</structfield>, <structfield>last_idx_scan</structfield>,
+ <structfield>idx_tup_read</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
<tip>
<para>
<command>EXPLAIN ANALYZE</command> outputs the total number of index
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index fd6ab0db16f..a8a905640d0 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -298,7 +298,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -316,7 +316,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -1354,7 +1354,8 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
*
- * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries. For entries
+ * that support it, the callback may flush only non-transactional fields.
* This is safe to call inside transactions.
*
* If anytime_only is false, flushes all entries.
@@ -1385,6 +1386,7 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
PgStat_Kind kind = key.kind;
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
bool did_flush;
+ bool is_partial_flush = false;
dlist_node *next;
Assert(!kind_info->fixed_amount);
@@ -1405,7 +1407,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
}
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait,
+ anytime_only, &is_partial_flush);
Assert(did_flush || nowait);
@@ -1415,8 +1418,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
else
next = NULL;
- /* if successfully flushed, remove entry */
- if (did_flush)
+ /* if successful non-partial flush, remove entry */
+ if (did_flush && !is_partial_flush)
pgstat_delete_pending_entry(entry_ref);
else
have_pending = true;
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 8e86df60461..59dd0790fd7 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,8 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
@@ -443,6 +444,9 @@ pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
pendingent = (PgStat_StatDBEntry *) entry_ref->pending;
sharedent = (PgStatShared_Database *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index 5ba4958382f..44193c93fc7 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,7 +190,8 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
@@ -200,6 +201,9 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index ae2952cae89..62363dacfe1 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,7 +47,19 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry,
+ PgStat_TableCounts *counts, bool anytime_only);
+/*
+ * Update database statistics with non-transactional stats.
+ */
+#define UPDATE_DATABASE_ANYTIME_STATS(dbentry, counts) \
+ do { \
+ (dbentry)->tuples_returned += (counts)->tuples_returned; \
+ (dbentry)->tuples_fetched += (counts)->tuples_fetched; \
+ (dbentry)->blocks_fetched += (counts)->blocks_fetched; \
+ (dbentry)->blocks_hit += (counts)->blocks_hit; \
+ } while (0)
/*
* Copy stats between relations. This is used for things like REINDEX
@@ -789,6 +801,29 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
rec->tuples_inserted + rec->tuples_updated;
}
+/*
+ * Helper function to flush non-transactional statistics.
+ */
+static void
+flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry, PgStat_TableCounts *counts,
+ bool anytime_only)
+{
+ TimestampTz t;
+
+ tabentry->numscans += counts->numscans;
+ if (counts->numscans)
+ {
+ t = anytime_only ? GetCurrentTimestamp() : GetCurrentTransactionStopTimestamp();
+ if (t > tabentry->lastscan)
+ tabentry->lastscan = t;
+ }
+
+ tabentry->tuples_returned += counts->tuples_returned;
+ tabentry->tuples_fetched += counts->tuples_fetched;
+ tabentry->blocks_fetched += counts->blocks_fetched;
+ tabentry->blocks_hit += counts->blocks_hit;
+}
+
/*
* Flush out pending stats for the entry
*
@@ -797,9 +832,17 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
*
* Some of the stats are copied to the corresponding pending database stats
* entry when successfully flushing.
+ *
+ * If anytime_only is true, only non-transactional fields are flushed
+ * (numscans, tuples_returned, tuples_fetched, blocks_fetched, blocks_hit).
+ * Transactional fields remain pending until transaction boundary.
+ *
+ * Some of the stats are copied to the corresponding pending database stats
+ * entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -807,12 +850,13 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
- Assert(!anytime_only);
-
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+ /* this is a partial flush if in anytime only mode */
+ *is_partial = anytime_only;
+
/*
* Ignore entries that didn't accumulate any actual counts, such as
* indexes that were opened by the planner but not used.
@@ -824,19 +868,36 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
- /* add the values to the shared entry. */
tabentry = &shtabstats->stats;
- tabentry->numscans += lstats->counts.numscans;
- if (lstats->counts.numscans)
+ if (anytime_only)
{
- TimestampTz t = GetCurrentTransactionStopTimestamp();
- if (t > tabentry->lastscan)
- tabentry->lastscan = t;
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, true);
+
+ pgstat_unlock_entry(entry_ref);
+
+ /* Also update the corresponding fields in database stats */
+ dbentry = pgstat_prep_database_pending(dboid);
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
+ /*
+ * Clear the flushed fields from pending stats to prevent
+ * double-counting when we flush all fields at transaction boundary.
+ */
+ lstats->counts.numscans = 0;
+ lstats->counts.tuples_returned = 0;
+ lstats->counts.tuples_fetched = 0;
+ lstats->counts.blocks_fetched = 0;
+ lstats->counts.blocks_hit = 0;
+
+ return true;
}
- tabentry->tuples_returned += lstats->counts.tuples_returned;
- tabentry->tuples_fetched += lstats->counts.tuples_fetched;
+
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, false);
+
tabentry->tuples_inserted += lstats->counts.tuples_inserted;
tabentry->tuples_updated += lstats->counts.tuples_updated;
tabentry->tuples_deleted += lstats->counts.tuples_deleted;
@@ -866,9 +927,6 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
*/
tabentry->ins_since_vacuum += lstats->counts.tuples_inserted;
- tabentry->blocks_fetched += lstats->counts.blocks_fetched;
- tabentry->blocks_hit += lstats->counts.blocks_hit;
-
/* Clamp live_tuples in case of negative delta_live_tuples */
tabentry->live_tuples = Max(tabentry->live_tuples, 0);
/* Likewise for dead_tuples */
@@ -878,13 +936,11 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
/* The entry was successfully flushed, add the same to database stats */
dbentry = pgstat_prep_database_pending(dboid);
- dbentry->tuples_returned += lstats->counts.tuples_returned;
- dbentry->tuples_fetched += lstats->counts.tuples_fetched;
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
dbentry->tuples_inserted += lstats->counts.tuples_inserted;
dbentry->tuples_updated += lstats->counts.tuples_updated;
dbentry->tuples_deleted += lstats->counts.tuples_deleted;
- dbentry->blocks_fetched += lstats->counts.blocks_fetched;
- dbentry->blocks_hit += lstats->counts.blocks_hit;
return true;
}
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 6b6eec7578d..bb32782a9d3 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -117,7 +117,8 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
@@ -127,6 +128,9 @@ pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anyti
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 90237c70829..d26ff26e3e3 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -20,6 +20,7 @@
#include "utils/backend_status.h" /* for backward compatibility */ /* IWYU pragma: export */
#include "utils/pgstat_kind.h"
#include "utils/relcache.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h" /* for backward compatibility */ /* IWYU pragma: export */
@@ -536,10 +537,11 @@ extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
/*
- * Schedule the next anytime stats update timeout.
+ * Schedule the next anytime stats update timeout and mark that we have
+ * mixed anytime stats pending.
*
* This should be called whenever accumulating statistics that support
- * FLUSH_ANYTIME flushing mode.
+ * FLUSH_ANYTIME or FLUSH_MIXED flushing modes.
*/
#define pgstat_schedule_anytime_update() \
do { \
@@ -705,37 +707,58 @@ extern void pgstat_report_analyze(Relation rel,
#define pgstat_count_heap_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_getnext(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_fetch(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_tuples(rel, n) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned += (n); \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_read(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_hit(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_hit++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 607f4255268..1a2114aad8a 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -322,8 +322,10 @@ typedef struct PgStat_KindInfo
* that cannot use PgStat_EntryRef->pending.
*
* The anytime_only parameter indicates whether this is an anytime flush.
+ * The is_partial parameter indicates whether this is a partial flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait,
+ bool anytime_only, bool *is_partial);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -757,7 +759,8 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -765,7 +768,8 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -790,7 +794,8 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -858,7 +863,8 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/test/isolation/expected/stats.out b/src/test/isolation/expected/stats.out
index cfad309ccf3..11e3e57806d 100644
--- a/src/test/isolation/expected/stats.out
+++ b/src/test/isolation/expected/stats.out
@@ -2245,6 +2245,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/expected/stats_1.out b/src/test/isolation/expected/stats_1.out
index e1d937784cb..aef582e7582 100644
--- a/src/test/isolation/expected/stats_1.out
+++ b/src/test/isolation/expected/stats_1.out
@@ -2253,6 +2253,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/specs/stats.spec b/src/test/isolation/specs/stats.spec
index da16710da0f..47414eb6009 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -50,6 +50,8 @@ step s1_rollback { ROLLBACK; }
step s1_prepare_a { PREPARE TRANSACTION 'a'; }
step s1_commit_prepared_a { COMMIT PREPARED 'a'; }
step s1_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
+# Has to be greater than session 2's stats_flush_interval
+step s1_sleep { SELECT pg_sleep(1.5); }
# Function stats steps
step s1_ff { SELECT pg_stat_force_next_flush(); }
@@ -132,12 +134,16 @@ step s1_slru_check_stats {
session s2
-setup { SET stats_fetch_consistency = 'none'; }
+setup {
+ SET stats_fetch_consistency = 'none';
+ SET stats_flush_interval = '1s';
+}
step s2_begin { BEGIN; }
step s2_commit { COMMIT; }
step s2_commit_prepared_a { COMMIT PREPARED 'a'; }
step s2_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
step s2_ff { SELECT pg_stat_force_next_flush(); }
+step s2_table_drop { DROP TABLE test_stat_tab; }
# Function stats steps
step s2_track_funcs_all { SET track_functions = 'all'; }
@@ -156,6 +162,8 @@ step s2_func_stats {
}
# Relation stats steps
+step s2_track_counts_on { SET track_counts = on; }
+step s2_track_counts_off { SET track_counts = off; }
step s2_table_select { SELECT * FROM test_stat_tab ORDER BY key, value; }
step s2_table_update_k1 { UPDATE test_stat_tab SET value = value + 1 WHERE key = 'k1';}
@@ -435,6 +443,23 @@ permutation
s1_table_drop
s1_table_stats
+### Check that some stats are updated (seq_scan and seq_tup_read)
+### while the transaction is still running
+permutation
+ s2_begin
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_off
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_on
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_table_drop
+ s2_commit
### Check that we don't count changes with track counts off, but allow access
### to prior stats
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index e9f1bda6b32..59f531df5f7 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -85,7 +85,8 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait, bool anytime_only);
+ bool nowait, bool anytime_only,
+ bool *is_partial);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -153,7 +154,8 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
@@ -161,6 +163,9 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
--
2.34.1
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-23 02:12 ` Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Sami Imseih @ 2026-02-23 02:12 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
> > I took a look at this today out of interest,
>
> Thanks!
>
> > so, instead of calling IS_INJECTION_POINT_ATTACHED macro which is
>
> I think that not calling IS_INJECTION_POINT_ATTACHED() but only relying on
> "ifdef USE_INJECTION_POINTS" would set the tiny timeout value to the entire test
> suite.
Yes, you are right about that.
To Michael's earlier comments:
> I don't find the design of this patch appealing, and my mind points
> towards two pieces of it:
> 1) The new requirement related to pgstat_schedule_anytime_update()
> that a stats kind needs to call to enable a timeout. This partially
> doubles with pgstat_report_fixed. And I suspect that this extra set
> of requirements, introducing a new level of complexity for in-core
> stats kinds as well as extension developers, would be the source of
> more bugs.
For fixed stats, I don't see how adding a
pgstat_schedule_anytime_update() call for such kinds
would be too complex or more error-prone. It looks quite straightforward,
as the call is made right before pgstat_report_fixed is set to true. Also,
because processes like the WAL sender can loop forever, as mentioned
here [1], a timeout at stats_flush_interval seems like a
straightforward way to deal with this problem.
For variable-length statistics, perhaps we can do things a bit
differently than what is currently proposed. 0005 requires
a relation anytime stat update to call
pgstat_schedule_anytime_update(). This is done this way because
it allows long-running queries to update their stats every
stats_flush_interval using a timeout.
But maybe what we should be doing for variable-numbered stats is
to schedule an anytime update whenever a "transaction goes idle".
This way, unlike the current state of things where we are only
updating relation stats at the end of a transaction, we are now
updating relation stats at the end of SQL execution, and within a
transaction.
Basically, we will continue scheduling anytime
updates every stats_flush_interval for fixed stats (xlog,
bgwriter, etc.), but for variable stats, we only update anytime
stats after SQL execution. This is better than what we have now,
where stats are only updated at the end of the transaction.
The timeout will only be needed to schedule an update
for fixed stats. For variable stats, we can use
GetCurrentStatementStartTimestamp, which is the timestamp of the
last query executed, to throttle pgstat_flush_pending_entries().
We can also flush variable number stats whenever we flush
fixed number stats, in case we enter a long
idle-in-transaction state after a few quick back-to-back queries.
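To make the throttling idea above concrete, here is a minimal sketch. It is
not taken from the patch: VarStatsThrottle, TimestampMs and
var_stats_flush_due() are hypothetical names, and plain millisecond integers
stand in for the value GetCurrentStatementStartTimestamp() would provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical millisecond timestamp; not PostgreSQL's TimestampTz. */
typedef int64_t TimestampMs;

/* Hypothetical per-backend throttle state for variable-numbered stats. */
typedef struct VarStatsThrottle
{
	TimestampMs last_flush;		/* statement start time at the last flush */
	TimestampMs min_interval;	/* minimum gap between two flushes */
} VarStatsThrottle;

/*
 * Decide, once a statement has finished, whether variable-numbered stats
 * should be flushed.  stmt_start is the start time of that statement.
 * Returns true (and records the flush time) when the minimum interval has
 * elapsed since the previous flush.
 */
static bool
var_stats_flush_due(VarStatsThrottle *t, TimestampMs stmt_start)
{
	if (stmt_start - t->last_flush < t->min_interval)
		return false;			/* too soon, keep the stats pending */

	t->last_flush = stmt_start;
	return true;
}
```

With a 1000 ms interval, quick back-to-back statements would skip the flush
and only a statement starting at least one second after the last flush would
trigger it. That is why the flush tied to the fixed-stats timeout is still
needed for the idle-in-transaction case.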
> 2) The timeout requirement itself, relying on a timeout threshold
> controlled by a backend-side configuration.
Perhaps we do not need stats_flush_interval at all and can just force
a 10-second timeout for fixed stats.
> With that in mind, wouldn't it be simpler if we introduced an API that
> could be used from client applications instead, in a model similar
> to what we do for procsignal.c/h? One such example is
> LOG_MEMORY_CONTEXT, where we have a SQL function that is able to tell
> to a backend that it needs to do something. I could see various
> benefits to this approach, because it gives more flexibility with the
> timing of the stats flushes, which may not be a backend-side only
> policy:
> - Use a cron bgworker in the backend, that scans pg_stat_activity, for
> example for long-running transactions based on a threshold.
> - Do the same periodic scan of pg_stat_activity, but from a client
> application.
I find it odd to ask applications/clients to trigger a flush. I am not saying
that we should not offer such an API, especially if someone wants to flush
stats more frequently than stats_flush_interval, but there should be
the ability for core to handle this automatically outside of the transaction
boundaries.
One comment about the current test. I think there is a bug that was
missed in the earlier review. For the var_anytime_update, we need to
have an escape before the pipe. Also, we should set
stats_fetch_consistency=none in that test.
-like($result, qr/^entry2|2|/m,
+like($result, qr/^entry2\|2\|/m,
Otherwise, the test returns a false positive.
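As an illustration of why the unescaped pipe matters, here is a small sketch
using POSIX extended regexes rather than Perl. The helper ere_matches() is a
hypothetical name, and the unescaped pattern is simplified to "^entry2|2"
(the trailing empty alternative is dropped because POSIX leaves a trailing
vertical line undefined). REG_NEWLINE plays the role of Perl's /m.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the POSIX extended regex "pattern" matches anywhere in
 * "text".  REG_NEWLINE lets ^ match at the start of each line.
 * A pattern that fails to compile is reported as "no match".
 */
static bool
ere_matches(const char *pattern, const char *text)
{
	regex_t		re;
	bool		matched;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE | REG_NOSUB) != 0)
		return false;

	matched = (regexec(&re, text, 0, NULL, 0) == 0);
	regfree(&re);
	return matched;
}
```

With this helper, ere_matches("^entry2|2", "entry1|2|") is true because the
bare "2" alternative matches on its own, while
ere_matches("^entry2\\|2\\|", "entry1|2|") is false: only the escaped form
requires the literal text "entry2|2|" at the start of a line.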
[1] https://www.postgresql.org/message-id/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4...
--
Sami Imseih
Amazon Web Services (AWS)
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-02-23 08:14 ` Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-23 08:14 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Sun, Feb 22, 2026 at 08:12:06PM -0600, Sami Imseih wrote:
> > > I took a look at this today out of interest,
> >
> > Thanks!
> >
> > > so, instead of calling IS_INJECTION_POINT_ATTACHED macro which is
> >
> > I think that not calling IS_INJECTION_POINT_ATTACHED() but only relying on
> > "ifdef USE_INJECTION_POINTS" would set the tiny timeout value to the entire test
> > suite.
>
> Yes, you are right about that.
And even if we are able to set a tiny timeout of, say, N in the test, the test
would still need a pg_sleep(N+<something>). So I wonder if the best way is not to
re-introduce pg_stat_force_anytime_flush() that was in v6?
> To Michael's earlier comments:
>
> For variable-length statistics, perhaps we can do things a bit
> differently than what is currently proposed. 0005 requires
> a relation anytime stat update to call
> pgstat_schedule_anytime_update(). This is done this way because
> it allows long-running queries to update their stats every
> stats_flush_interval using a timeout.
>
> But maybe what we should be doing for variable-numbered stats is
> to schedule an anytime update whenever a "transaction goes idle".
I think the logic for fixed stats and variable stats should be the same. If
not, we could observe discrepancies: for example, a long-running SELECT could
generate read/hit IO visible in pg_stat_io, but tuples_returned, tuples_fetched,
blocks_fetched or blocks_hit would not be updated until the session goes idle.
> I find it odd to ask applications/clients to trigger a flush. I am not saying
> that we should not offer such an API, especially if someone wants to flush
> stats more frequently than stats_flush_interval,
Yeah, and that could help with the tests to avoid the sleep.
> but there should be
> the ability for core to handle this automatically outside of the transaction
> boundaries.
Agreed. I think that a public-facing API could be an addition, not a replacement.
> One comment about the current test. I think there is a bug that was
> missed in the earlier review. For the var_anytime_update, we need to
> have an escape before the pipe.
>
> -like($result, qr/^entry2|2|/m,
> +like($result, qr/^entry2\|2\|/m,
Oh right, thanks! The attached version rewrites it like the fixed test instead.
The attached also includes a mandatory rebase due to 308622edf17.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-diff] v11-0001-Add-pgstat_report_anytime_stat-for-periodic-stat.patch (42.9K, 2-v11-0001-Add-pgstat_report_anytime_stat-for-periodic-stat.patch)
From 88752c424a416952dc3ead0403eabfec5b19f0c6 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 5 Jan 2026 09:41:39 +0000
Subject: [PATCH v11 1/5] Add pgstat_report_anytime_stat() for periodic stats
flushing
Long-running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal and produces
spikes when flushed.
This commit introduces pgstat_report_anytime_stat(), which flushes
non-transactional statistics even inside active transactions. A new timeout
handler fires every second (if enabled while adding pending stats) to call this
function, ensuring timely stats visibility without waiting for transaction completion.
Implementation details:
- Add PgStat_FlushMode enum to classify stats kinds:
* FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
* FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries
- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats()
to accept a boolean anytime_only parameter:
* When false: flushes all stats (existing behavior)
* When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY stats
- The flush_pending_cb and flush_static_cb callbacks now receive an anytime_only
boolean parameter. Most of the time it's not used (except for assertions), but it's
preparatory work for moving the relation stats to anytime (without introducing
a new callback).
- Add pgstat_schedule_anytime_update() macro to schedule the next anytime flush,
relying on PGSTAT_MIN_INTERVAL
The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate
flushing.
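As a rough illustration of the anytime_only filtering described above, here
is a simplified sketch. It is not the patch's code: PendingEntry, FlushMode
(as a typedef name) and flush_pending() are hypothetical stand-ins, and a
plain array replaces the pending-entries list.

```c
#include <stdbool.h>
#include <stddef.h>

/* The two modes named in the commit message. */
typedef enum FlushMode
{
	FLUSH_ANYTIME,
	FLUSH_AT_TXN_BOUNDARY
} FlushMode;

/* Hypothetical, simplified stand-in for a pending stats entry. */
typedef struct PendingEntry
{
	FlushMode	mode;
	bool		flushed;
} PendingEntry;

/*
 * Flush the eligible entries.  When anytime_only is true, entries that
 * require a transaction boundary are skipped and stay pending.
 * Returns true if at least one entry is still pending afterwards.
 */
static bool
flush_pending(PendingEntry *entries, size_t n, bool anytime_only)
{
	bool		have_pending = false;

	for (size_t i = 0; i < n; i++)
	{
		if (entries[i].flushed)
			continue;

		/* Skip transactional stats in anytime_only mode */
		if (anytime_only && entries[i].mode == FLUSH_AT_TXN_BOUNDARY)
		{
			have_pending = true;
			continue;
		}

		entries[i].flushed = true;
	}

	return have_pending;
}
```

Calling flush_pending() with anytime_only = true in the middle of a
transaction leaves the FLUSH_AT_TXN_BOUNDARY entries untouched. A later call
with anytime_only = false at the transaction boundary flushes the rest.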
---
src/backend/access/transam/xlog.c | 6 +
src/backend/postmaster/bgwriter.c | 9 +-
src/backend/postmaster/checkpointer.c | 10 +-
src/backend/postmaster/startup.c | 2 +
src/backend/postmaster/walsummarizer.c | 9 +-
src/backend/postmaster/walwriter.c | 9 +-
src/backend/replication/walreceiver.c | 9 +-
src/backend/replication/walsender.c | 8 +-
src/backend/tcop/postgres.c | 12 ++
src/backend/utils/activity/pgstat.c | 121 +++++++++++++++---
src/backend/utils/activity/pgstat_backend.c | 13 +-
src/backend/utils/activity/pgstat_bgwriter.c | 2 +-
.../utils/activity/pgstat_checkpointer.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 2 +-
src/backend/utils/activity/pgstat_function.c | 4 +-
src/backend/utils/activity/pgstat_io.c | 10 +-
src/backend/utils/activity/pgstat_relation.c | 12 +-
src/backend/utils/activity/pgstat_slru.c | 6 +-
.../utils/activity/pgstat_subscription.c | 4 +-
src/backend/utils/activity/pgstat_wal.c | 10 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/init/postinit.c | 3 +
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 22 ++++
src/include/utils/pgstat_internal.h | 52 ++++++--
src/include/utils/timeout.h | 1 +
.../test_custom_stats/test_custom_var_stats.c | 4 +-
src/tools/pgindent/typedefs.list | 1 +
28 files changed, 279 insertions(+), 66 deletions(-)
10.5% src/backend/postmaster/
5.8% src/backend/replication/
51.0% src/backend/utils/activity/
5.8% src/backend/
18.7% src/include/utils/
6.6% src/include/
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13cce9b49f1..cf29fc91f70 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1085,6 +1085,9 @@ XLogInsertRecord(XLogRecData *rdata,
pgWalUsage.wal_fpi += num_fpi;
pgWalUsage.wal_fpi_bytes += fpi_bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/* Required for the flush of pending stats WAL data */
pgstat_report_fixed = true;
}
@@ -2066,6 +2069,9 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, TimeLineID tli, bool opportunistic)
pgWalUsage.wal_buffers_full++;
TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DIRTY_DONE();
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
/*
* Required for the flush of pending stats WAL data, per
* update of pgWalUsage.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0956bd39a85..059c601c3b8 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -49,7 +49,9 @@
#include "storage/smgr.h"
#include "storage/standby.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
/*
@@ -103,7 +105,7 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -113,6 +115,11 @@ BackgroundWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* We just started, assume there has been either a shutdown or
* end-of-recovery snapshot.
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index e03c19123bc..e11c4b099c8 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -66,8 +66,9 @@
#include "utils/acl.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
-
+#include "utils/timeout.h"
/*----------
* Shared memory area for communication between checkpointer and backends
@@ -215,7 +216,7 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, ReqShutdownXLOG);
pqsignal(SIGTERM, SIG_IGN); /* ignore SIGTERM */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);
@@ -225,6 +226,11 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Initialize so that first time-driven event happens at the correct time.
*/
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index cdbe53dd262..4954fe425b7 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -32,6 +32,7 @@
#include "storage/standby.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/timeout.h"
@@ -245,6 +246,7 @@ StartupProcessMain(const void *startup_data, size_t startup_data_len)
RegisterTimeout(STANDBY_DEADLOCK_TIMEOUT, StandbyDeadLockHandler);
RegisterTimeout(STANDBY_TIMEOUT, StandbyTimeoutHandler);
RegisterTimeout(STANDBY_LOCK_TIMEOUT, StandbyLockTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
/*
* Unblock signals (they were blocked when the postmaster forked us)
diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 742137edad6..f1bae9d23d6 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -48,6 +48,8 @@
#include "storage/shmem.h"
#include "utils/guc.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h"
/*
@@ -246,7 +248,7 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -268,6 +270,11 @@ WalSummarizerMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* If an exception is encountered, processing resumes here.
*/
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 7c0e2809c17..bcf59227a00 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -61,7 +61,9 @@
#include "storage/smgr.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
+#include "utils/pgstat_internal.h"
#include "utils/resowner.h"
+#include "utils/timeout.h"
/*
@@ -103,7 +105,7 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN); /* no query to cancel */
pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN); /* not used */
@@ -113,6 +115,11 @@ WalWriterMain(const void *startup_data, size_t startup_data_len)
*/
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/*
* Create a memory context that we will do all our work in. We do this so
* that we can reset the context during error recovery and thereby avoid
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 7c1b8757d7d..aecc7a127e6 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -77,7 +77,9 @@
#include "utils/builtins.h"
#include "utils/guc.h"
#include "utils/pg_lsn.h"
+#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -252,7 +254,7 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
pqsignal(SIGINT, SIG_IGN);
pqsignal(SIGTERM, die); /* request shutdown */
/* SIGQUIT handler was already set up by InitPostmasterChild */
- pqsignal(SIGALRM, SIG_IGN);
+ InitializeTimeouts(); /* establishes SIGALRM handler */
pqsignal(SIGPIPE, SIG_IGN);
pqsignal(SIGUSR1, procsignal_sigusr1_handler);
pqsignal(SIGUSR2, SIG_IGN);
@@ -260,6 +262,11 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
/* Reset some signals that are accepted by postmaster but not here */
pqsignal(SIGCHLD, SIG_DFL);
+ /*
+ * Register timeouts needed
+ */
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT, AnytimeStatsUpdateTimeoutHandler);
+
/* Load the libpq-specific functions */
load_file("libpqwalreceiver", false);
if (WalReceiverFunctions == NULL)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..a7214d0dc6f 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1987,8 +1987,8 @@ WalSndWaitForWal(XLogRecPtr loc)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
@@ -3016,8 +3016,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c4..8c30efa2443 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3564,6 +3564,18 @@ ProcessInterrupts(void)
pgstat_report_stat(true);
}
+ /*
+ * Flush stats outside of transaction boundary if the timeout fired.
+ * Unlike transactional stats, these can be flushed even inside a running
+ * transaction.
+ */
+ if (AnytimeStatsUpdateTimeoutPending)
+ {
+ AnytimeStatsUpdateTimeoutPending = false;
+
+ pgstat_report_anytime_stat(false);
+ }
+
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5a..ddd331e2c81 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -108,10 +108,12 @@
#include "pgstat.h"
#include "storage/fd.h"
#include "storage/ipc.h"
+#include "storage/latch.h"
#include "storage/lwlock.h"
#include "utils/guc_hooks.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
@@ -122,8 +124,6 @@
* ----------
*/
-/* minimum interval non-forced stats flushes.*/
-#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -187,7 +187,8 @@ static void pgstat_init_snapshot_fixed(void);
static void pgstat_reset_after_failure(void);
-static bool pgstat_flush_pending_entries(bool nowait);
+static bool pgstat_flush_pending_entries(bool nowait, bool anytime_only);
+static bool pgstat_flush_fixed_stats(bool nowait, bool anytime_only);
static void pgstat_prep_snapshot(void);
static void pgstat_build_snapshot(void);
@@ -218,6 +219,12 @@ PgStat_LocalState pgStatLocal;
*/
bool pgstat_report_fixed = false;
+/*
+ * Track when there is pending anytime flush to avoid relying on
+ * get_timeout_active() in hot paths.
+ */
+bool pgstat_pending_anytime = false;
+
/* ----------
* Local data
*
@@ -288,6 +295,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -305,6 +313,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -321,6 +330,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
@@ -336,6 +346,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.accessed_across_databases = true,
@@ -353,6 +364,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_subscription_stats entries can be seen in all databases */
.accessed_across_databases = true,
@@ -370,6 +382,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = false,
+ .flush_mode = FLUSH_ANYTIME,
.accessed_across_databases = true,
@@ -436,6 +449,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -453,6 +467,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, slru),
.shared_ctl_off = offsetof(PgStat_ShmemControl, slru),
@@ -470,6 +485,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_ANYTIME,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -775,23 +791,11 @@ pgstat_report_stat(bool force)
partial_flush = false;
/* flush of variable-numbered stats tracked in pending entries list */
- partial_flush |= pgstat_flush_pending_entries(nowait);
+ partial_flush |= pgstat_flush_pending_entries(nowait, false);
/* flush of other stats kinds */
if (pgstat_report_fixed)
- {
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-
- if (!kind_info)
- continue;
- if (!kind_info->flush_static_cb)
- continue;
-
- partial_flush |= kind_info->flush_static_cb(nowait);
- }
- }
+ partial_flush |= pgstat_flush_fixed_stats(nowait, false);
last_flush = now;
@@ -1293,7 +1297,8 @@ pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *creat
if (entry_ref->pending == NULL)
{
- size_t entrysize = pgstat_get_kind_info(kind)->pending_size;
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ size_t entrysize = kind_info->pending_size;
Assert(entrysize != (size_t) -1);
@@ -1345,9 +1350,14 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * This is safe to call inside transactions.
+ *
+ * If anytime_only is false, flushes all entries.
*/
static bool
-pgstat_flush_pending_entries(bool nowait)
+pgstat_flush_pending_entries(bool nowait, bool anytime_only)
{
bool have_pending = false;
dlist_node *cur = NULL;
@@ -1377,8 +1387,22 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
Assert(did_flush || nowait);
@@ -1402,6 +1426,33 @@ pgstat_flush_pending_entries(bool nowait)
return have_pending;
}
+/*
+ * Flush fixed-amount stats.
+ *
+ * If anytime_only is true, only flushes FLUSH_ANYTIME stats (safe inside transactions).
+ * If anytime_only is false, flushes all stats with flush_static_cb.
+ */
+static bool
+pgstat_flush_fixed_stats(bool nowait, bool anytime_only)
+{
+ bool partial_flush = false;
+
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+ if (!kind_info || !kind_info->flush_static_cb)
+ continue;
+
+ /* Skip transactional stats if we're in anytime_only mode */
+ if (anytime_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ continue;
+
+ partial_flush |= kind_info->flush_static_cb(nowait, anytime_only);
+ }
+
+ return partial_flush;
+}
/* ------------------------------------------------------------
* Helper / infrastructure functions
@@ -2119,3 +2170,33 @@ assign_stats_fetch_consistency(int newval, void *extra)
if (pgstat_fetch_consistency != newval)
force_stats_snapshot_clear = true;
}
+
+/*
+ * Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
+ * stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
+ * Safe to call inside transactions.
+ */
+void
+pgstat_report_anytime_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+
+ pgstat_pending_anytime = false;
+}
+
+/*
+ * Timeout handler for flushing anytime stats.
+ */
+void
+AnytimeStatsUpdateTimeoutHandler(void)
+{
+ AnytimeStatsUpdateTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index f2f8d3ff75f..b09316d3ab3 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -31,6 +31,7 @@
#include "storage/procarray.h"
#include "utils/memutils.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
/*
* Backend statistics counts waiting to be flushed out. These counters may be
@@ -66,6 +67,9 @@ pgstat_count_backend_io_op_time(IOObject io_object, IOContext io_context,
INSTR_TIME_ADD(PendingBackendStats.pending_io.pending_times[io_object][io_context][io_op],
io_time);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -82,6 +86,9 @@ pgstat_count_backend_io_op(IOObject io_object, IOContext io_context,
PendingBackendStats.pending_io.counts[io_object][io_context][io_op] += cnt;
PendingBackendStats.pending_io.bytes[io_object][io_context][io_op] += bytes;
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
backend_has_iostats = true;
pgstat_report_fixed = true;
}
@@ -268,7 +275,7 @@ pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref)
* if some statistics could not be flushed due to lock contention.
*/
bool
-pgstat_flush_backend(bool nowait, bits32 flags)
+pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only)
{
PgStat_EntryRef *entry_ref;
bool has_pending_data = false;
@@ -311,9 +318,9 @@ pgstat_flush_backend(bool nowait, bits32 flags)
* If some stats could not be flushed due to lock contention, return true.
*/
bool
-pgstat_backend_flush_cb(bool nowait)
+pgstat_backend_flush_cb(bool nowait, bool anytime_only)
{
- return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL);
+ return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL, anytime_only);
}
/*
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index ed2fd801189..1c5f0c3ec40 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -61,7 +61,7 @@ pgstat_report_bgwriter(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index 1f70194b7a7..2d89a082464 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -68,7 +68,7 @@ pgstat_report_checkpointer(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..8e86df60461 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,7 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..5ba4958382f 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,11 +190,13 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
+ Assert(!anytime_only);
+
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 28de24538dc..7cd32900236 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -19,6 +19,7 @@
#include "executor/instrument.h"
#include "storage/bufmgr.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
static PgStat_PendingIO PendingIOStats;
static bool have_iostats = false;
@@ -79,6 +80,9 @@ pgstat_count_io_op(IOObject io_object, IOContext io_context, IOOp io_op,
/* Add the per-backend counts */
pgstat_count_backend_io_op(io_object, io_context, io_op, cnt, bytes);
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_iostats = true;
pgstat_report_fixed = true;
}
@@ -172,9 +176,9 @@ pgstat_fetch_stat_io(void)
* Simpler wrapper of pgstat_io_flush_cb()
*/
void
-pgstat_flush_io(bool nowait)
+pgstat_flush_io(bool nowait, bool anytime_only)
{
- (void) pgstat_io_flush_cb(nowait);
+ (void) pgstat_io_flush_cb(nowait, anytime_only);
}
/*
@@ -186,7 +190,7 @@ pgstat_flush_io(bool nowait)
* acquired. Otherwise, return false.
*/
bool
-pgstat_io_flush_cb(bool nowait)
+pgstat_io_flush_cb(bool nowait, bool anytime_only)
{
LWLock *bktype_lock;
PgStat_BktypeIO *bktype_shstats;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..04d21483d93 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -267,8 +267,8 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
* is done -- which will likely vacuum many relations -- or until the
* VACUUM command has processed all tables and committed.
*/
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -362,8 +362,8 @@ pgstat_report_analyze(Relation rel,
pgstat_unlock_entry(entry_ref);
/* see pgstat_report_vacuum() */
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -812,7 +812,7 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
* entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -820,6 +820,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
+ Assert(!anytime_only);
+
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..bf8a4d58673 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -19,6 +19,7 @@
#include "utils/pgstat_internal.h"
#include "utils/timestamp.h"
+#include "utils/timeout.h"
static inline PgStat_SLRUStats *get_slru_entry(int slru_idx);
@@ -139,7 +140,7 @@ pgstat_get_slru_index(const char *name)
* acquired. Otherwise return false.
*/
bool
-pgstat_slru_flush_cb(bool nowait)
+pgstat_slru_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_SLRU *stats_shmem = &pgStatLocal.shmem->slru;
int i;
@@ -223,6 +224,9 @@ get_slru_entry(int slru_idx)
Assert((slru_idx >= 0) && (slru_idx < SLRU_NUM_ELEMENTS));
+ /* Schedule next anytime stats update timeout */
+ pgstat_schedule_anytime_update();
+
have_slrustats = true;
pgstat_report_fixed = true;
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 3277cf88a4e..6b6eec7578d 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -117,11 +117,13 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
+ Assert(!anytime_only);
+
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 183e0a7a97b..2c2f3f10e10 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -51,12 +51,12 @@ pgstat_report_wal(bool force)
nowait = !force;
/* flush wal stats */
- (void) pgstat_wal_flush_cb(nowait);
- pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL);
+ (void) pgstat_wal_flush_cb(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL, true);
/* flush IO stats */
- pgstat_flush_io(nowait);
- (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -88,7 +88,7 @@ pgstat_wal_have_pending(void)
* acquired. Otherwise return false.
*/
bool
-pgstat_wal_flush_cb(bool nowait)
+pgstat_wal_flush_cb(bool nowait, bool anytime_only)
{
PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
WalUsage wal_usage_diff = {0};
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..ad44826c39e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -40,6 +40,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index b59e08605cc..eeeac1bf39a 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -64,6 +64,7 @@
#include "utils/injection_point.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
+#include "utils/pgstat_internal.h"
#include "utils/portal.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -773,6 +774,8 @@ InitPostgres(const char *in_dbname, Oid dboid,
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
+ RegisterTimeout(ANYTIME_STATS_UPDATE_TIMEOUT,
+ AnytimeStatsUpdateTimeoutHandler);
}
/*
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..84e698da214 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -96,6 +96,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t AnytimeStatsUpdateTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 9bb777c3d5a..b011a315679 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -34,6 +34,9 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/* Minimum interval between non-forced stats flushes */
+#define PGSTAT_MIN_INTERVAL 1000
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -532,8 +535,24 @@ extern void pgstat_initialize(void);
/* Functions called from backends */
extern long pgstat_report_stat(bool force);
+extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
+/*
+ * Schedule the next anytime stats update timeout.
+ *
+ * This should be called whenever accumulating statistics that support
+ * FLUSH_ANYTIME flushing mode.
+ */
+#define pgstat_schedule_anytime_update() \
+ do { \
+ if (IsUnderPostmaster && !pgstat_pending_anytime) \
+ { \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ pgstat_pending_anytime = true; \
+ } \
+ } while (0)
+
extern void pgstat_reset_counters(void);
extern void pgstat_reset(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_reset_of_kind(PgStat_Kind kind);
@@ -806,6 +825,8 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
* Variables in pgstat.c
*/
+extern PGDLLIMPORT bool pgstat_pending_anytime;
+
/* GUC parameters */
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
@@ -849,4 +870,5 @@ extern PGDLLIMPORT PgStat_Counter pgStatTransactionIdleTime;
/* updated by the traffic cop and in errfinish() */
extern PGDLLIMPORT SessionEndType pgStatSessionEndCause;
+
#endif /* PGSTAT_H */
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed..607f4255268 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,19 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Flush mode for statistics kinds.
+ *
+ * FLUSH_AT_TXN_BOUNDARY has to be the first because we want it to be the
+ * default value.
+ */
+typedef enum PgStat_FlushMode
+{
+ FLUSH_AT_TXN_BOUNDARY, /* All fields can only be flushed at
+ * transaction boundary */
+ FLUSH_ANYTIME, /* All fields can be flushed anytime,
+ * including within transactions */
+} PgStat_FlushMode;
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +264,16 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * The mode of when to flush stats. See PgStat_FlushMode for more details.
+ *
+ * This member only has meaning for statistics kinds that accumulate
+ * pending stats and use flush callbacks. For kinds that write directly to
+ * shared memory (e.g., archiver, bgwriter, checkpointer), this member has
+ * no effect.
+ */
+ PgStat_FlushMode flush_mode;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
@@ -297,8 +320,10 @@ typedef struct PgStat_KindInfo
* For variable-numbered stats: flush pending stats. Required if pending
* data is used. See flush_static_cb when dealing with stats data that
* that cannot use PgStat_EntryRef->pending.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -366,8 +391,10 @@ typedef struct PgStat_KindInfo
*
* "pgstat_report_fixed" needs to be set to trigger the flush of pending
* stats.
+ *
+ * The anytime_only parameter indicates whether this is an anytime flush.
*/
- bool (*flush_static_cb) (bool nowait);
+ bool (*flush_static_cb) (bool nowait, bool anytime_only);
/*
* For fixed-numbered statistics: Reset All.
@@ -677,6 +704,7 @@ extern PgStat_EntryRef *pgstat_fetch_pending_entry(PgStat_Kind kind,
extern void *pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid);
extern void pgstat_snapshot_fixed(PgStat_Kind kind);
+extern void AnytimeStatsUpdateTimeoutHandler(void);
/*
@@ -696,8 +724,8 @@ extern void pgstat_archiver_snapshot_cb(void);
#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */
#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL)
-extern bool pgstat_flush_backend(bool nowait, bits32 flags);
-extern bool pgstat_backend_flush_cb(bool nowait);
+extern bool pgstat_flush_backend(bool nowait, bits32 flags, bool anytime_only);
+extern bool pgstat_backend_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_backend_reset_timestamp_cb(PgStatShared_Common *header,
TimestampTz ts);
@@ -729,7 +757,7 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -737,7 +765,7 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -745,9 +773,9 @@ extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_io.c
*/
-extern void pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait, bool anytime_only);
-extern bool pgstat_io_flush_cb(bool nowait);
+extern bool pgstat_io_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_io_init_shmem_cb(void *stats);
extern void pgstat_io_reset_all_cb(TimestampTz ts);
extern void pgstat_io_snapshot_cb(void);
@@ -762,7 +790,7 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -809,7 +837,7 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
* Functions in pgstat_slru.c
*/
-extern bool pgstat_slru_flush_cb(bool nowait);
+extern bool pgstat_slru_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_slru_init_shmem_cb(void *stats);
extern void pgstat_slru_reset_all_cb(TimestampTz ts);
extern void pgstat_slru_snapshot_cb(void);
@@ -820,7 +848,7 @@ extern void pgstat_slru_snapshot_cb(void);
*/
extern void pgstat_wal_init_backend_cb(void);
-extern bool pgstat_wal_flush_cb(bool nowait);
+extern bool pgstat_wal_flush_cb(bool nowait, bool anytime_only);
extern void pgstat_wal_init_shmem_cb(void *stats);
extern void pgstat_wal_reset_all_cb(TimestampTz ts);
extern void pgstat_wal_snapshot_cb(void);
@@ -830,7 +858,7 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 0965b590b34..10723bb664c 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -35,6 +35,7 @@ typedef enum TimeoutId
IDLE_SESSION_TIMEOUT,
IDLE_STATS_UPDATE_TIMEOUT,
CLIENT_CONNECTION_CHECK_TIMEOUT,
+ ANYTIME_STATS_UPDATE_TIMEOUT,
STARTUP_PROGRESS_TIMEOUT,
/* First user-definable timeout reason */
USER_TIMEOUT,
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index da28afbd929..4c207611236 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -84,7 +84,7 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait);
+ bool nowait, bool anytime_only);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -151,7 +151,7 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..1dbc4b96f51 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2271,6 +2271,7 @@ PgStat_Counter
PgStat_EntryRef
PgStat_EntryRefHashEntry
PgStat_FetchConsistency
+PgStat_FlushMode
PgStat_FunctionCallUsage
PgStat_FunctionCounts
PgStat_HashKey
--
2.34.1
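
For readers skimming the patch above, the anytime_only filtering in pgstat_flush_pending_entries() and pgstat_flush_fixed_stats() can be pictured with a small standalone model. This is only a sketch under stated assumptions: the Toy* names are hypothetical and exist solely for illustration, nothing here is PostgreSQL code, and locking (the nowait path) is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-ins (hypothetical names, not PostgreSQL code).  As in the
 * patch, the transactional mode is the zero value, so a kind that does
 * not set flush_mode defaults to it.
 */
typedef enum ToyFlushMode
{
	TOY_FLUSH_AT_TXN_BOUNDARY,
	TOY_FLUSH_ANYTIME
} ToyFlushMode;

typedef struct ToyKind
{
	ToyFlushMode flush_mode;
	int			pending;	/* counts accumulated locally */
	int			shared;		/* counts already visible to others */
} ToyKind;

/*
 * Flush every kind permitted by anytime_only.  Returns true if some
 * data is still pending afterwards, i.e. transactional kinds were
 * skipped during an anytime flush.
 */
static bool
toy_flush(ToyKind *kinds, size_t nkinds, bool anytime_only)
{
	bool		have_pending = false;

	for (size_t i = 0; i < nkinds; i++)
	{
		ToyKind    *k = &kinds[i];

		if (k->pending == 0)
			continue;

		/* Skip transactional stats if we're in anytime_only mode */
		if (anytime_only && k->flush_mode == TOY_FLUSH_AT_TXN_BOUNDARY)
		{
			have_pending = true;
			continue;
		}

		k->shared += k->pending;
		k->pending = 0;
	}

	return have_pending;
}
```

Calling toy_flush() with anytime_only = true moves only the TOY_FLUSH_ANYTIME counts and reports that data remains pending; a later call with anytime_only = false, which models the transaction-boundary flush, drains the rest.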
[text/x-diff] v11-0002-Add-anytime-flush-tests-for-custom-stats.patch (9.1K, 3-v11-0002-Add-anytime-flush-tests-for-custom-stats.patch)
From 7e6ab1d0ce578486fb96b55a56162f71fe5a370c Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Thu, 5 Feb 2026 05:54:34 +0000
Subject: [PATCH v11 2/5] Add anytime flush tests for custom stats
---
.../test_custom_stats/t/001_custom_stats.pl | 43 ++++++++++++++
.../test_custom_fixed_stats--1.0.sql | 5 ++
.../test_custom_fixed_stats.c | 57 +++++++++++++++++++
.../test_custom_var_stats--1.0.sql | 5 ++
.../test_custom_stats/test_custom_var_stats.c | 27 +++++++++
5 files changed, 137 insertions(+)
35.8% src/test/modules/test_custom_stats/t/
64.1% src/test/modules/test_custom_stats/
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 9e6a7a38577..6ba4022418f 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -156,5 +156,48 @@ $result = $node->safe_psql('postgres',
);
is($result, "0", "report of fixed-sized after manual reset");
+# Test FLUSH_ANYTIME mechanism with custom fixed stats
+# This verifies that custom stats can be flushed during a transaction
+
+# Reset stats first
+$node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+my $anytime_test = q[
+ BEGIN;
+ SET LOCAL stats_fetch_consistency = none;
+ -- Accumulate stats
+ select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select 'fixed_anytime:'||numcalls from test_custom_stats_fixed_report();
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^fixed_anytime:2/m,
+ "anytime fixed stats flushed during transaction");
+
+# Test FLUSH_ANYTIME mechanism with custom variable stats
+# This verifies that custom stats can be flushed during a transaction
+
+$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
+
+$anytime_test = q[
+ BEGIN;
+ SET LOCAL stats_fetch_consistency = none;
+ -- Accumulate stats
+ select test_custom_stats_var_anytime_update('entry2');
+ select test_custom_stats_var_anytime_update('entry2');
+ -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ select pg_sleep(1.5);
+ -- Check
+ select 'var_anytime:'||calls from test_custom_stats_var_report('entry2');
+];
+
+$result = $node->safe_psql('postgres', $anytime_test);
+like($result, qr/^var_anytime:2/m,
+ "anytime var stats flushed during transaction");
+
# Test completed successfully
done_testing();
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
index 69a93b5241f..da3a798f289 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats--1.0.sql
@@ -18,3 +18,8 @@ CREATE FUNCTION test_custom_stats_fixed_reset()
RETURNS void
AS 'MODULE_PATHNAME', 'test_custom_stats_fixed_reset'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_fixed_anytime_update()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 485e08e5c19..e7fbb2737ef 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -18,6 +18,7 @@
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
#include "utils/timestamp.h"
PG_MODULE_MAGIC_EXT(
@@ -44,11 +45,13 @@ typedef struct PgStatShared_CustomFixedEntry
static void test_custom_stats_fixed_init_shmem_cb(void *stats);
static void test_custom_stats_fixed_reset_all_cb(TimestampTz ts);
static void test_custom_stats_fixed_snapshot_cb(void);
+static bool test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only);
static const PgStat_KindInfo custom_stats = {
.name = "test_custom_fixed_stats",
.fixed_amount = true, /* exactly one entry */
.write_to_file = true, /* persist to stats file */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.shared_size = sizeof(PgStat_StatCustomFixedEntry),
.shared_data_off = offsetof(PgStatShared_CustomFixedEntry, stats),
@@ -57,8 +60,12 @@ static const PgStat_KindInfo custom_stats = {
.init_shmem_cb = test_custom_stats_fixed_init_shmem_cb,
.reset_all_cb = test_custom_stats_fixed_reset_all_cb,
.snapshot_cb = test_custom_stats_fixed_snapshot_cb,
+ .flush_static_cb = test_custom_stats_fixed_flush_cb,
};
+/* Pending statistics */
+static PgStat_StatCustomFixedEntry PendingCustomStats = {0};
+
/*
* Kind ID for test_custom_fixed_stats.
*/
@@ -142,6 +149,38 @@ test_custom_stats_fixed_snapshot_cb(void)
#undef FIXED_COMP
}
+/*
+ * test_custom_stats_fixed_flush_cb
+ * Flush pending stats to shared memory
+ */
+static bool
+test_custom_stats_fixed_flush_cb(bool nowait, bool anytime_only)
+{
+ PgStatShared_CustomFixedEntry *stats_shmem;
+
+ /* Nothing to flush if no calls were made */
+ if (PendingCustomStats.numcalls == 0)
+ return false;
+
+ stats_shmem = pgstat_get_custom_shmem_data(PGSTAT_KIND_TEST_CUSTOM_FIXED_STATS);
+
+ if (!nowait)
+ LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+ else if (!LWLockConditionalAcquire(&stats_shmem->lock, LW_EXCLUSIVE))
+ return true;
+
+ pgstat_begin_changecount_write(&stats_shmem->changecount);
+ stats_shmem->stats.numcalls += PendingCustomStats.numcalls;
+ pgstat_end_changecount_write(&stats_shmem->changecount);
+
+ LWLockRelease(&stats_shmem->lock);
+
+ /* Reset pending stats */
+ PendingCustomStats.numcalls = 0;
+
+ return false; /* successfully flushed */
+}
+
/*--------------------------------------------------------------------------
* SQL-callable functions
*--------------------------------------------------------------------------
@@ -223,3 +262,21 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
/* Return as tuple */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
+
+/*
+ * test_custom_stats_fixed_anytime_update
+ * Increment call counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_fixed_anytime_update);
+Datum
+test_custom_stats_fixed_anytime_update(PG_FUNCTION_ARGS)
+{
+ /* Accumulate in pending stats */
+ PendingCustomStats.numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+ pgstat_report_fixed = true;
+
+ PG_RETURN_VOID();
+}
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index 5ed8cfc2dcf..ed66d38981e 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -24,3 +24,8 @@ CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_stats_var_anytime_update(IN name TEXT)
+RETURNS void
+AS 'MODULE_PATHNAME', 'test_custom_stats_var_anytime_update'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 4c207611236..e9f1bda6b32 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -18,6 +18,7 @@
#include "storage/dsm_registry.h"
#include "utils/builtins.h"
#include "utils/pgstat_internal.h"
+#include "utils/timeout.h"
PG_MODULE_MAGIC_EXT(
.name = "test_custom_var_stats",
@@ -108,6 +109,7 @@ static const PgStat_KindInfo custom_stats = {
.name = "test_custom_var_stats",
.fixed_amount = false, /* variable number of entries */
.write_to_file = true, /* persist across restarts */
+ .flush_mode = FLUSH_ANYTIME, /* can be flushed anytime */
.track_entry_count = true, /* count active entries */
.accessed_across_databases = true, /* global statistics */
.shared_size = sizeof(PgStatShared_CustomVarEntry),
@@ -690,3 +692,28 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*
+ * test_custom_stats_var_anytime_update
+ * Increment custom statistic counter and schedule anytime flush
+ */
+PG_FUNCTION_INFO_V1(test_custom_stats_var_anytime_update);
+Datum
+test_custom_stats_var_anytime_update(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_EntryRef *entry_ref;
+ PgStat_StatCustomVarEntry *pending_entry;
+
+ /* Get pending entry in local memory */
+ entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
+ PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), NULL);
+
+ pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
+ pending_entry->numcalls++;
+
+ /* Schedule anytime stats update */
+ pgstat_schedule_anytime_update();
+
+ PG_RETURN_VOID();
+}
--
2.34.1
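
The test functions above both call pgstat_schedule_anytime_update(). The arm-once behaviour of that macro, together with the timeout handler, can be modelled in isolation as below. This is a hedged sketch: the Toy* names are hypothetical, the timer is reduced to a counter, and the step that consumes the pending flag (toy_process_interrupts) is an assumption about how the flag is used, since that code is not part of the hunks shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; the real code uses backend-local globals. */
typedef struct ToyAnytimeState
{
	bool		pending_anytime;	/* models pgstat_pending_anytime */
	bool		timeout_fired;		/* models AnytimeStatsUpdateTimeoutPending */
	int			timers_armed;		/* how many times a timeout was armed */
	int			flushes;			/* how many anytime flushes ran */
} ToyAnytimeState;

/* Arm the timeout only if no anytime flush is already scheduled. */
static void
toy_schedule_anytime_update(ToyAnytimeState *st)
{
	if (!st->pending_anytime)
	{
		st->timers_armed++;
		st->pending_anytime = true;
	}
}

/* The handler only sets a flag; the actual work happens later. */
static void
toy_timeout_handler(ToyAnytimeState *st)
{
	st->timeout_fired = true;
}

/* Assumed consumer: run the flush once the flag is seen, then re-allow arming. */
static void
toy_process_interrupts(ToyAnytimeState *st)
{
	if (st->timeout_fired)
	{
		st->timeout_fired = false;
		st->flushes++;
		st->pending_anytime = false;
	}
}
```

In this model, any number of counter updates between two flushes arms the timer once, which is why the counting paths in the patch can call the macro unconditionally.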
[text/x-diff] v11-0003-Add-GUC-to-specify-non-transactional-statistics-.patch (10.0K, 4-v11-0003-Add-GUC-to-specify-non-transactional-statistics-.patch)
From 205e22a432753760b3110e851955fd3421ff3564 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Wed, 28 Jan 2026 07:53:13 +0000
Subject: [PATCH v11 3/5] Add GUC to specify non-transactional statistics flush
interval
Add stats_flush_interval, a new GUC (backed by the pgstat_flush_interval
variable) that sets the interval between flushes of non-transactional
statistics.
---
doc/src/sgml/config.sgml | 32 +++++++++++++++++++
src/backend/utils/activity/pgstat.c | 13 ++++++++
src/backend/utils/misc/guc_parameters.dat | 10 ++++++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/backend/utils/misc/timeout.c | 6 ++++
src/include/pgstat.h | 6 ++--
src/include/utils/guc_hooks.h | 1 +
src/include/utils/timeout.h | 1 +
.../test_custom_stats/t/001_custom_stats.pl | 6 ++--
9 files changed, 70 insertions(+), 6 deletions(-)
51.0% doc/src/sgml/
10.6% src/backend/utils/activity/
15.9% src/backend/utils/misc/
3.6% src/include/utils/
9.0% src/include/
9.6% src/test/modules/test_custom_stats/t/
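
A minimal standalone sketch of the scheduling behaviour this patch relies on, assuming the timeout handler clears the pending flag once it has flushed (that handler lives in an earlier patch of the series and is not shown here). AnytimeFlushState, schedule_anytime_update() and flush_if_due() are hypothetical names used only for illustration; they are not PostgreSQL APIs. The point is that the timeout is armed once per batch of pending stats, using the configured interval rather than a fixed constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one backend's flush state; not a PostgreSQL type. */
typedef struct AnytimeFlushState
{
	int			flush_interval_ms;	/* models stats_flush_interval */
	bool		pending_anytime;	/* models pgstat_pending_anytime */
	bool		timer_armed;		/* is a flush timeout scheduled? */
	int64_t		fire_at_ms;			/* absolute time at which it fires */
	int			arm_count;			/* how many times the timer was armed */
} AnytimeFlushState;

/*
 * Models the scheduling macro: arm the timeout only if nothing is pending
 * yet, so repeated counter updates do not keep pushing the deadline out.
 */
void
schedule_anytime_update(AnytimeFlushState *st, int64_t now_ms)
{
	assert(st->flush_interval_ms > 0);

	if (!st->pending_anytime)
	{
		st->timer_armed = true;
		st->fire_at_ms = now_ms + st->flush_interval_ms;
		st->arm_count++;
		st->pending_anytime = true;
	}
}

/*
 * Models the timeout firing (assumption: the flush clears the pending flag).
 * Returns true if a flush happened at now_ms.
 */
bool
flush_if_due(AnytimeFlushState *st, int64_t now_ms)
{
	if (!st->timer_armed || now_ms < st->fire_at_ms)
		return false;

	st->timer_armed = false;
	st->pending_anytime = false;
	return true;
}
```

With a 1000 ms interval, two updates at t=0 and t=100 arm the timer once, and a check at t=1500 finds the flush due. That is the same shape as the TAP test below, which sets the interval to 1s and sleeps for 1.5s.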
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20dbcaeb3ee..1eed71007a7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -8929,6 +8929,38 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-stats-flush-interval" xreflabel="stats_flush_interval">
+ <term><varname>stats_flush_interval</varname> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>stats_flush_interval</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Sets the interval at which certain statistics, which can be updated while a
+ transaction is in progress, are made visible. These include WAL activity
+ and I/O operations.
+ Such statistics are refreshed at the specified interval and can be observed
+ during active transactions in monitoring views such as
+ <link linkend="monitoring-pg-stat-wal-view"><structname>pg_stat_wal</structname></link>
+ and
+ <link linkend="monitoring-pg-stat-io-view"><structname>pg_stat_io</structname></link>.
+ If the value is specified without a unit, milliseconds are assumed.
+ The default is 10 seconds (<literal>10s</literal>), which is generally
+ the smallest practical value for long-running transactions.
+ </para>
+ <note>
+ <para>
+ This parameter does not affect statistics that are only reported at
+ transaction end, such as the columns of <structname>pg_stat_all_tables</structname>
+ (for example, <structfield>n_tup_ins</structfield>, <structfield>n_tup_upd</structfield>,
+ and <structfield>n_tup_del</structfield>). These statistics are always
+ flushed at the end of a transaction.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index ddd331e2c81..fd6ab0db16f 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -124,6 +124,8 @@
* ----------
*/
+/* minimum interval between non-forced stats flushes */
+#define PGSTAT_MIN_INTERVAL 1000
/* how long until to block flushing pending stats updates */
#define PGSTAT_MAX_INTERVAL 60000
/* when to call pgstat_report_stat() again, even when idle */
@@ -204,6 +206,7 @@ static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
bool pgstat_track_counts = false;
int pgstat_fetch_consistency = PGSTAT_FETCH_CONSISTENCY_CACHE;
+int pgstat_flush_interval = 10000;
/* ----------
@@ -2171,6 +2174,16 @@ assign_stats_fetch_consistency(int newval, void *extra)
force_stats_snapshot_clear = true;
}
+/*
+ * GUC assign_hook for stats_flush_interval.
+ */
+void
+assign_stats_flush_interval(int newval, void *extra)
+{
+ if (get_all_timeouts_initialized())
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, newval);
+}
+
/*
* Flushes only FLUSH_ANYTIME stats using non-blocking locks. Transactional
* stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 9507778415d..073e08c7892 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2801,6 +2801,16 @@
assign_hook => 'assign_stats_fetch_consistency',
},
+{ name => 'stats_flush_interval', type => 'int', context => 'PGC_USERSET', group => 'STATS_CUMULATIVE',
+ short_desc => 'Sets the interval between flushes of non-transactional statistics.',
+ flags => 'GUC_UNIT_MS',
+ variable => 'pgstat_flush_interval',
+ boot_val => '10000',
+ min => '1000',
+ max => 'INT_MAX',
+ assign_hook => 'assign_stats_flush_interval'
+},
+
{ name => 'subtransaction_buffers', type => 'int', context => 'PGC_POSTMASTER', group => 'RESOURCES_MEM',
short_desc => 'Sets the size of the dedicated buffer pool used for the subtransaction cache.',
long_desc => '0 means use a fraction of "shared_buffers".',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..8bd37a25b38 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -688,6 +688,7 @@
#track_wal_io_timing = off
#track_functions = none # none, pl, all
#stats_fetch_consistency = cache # cache, none, snapshot
+#stats_flush_interval = 10s # in milliseconds
# - Monitoring -
diff --git a/src/backend/utils/misc/timeout.c b/src/backend/utils/misc/timeout.c
index ddba5dc607c..85c4260d1db 100644
--- a/src/backend/utils/misc/timeout.c
+++ b/src/backend/utils/misc/timeout.c
@@ -828,3 +828,9 @@ get_timeout_finish_time(TimeoutId id)
{
return all_timeouts[id].fin_time;
}
+
+bool
+get_all_timeouts_initialized(void)
+{
+ return all_timeouts_initialized;
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index b011a315679..90237c70829 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -34,9 +34,6 @@
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
-/* Minimum interval non-forced stats flushes */
-#define PGSTAT_MIN_INTERVAL 1000
-
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -548,7 +545,7 @@ extern void pgstat_force_next_flush(void);
do { \
if (IsUnderPostmaster && !pgstat_pending_anytime) \
{ \
- enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, PGSTAT_MIN_INTERVAL); \
+ enable_timeout_after(ANYTIME_STATS_UPDATE_TIMEOUT, pgstat_flush_interval); \
pgstat_pending_anytime = true; \
} \
} while (0)
@@ -831,6 +828,7 @@ extern PGDLLIMPORT bool pgstat_pending_anytime;
extern PGDLLIMPORT bool pgstat_track_counts;
extern PGDLLIMPORT int pgstat_track_functions;
extern PGDLLIMPORT int pgstat_fetch_consistency;
+extern PGDLLIMPORT int pgstat_flush_interval;
/*
diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h
index 9c90670d9b8..9b5d2a90387 100644
--- a/src/include/utils/guc_hooks.h
+++ b/src/include/utils/guc_hooks.h
@@ -132,6 +132,7 @@ extern bool check_session_authorization(char **newval, void **extra, GucSource s
extern void assign_session_authorization(const char *newval, void *extra);
extern void assign_session_replication_role(int newval, void *extra);
extern void assign_stats_fetch_consistency(int newval, void *extra);
+extern void assign_stats_flush_interval(int newval, void *extra);
extern bool check_ssl(bool *newval, void **extra, GucSource source);
extern bool check_stage_log_stats(bool *newval, void **extra, GucSource source);
extern bool check_standard_conforming_strings(bool *newval, void **extra,
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 10723bb664c..fe7327de209 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -93,5 +93,6 @@ extern bool get_timeout_active(TimeoutId id);
extern bool get_timeout_indicator(TimeoutId id, bool reset_indicator);
extern TimestampTz get_timeout_start_time(TimeoutId id);
extern TimestampTz get_timeout_finish_time(TimeoutId id);
+extern bool get_all_timeouts_initialized(void);
#endif /* TIMEOUT_H */
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 6ba4022418f..920443487c0 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -164,11 +164,12 @@ $node->safe_psql('postgres', q(select test_custom_stats_fixed_reset()));
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
my $anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
SET LOCAL stats_fetch_consistency = none;
-- Accumulate stats
select test_custom_stats_fixed_anytime_update() from generate_series(1, 2);
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select 'fixed_anytime:'||numcalls from test_custom_stats_fixed_report();
@@ -184,12 +185,13 @@ like($result, qr/^fixed_anytime:2/m,
$node->safe_psql('postgres', q(select pg_stat_force_next_flush()));
$anytime_test = q[
+ SET stats_flush_interval = '1s';
BEGIN;
SET LOCAL stats_fetch_consistency = none;
-- Accumulate stats
select test_custom_stats_var_anytime_update('entry2');
select test_custom_stats_var_anytime_update('entry2');
- -- Wait (has to be greater than PGSTAT_MIN_INTERVAL)
+ -- Wait (has to be greater than stats_flush_interval)
select pg_sleep(1.5);
-- Check
select 'var_anytime:'||calls from test_custom_stats_var_report('entry2');
--
2.34.1
[text/x-diff] v11-0004-Remove-useless-calls-to-flush-some-stats.patch (7.7K, 5-v11-0004-Remove-useless-calls-to-flush-some-stats.patch)
From aeac6f996ef494124c5888ce2699e9d7b03c435a Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Tue, 6 Jan 2026 11:06:31 +0000
Subject: [PATCH v11 4/5] Remove useless calls to flush some stats
Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to report/flush some stats. Those calls were in place because
before commit <XXXX> stats were flushed only at transaction boundaries.
Note that:
- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for checkpointer and bgworker for example because they don't
have a flush callback to call
- it can't be done for auxiliary processes (walsummarizer for example) because they
currently do not register the new timeout handler
---
src/backend/replication/walreceiver.c | 10 ------
src/backend/replication/walsender.c | 36 ++------------------
src/backend/utils/activity/pgstat_relation.c | 13 -------
src/test/recovery/t/001_stream_rep.pl | 1 +
src/test/subscription/t/001_rep_changes.pl | 1 +
5 files changed, 4 insertions(+), 57 deletions(-)
69.4% src/backend/replication/
23.4% src/backend/utils/activity/
3.5% src/test/recovery/t/
3.6% src/test/subscription/t/
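
For readers skimming the removals, here is a standalone sketch of the hand-rolled throttle that the walsender hunks drop: flush only when at least a fixed interval has elapsed since the last flush. throttled_flush_due() is a hypothetical helper written for this note. The real code used TimestampDifferenceExceeds() with WALSENDER_STATS_FLUSH_INTERVAL and called the IO flush routines directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true, and records now_ms as the last flush time, when at least
 * interval_ms has elapsed since *last_flush_ms.  Otherwise leaves the
 * state untouched and returns false.
 */
bool
throttled_flush_due(int64_t *last_flush_ms, int64_t now_ms, int64_t interval_ms)
{
	assert(interval_ms > 0);

	if (now_ms - *last_flush_ms < interval_ms)
		return false;

	*last_flush_ms = now_ms;
	return true;
}
```

Starting from last_flush = 0, as the removed code did, the first call with any realistic timestamp reports a flush as due. After this patch the walsender leaves that job to the generic anytime-flush timeout, which is why the tests touched here lower stats_flush_interval to 1s.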
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index aecc7a127e6..edf5ac65660 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -571,16 +571,6 @@ WalReceiverMain(const void *startup_data, size_t startup_data_len)
*/
bool requestReply = false;
- /*
- * Report pending statistics to the cumulative stats
- * system. This location is useful for the report as it
- * is not within a tight loop in the WAL receiver, to
- * avoid bloating pgstats with requests, while also making
- * sure that the reports happen each time a status update
- * is sent.
- */
- pgstat_report_wal(false);
-
/*
* Check if time since last receive from primary has
* reached the configured limit.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a7214d0dc6f..9a136e35b48 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -94,14 +94,10 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_lsn.h"
-#include "utils/pgstat_internal.h"
#include "utils/ps_status.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
-/* Minimum interval used by walsender for stats flushes, in ms */
-#define WALSENDER_STATS_FLUSH_INTERVAL 1000
-
/*
* Maximum data payload in a WAL data message. Must be >= XLOG_BLCKSZ.
*
@@ -1846,7 +1842,6 @@ WalSndWaitForWal(XLogRecPtr loc)
int wakeEvents;
uint32 wait_event = 0;
static XLogRecPtr RecentFlushPtr = InvalidXLogRecPtr;
- TimestampTz last_flush = 0;
/*
* Fast path to avoid acquiring the spinlock in case we already know we
@@ -1867,7 +1862,6 @@ WalSndWaitForWal(XLogRecPtr loc)
{
bool wait_for_standby_at_stop = false;
long sleeptime;
- TimestampTz now;
/* Clear any already-pending wakeups */
ResetLatch(MyLatch);
@@ -1973,8 +1967,7 @@ WalSndWaitForWal(XLogRecPtr loc)
* new WAL to be generated. (But if we have nothing to send, we don't
* want to wake on socket-writable.)
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
wakeEvents = WL_SOCKET_READABLE;
@@ -1983,15 +1976,6 @@ WalSndWaitForWal(XLogRecPtr loc)
Assert(wait_event != 0);
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
WalSndWait(wakeEvents, sleeptime, wait_event);
}
@@ -2894,8 +2878,6 @@ WalSndCheckTimeOut(void)
static void
WalSndLoop(WalSndSendDataCallback send_data)
{
- TimestampTz last_flush = 0;
-
/*
* Initialize the last reply timestamp. That enables timeout processing
* from hereon.
@@ -2985,9 +2967,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
* WalSndWaitForWal() handle any other blocking; idle receivers need
* its additional actions. For physical replication, also block if
* caught up; its send_data does not block.
- *
- * The IO statistics are reported in WalSndWaitForWal() for the
- * logical WAL senders.
*/
if ((WalSndCaughtUp && send_data != XLogSendLogical &&
!streamingDoneSending) ||
@@ -2995,7 +2974,6 @@ WalSndLoop(WalSndSendDataCallback send_data)
{
long sleeptime;
int wakeEvents;
- TimestampTz now;
if (!streamingDoneReceiving)
wakeEvents = WL_SOCKET_READABLE;
@@ -3006,21 +2984,11 @@ WalSndLoop(WalSndSendDataCallback send_data)
* Use fresh timestamp, not last_processing, to reduce the chance
* of reaching wal_sender_timeout before sending a keepalive.
*/
- now = GetCurrentTimestamp();
- sleeptime = WalSndComputeSleeptime(now);
+ sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
if (pq_is_send_pending())
wakeEvents |= WL_SOCKET_WRITEABLE;
- /* Report IO statistics, if needed */
- if (TimestampDifferenceExceeds(last_flush, now,
- WALSENDER_STATS_FLUSH_INTERVAL))
- {
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
- last_flush = now;
- }
-
/* Sleep until something happens or we time out */
WalSndWait(wakeEvents, sleeptime, WAIT_EVENT_WAL_SENDER_MAIN);
}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 04d21483d93..ae2952cae89 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -260,15 +260,6 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
}
pgstat_unlock_entry(entry_ref);
-
- /*
- * Flush IO statistics now. pgstat_report_stat() will flush IO stats,
- * however this will not be called until after an entire autovacuum cycle
- * is done -- which will likely vacuum many relations -- or until the
- * VACUUM command has processed all tables and committed.
- */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -360,10 +351,6 @@ pgstat_report_analyze(Relation rel,
}
pgstat_unlock_entry(entry_ref);
-
- /* see pgstat_report_vacuum() */
- pgstat_flush_io(false, true);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index e9ac67813c7..cfa095ff0a8 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -15,6 +15,7 @@ my $node_primary = PostgreSQL::Test::Cluster->new('primary');
$node_primary->init(
allows_streaming => 1,
auth_extra => [ '--create-role' => 'repl_role' ]);
+$node_primary->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_primary->start;
my $backup_name = 'my_backup';
diff --git a/src/test/subscription/t/001_rep_changes.pl b/src/test/subscription/t/001_rep_changes.pl
index 7d41715ed81..29bae5e1121 100644
--- a/src/test/subscription/t/001_rep_changes.pl
+++ b/src/test/subscription/t/001_rep_changes.pl
@@ -11,6 +11,7 @@ use Test::More;
# Initialize publisher node
my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf', "stats_flush_interval = '1s'");
$node_publisher->start;
# Create subscriber node
--
2.34.1
[text/x-diff] v11-0005-Change-RELATION-and-DATABASE-stats-to-anytime-fl.patch (34.2K, 6-v11-0005-Change-RELATION-and-DATABASE-stats-to-anytime-fl.patch)
From 543cade27fcc4913f53fbc8df5df87936f7d8294 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 19 Jan 2026 06:27:55 +0000
Subject: [PATCH v11 5/5] Change RELATION and DATABASE stats to anytime flush
This commit allows mixing fields with different transaction behavior within
the same RELATION or DATABASE statistics kind: some fields are transactional
(e.g., tuple inserts/updates/deletes) while others are non-transactional
(e.g., sequential scans, blocks read).
It modifies the relation flush callback to handle the anytime_only parameter
introduced in commit <nnnn>.
Implementation details:
- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Change DATABASE from FLUSH_AT_TXN_BOUNDARY to FLUSH_ANYTIME
- Add an is_partial parameter to flush_pending_cb() to be able to distinguish
partial flushes in pgstat_flush_pending_entries()
- Modify pgstat_relation_flush_cb() to handle the anytime_only parameter: when
true, flush only the non-transactional stats; when false, flush all stats.
When true, it also clears the flushed fields from the pending stats to
prevent double-counting at the transaction boundary
DATABASE stats inherit the anytime flush behavior so that relation-derived
stats (tuples_returned, tuples_fetched, blocks_fetched, blocks_hit) are
visible while transactions are in progress.
Tests are added to verify the anytime flush behavior for mixed fields.
---
doc/src/sgml/monitoring.sgml | 37 ++++++-
src/backend/utils/activity/pgstat.c | 15 +--
src/backend/utils/activity/pgstat_database.c | 6 +-
src/backend/utils/activity/pgstat_function.c | 6 +-
src/backend/utils/activity/pgstat_relation.c | 92 ++++++++++++----
.../utils/activity/pgstat_subscription.c | 6 +-
src/include/pgstat.h | 27 ++++-
src/include/utils/pgstat_internal.h | 16 ++-
src/test/isolation/expected/stats.out | 102 ++++++++++++++++++
src/test/isolation/expected/stats_1.out | 102 ++++++++++++++++++
src/test/isolation/specs/stats.spec | 27 ++++-
.../test_custom_stats/test_custom_var_stats.c | 9 +-
12 files changed, 404 insertions(+), 41 deletions(-)
11.7% doc/src/sgml/
26.8% src/backend/utils/activity/
4.2% src/include/utils/
5.4% src/include/
45.1% src/test/isolation/expected/
4.7% src/test/isolation/specs/
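
The partial-flush idea described above can be sketched in isolation as follows. This is a simplified standalone model, not the patch's code: Counts and flush_counts() are hypothetical names, only three counters are modelled, and locking is left out. A full flush here zeroes the pending counters, whereas the patch deletes the pending entry instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of per-relation counters. */
typedef struct Counts
{
	long		numscans;			/* non-transactional */
	long		blocks_hit;			/* non-transactional */
	long		tuples_inserted;	/* transactional */
} Counts;

/*
 * Add pending counters to the shared ones.  With anytime_only set, only the
 * non-transactional fields are moved and *is_partial is set so the caller
 * keeps the pending entry around for the transaction-boundary flush.
 */
bool
flush_counts(Counts *shared, Counts *pending, bool anytime_only, bool *is_partial)
{
	assert(is_partial != NULL);
	*is_partial = anytime_only;

	shared->numscans += pending->numscans;
	shared->blocks_hit += pending->blocks_hit;

	/* Clear what was flushed so a later full flush does not count it twice. */
	pending->numscans = 0;
	pending->blocks_hit = 0;

	if (anytime_only)
		return true;

	shared->tuples_inserted += pending->tuples_inserted;
	pending->tuples_inserted = 0;
	return true;
}
```

A partial flush followed by a full flush should therefore leave each counter added exactly once, which is the property the isolation tests in this patch appear to be checking for the real relation and database stats.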
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..f2321b631b0 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3767,6 +3767,19 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ Some statistics are updated while a transaction is in progress (for example,
+ <structfield>blks_read</structfield>, <structfield>blks_hit</structfield>,
+ <structfield>tup_returned</structfield> and <structfield>tup_fetched</structfield>).
+ Statistics that either do not depend on transactions or require transactional
+ consistency are updated only when the transaction ends. Statistics that require
+ transactional consistency include <structfield>xact_commit</structfield>,
+ <structfield>xact_rollback</structfield>, <structfield>tup_inserted</structfield>,
+ <structfield>tup_updated</structfield> and <structfield>tup_deleted</structfield>.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-database-conflicts-view">
@@ -3956,8 +3969,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_seq_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last sequential scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last sequential scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -3984,8 +3997,8 @@ description | Waiting for a newly initialized WAL file to reach durable storage
<structfield>last_idx_scan</structfield> <type>timestamp with time zone</type>
</para>
<para>
- The time of the last index scan on this table, based on the
- most recent transaction stop time
+ The approximate time of the last index scan on this table, updated
+ at least every <varname>stats_flush_interval</varname>
</para></entry>
</row>
@@ -4223,6 +4236,15 @@ description | Waiting for a newly initialized WAL file to reach durable storage
</tgroup>
</table>
+ <note>
+ <para>
+ The <structfield>seq_scan</structfield>, <structfield>last_seq_scan</structfield>,
+ <structfield>seq_tup_read</structfield>, <structfield>idx_scan</structfield>,
+ <structfield>last_idx_scan</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
+
</sect2>
<sect2 id="monitoring-pg-stat-all-indexes-view">
@@ -4404,6 +4426,13 @@ description | Waiting for a newly initialized WAL file to reach durable storage
tuples (see <xref linkend="indexes-multicolumn"/>).
</para>
</note>
+ <note>
+ <para>
+ The <structfield>idx_scan</structfield>, <structfield>last_idx_scan</structfield>,
+ <structfield>idx_tup_read</structfield> and <structfield>idx_tup_fetch</structfield>
+ fields are updated while transactions are in progress.
+ </para>
+ </note>
<tip>
<para>
<command>EXPLAIN ANALYZE</command> outputs the total number of index
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index fd6ab0db16f..a8a905640d0 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -298,7 +298,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -316,7 +316,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
- .flush_mode = FLUSH_AT_TXN_BOUNDARY,
+ .flush_mode = FLUSH_ANYTIME,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -1354,7 +1354,8 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
*
- * If anytime_only is true, only flushes FLUSH_ANYTIME entries.
+ * If anytime_only is true, only flushes FLUSH_ANYTIME entries. For entries
+ * that support it, the callback may flush only non-transactional fields.
* This is safe to call inside transactions.
*
* If anytime_only is false, flushes all entries.
@@ -1385,6 +1386,7 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
PgStat_Kind kind = key.kind;
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
bool did_flush;
+ bool is_partial_flush = false;
dlist_node *next;
Assert(!kind_info->fixed_amount);
@@ -1405,7 +1407,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
}
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait, anytime_only);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait,
+ anytime_only, &is_partial_flush);
Assert(did_flush || nowait);
@@ -1415,8 +1418,8 @@ pgstat_flush_pending_entries(bool nowait, bool anytime_only)
else
next = NULL;
- /* if successfully flushed, remove entry */
- if (did_flush)
+ /* if successful non-partial flush, remove entry */
+ if (did_flush && !is_partial_flush)
pgstat_delete_pending_entry(entry_ref);
else
have_pending = true;
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 8e86df60461..59dd0790fd7 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,8 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
@@ -443,6 +444,9 @@ pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
pendingent = (PgStat_StatDBEntry *) entry_ref->pending;
sharedent = (PgStatShared_Database *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index 5ba4958382f..44193c93fc7 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,7 +190,8 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
@@ -200,6 +201,9 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index ae2952cae89..62363dacfe1 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,7 +47,19 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry,
+ PgStat_TableCounts *counts, bool anytime_only);
+/*
+ * Update database statistics with non-transactional stats.
+ */
+#define UPDATE_DATABASE_ANYTIME_STATS(dbentry, counts) \
+ do { \
+ (dbentry)->tuples_returned += (counts)->tuples_returned; \
+ (dbentry)->tuples_fetched += (counts)->tuples_fetched; \
+ (dbentry)->blocks_fetched += (counts)->blocks_fetched; \
+ (dbentry)->blocks_hit += (counts)->blocks_hit; \
+ } while (0)
/*
* Copy stats between relations. This is used for things like REINDEX
@@ -789,6 +801,29 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
rec->tuples_inserted + rec->tuples_updated;
}
+/*
+ * Helper function to flush non-transactional statistics.
+ */
+static void
+flush_relation_anytime_stats(PgStat_StatTabEntry *tabentry, PgStat_TableCounts *counts,
+ bool anytime_only)
+{
+ TimestampTz t;
+
+ tabentry->numscans += counts->numscans;
+ if (counts->numscans)
+ {
+ t = anytime_only ? GetCurrentTimestamp() : GetCurrentTransactionStopTimestamp();
+ if (t > tabentry->lastscan)
+ tabentry->lastscan = t;
+ }
+
+ tabentry->tuples_returned += counts->tuples_returned;
+ tabentry->tuples_fetched += counts->tuples_fetched;
+ tabentry->blocks_fetched += counts->blocks_fetched;
+ tabentry->blocks_hit += counts->blocks_hit;
+}
+
/*
* Flush out pending stats for the entry
*
@@ -797,9 +832,17 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
*
* Some of the stats are copied to the corresponding pending database stats
* entry when successfully flushing.
+ *
+ * If anytime_only is true, only non-transactional fields are flushed
+ * (numscans, tuples_returned, tuples_fetched, blocks_fetched, blocks_hit).
+ * Transactional fields remain pending until transaction boundary.
+ *
+ * Some of the stats are copied to the corresponding pending database stats
+ * entry when successfully flushing.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -807,12 +850,13 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
PgStat_StatTabEntry *tabentry; /* table entry of shared stats */
PgStat_StatDBEntry *dbentry; /* pending database entry */
- Assert(!anytime_only);
-
dboid = entry_ref->shared_entry->key.dboid;
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+ /* this is a partial flush if in anytime only mode */
+ *is_partial = anytime_only;
+
/*
* Ignore entries that didn't accumulate any actual counts, such as
* indexes that were opened by the planner but not used.
@@ -824,19 +868,36 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
- /* add the values to the shared entry. */
tabentry = &shtabstats->stats;
- tabentry->numscans += lstats->counts.numscans;
- if (lstats->counts.numscans)
+ if (anytime_only)
{
- TimestampTz t = GetCurrentTransactionStopTimestamp();
- if (t > tabentry->lastscan)
- tabentry->lastscan = t;
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, true);
+
+ pgstat_unlock_entry(entry_ref);
+
+ /* Also update the corresponding fields in database stats */
+ dbentry = pgstat_prep_database_pending(dboid);
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
+ /*
+ * Clear the flushed fields from pending stats to prevent
+ * double-counting when we flush all fields at transaction boundary.
+ */
+ lstats->counts.numscans = 0;
+ lstats->counts.tuples_returned = 0;
+ lstats->counts.tuples_fetched = 0;
+ lstats->counts.blocks_fetched = 0;
+ lstats->counts.blocks_hit = 0;
+
+ return true;
}
- tabentry->tuples_returned += lstats->counts.tuples_returned;
- tabentry->tuples_fetched += lstats->counts.tuples_fetched;
+
+ /* Flush non-transactional statistics */
+ flush_relation_anytime_stats(tabentry, &lstats->counts, false);
+
tabentry->tuples_inserted += lstats->counts.tuples_inserted;
tabentry->tuples_updated += lstats->counts.tuples_updated;
tabentry->tuples_deleted += lstats->counts.tuples_deleted;
@@ -866,9 +927,6 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
*/
tabentry->ins_since_vacuum += lstats->counts.tuples_inserted;
- tabentry->blocks_fetched += lstats->counts.blocks_fetched;
- tabentry->blocks_hit += lstats->counts.blocks_hit;
-
/* Clamp live_tuples in case of negative delta_live_tuples */
tabentry->live_tuples = Max(tabentry->live_tuples, 0);
/* Likewise for dead_tuples */
@@ -878,13 +936,11 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_o
/* The entry was successfully flushed, add the same to database stats */
dbentry = pgstat_prep_database_pending(dboid);
- dbentry->tuples_returned += lstats->counts.tuples_returned;
- dbentry->tuples_fetched += lstats->counts.tuples_fetched;
+ UPDATE_DATABASE_ANYTIME_STATS(dbentry, &lstats->counts);
+
dbentry->tuples_inserted += lstats->counts.tuples_inserted;
dbentry->tuples_updated += lstats->counts.tuples_updated;
dbentry->tuples_deleted += lstats->counts.tuples_deleted;
- dbentry->blocks_fetched += lstats->counts.blocks_fetched;
- dbentry->blocks_hit += lstats->counts.blocks_hit;
return true;
}
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 6b6eec7578d..bb32782a9d3 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -117,7 +117,8 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
@@ -127,6 +128,9 @@ pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anyti
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 90237c70829..d26ff26e3e3 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -20,6 +20,7 @@
#include "utils/backend_status.h" /* for backward compatibility */ /* IWYU pragma: export */
#include "utils/pgstat_kind.h"
#include "utils/relcache.h"
+#include "utils/timeout.h"
#include "utils/wait_event.h" /* for backward compatibility */ /* IWYU pragma: export */
@@ -536,10 +537,11 @@ extern void pgstat_report_anytime_stat(bool force);
extern void pgstat_force_next_flush(void);
/*
- * Schedule the next anytime stats update timeout.
+ * Schedule the next anytime stats update timeout and mark that we have
+ * mixed anytime stats pending.
*
* This should be called whenever accumulating statistics that support
- * FLUSH_ANYTIME flushing mode.
+ * FLUSH_ANYTIME or FLUSH_MIXED flushing modes.
*/
#define pgstat_schedule_anytime_update() \
do { \
@@ -705,37 +707,58 @@ extern void pgstat_report_analyze(Relation rel,
#define pgstat_count_heap_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_getnext(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_heap_fetch(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_scan(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.numscans++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_index_tuples(rel, n) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.tuples_returned += (n); \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_read(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_fetched++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
#define pgstat_count_buffer_hit(rel) \
do { \
if (pgstat_should_count_relation(rel)) \
+ { \
(rel)->pgstat_info->counts.blocks_hit++; \
+ pgstat_schedule_anytime_update(); \
+ } \
} while (0)
extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 607f4255268..1a2114aad8a 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -322,8 +322,10 @@ typedef struct PgStat_KindInfo
* that cannot use PgStat_EntryRef->pending.
*
* The anytime_only parameter indicates whether this is an anytime flush.
+ * The is_partial parameter indicates whether this is a partial flush.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait, bool anytime_only);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait,
+ bool anytime_only, bool *is_partial);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -757,7 +759,8 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -765,7 +768,8 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -790,7 +794,8 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -858,7 +863,8 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/test/isolation/expected/stats.out b/src/test/isolation/expected/stats.out
index cfad309ccf3..11e3e57806d 100644
--- a/src/test/isolation/expected/stats.out
+++ b/src/test/isolation/expected/stats.out
@@ -2245,6 +2245,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/expected/stats_1.out b/src/test/isolation/expected/stats_1.out
index e1d937784cb..aef582e7582 100644
--- a/src/test/isolation/expected/stats_1.out
+++ b/src/test/isolation/expected/stats_1.out
@@ -2253,6 +2253,108 @@ seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum
(1 row)
+starting permutation: s2_begin s2_table_select s1_sleep s1_table_stats s2_track_counts_off s2_table_select s1_sleep s1_table_stats s2_track_counts_on s2_table_select s1_sleep s1_table_stats s2_table_drop s2_commit
+pg_stat_force_next_flush
+------------------------
+
+(1 row)
+
+step s2_begin: BEGIN;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_off: SET track_counts = off;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 1| 1| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_track_counts_on: SET track_counts = on;
+step s2_table_select: SELECT * FROM test_stat_tab ORDER BY key, value;
+key|value
+---+-----
+k0 | 1
+(1 row)
+
+step s1_sleep: SELECT pg_sleep(1.5);
+pg_sleep
+--------
+
+(1 row)
+
+step s1_table_stats:
+ SELECT
+ pg_stat_get_numscans(tso.oid) AS seq_scan,
+ pg_stat_get_tuples_returned(tso.oid) AS seq_tup_read,
+ pg_stat_get_tuples_inserted(tso.oid) AS n_tup_ins,
+ pg_stat_get_tuples_updated(tso.oid) AS n_tup_upd,
+ pg_stat_get_tuples_deleted(tso.oid) AS n_tup_del,
+ pg_stat_get_live_tuples(tso.oid) AS n_live_tup,
+ pg_stat_get_dead_tuples(tso.oid) AS n_dead_tup,
+ pg_stat_get_vacuum_count(tso.oid) AS vacuum_count
+ FROM test_stat_oid AS tso
+ WHERE tso.name = 'test_stat_tab'
+
+seq_scan|seq_tup_read|n_tup_ins|n_tup_upd|n_tup_del|n_live_tup|n_dead_tup|vacuum_count
+--------+------------+---------+---------+---------+----------+----------+------------
+ 2| 2| 1| 0| 0| 1| 0| 0
+(1 row)
+
+step s2_table_drop: DROP TABLE test_stat_tab;
+step s2_commit: COMMIT;
+
starting permutation: s1_track_counts_off s1_table_stats s1_track_counts_on
pg_stat_force_next_flush
------------------------
diff --git a/src/test/isolation/specs/stats.spec b/src/test/isolation/specs/stats.spec
index da16710da0f..47414eb6009 100644
--- a/src/test/isolation/specs/stats.spec
+++ b/src/test/isolation/specs/stats.spec
@@ -50,6 +50,8 @@ step s1_rollback { ROLLBACK; }
step s1_prepare_a { PREPARE TRANSACTION 'a'; }
step s1_commit_prepared_a { COMMIT PREPARED 'a'; }
step s1_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
+# Has to be greater than session 2 stats_flush_interval
+step s1_sleep { SELECT pg_sleep(1.5); }
# Function stats steps
step s1_ff { SELECT pg_stat_force_next_flush(); }
@@ -132,12 +134,16 @@ step s1_slru_check_stats {
session s2
-setup { SET stats_fetch_consistency = 'none'; }
+setup {
+ SET stats_fetch_consistency = 'none';
+ SET stats_flush_interval = '1s';
+}
step s2_begin { BEGIN; }
step s2_commit { COMMIT; }
step s2_commit_prepared_a { COMMIT PREPARED 'a'; }
step s2_rollback_prepared_a { ROLLBACK PREPARED 'a'; }
step s2_ff { SELECT pg_stat_force_next_flush(); }
+step s2_table_drop { DROP TABLE test_stat_tab; }
# Function stats steps
step s2_track_funcs_all { SET track_functions = 'all'; }
@@ -156,6 +162,8 @@ step s2_func_stats {
}
# Relation stats steps
+step s2_track_counts_on { SET track_counts = on; }
+step s2_track_counts_off { SET track_counts = off; }
step s2_table_select { SELECT * FROM test_stat_tab ORDER BY key, value; }
step s2_table_update_k1 { UPDATE test_stat_tab SET value = value + 1 WHERE key = 'k1';}
@@ -435,6 +443,23 @@ permutation
s1_table_drop
s1_table_stats
+### Check that some stats are updated (seq_scan and seq_tup_read)
+### while the transaction is still running
+permutation
+ s2_begin
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_off
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_track_counts_on
+ s2_table_select
+ s1_sleep
+ s1_table_stats
+ s2_table_drop
+ s2_commit
### Check that we don't count changes with track counts off, but allow access
### to prior stats
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index e9f1bda6b32..59f531df5f7 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -85,7 +85,8 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait, bool anytime_only);
+ bool nowait, bool anytime_only,
+ bool *is_partial);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -153,7 +154,8 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait, bool anytime_only)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool anytime_only, bool *is_partial)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
@@ -161,6 +163,9 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
--
2.34.1
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-23 23:47 ` Sami Imseih <[email protected]>
2026-02-24 01:56 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
0 siblings, 2 replies; 22+ messages in thread
From: Sami Imseih @ 2026-02-23 23:47 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
> > For variable-length statistics, perhaps we can do things a bit
> > differently than what is currently proposed. 0005 requires
> > a relation anytime stat update to call
> > pgstat_schedule_anytime_update(). This is done this way because
> > it allows long-running queries to update their stats every
> > stats_flush_interval using a timeout.
> >
> > But maybe what we should be doing for variable-numbered stats is
> > to schedule an anytime update whenever a "transaction goes idle".
>
> I think the logic for fixed stats and variable stats should be the same. If
> not we could observe discrepancies: for example a long running select could
> generate reads/hits IO visible in pg_stat_io but tuples_returned, tuples_fetched,
> blocks_fetched or blocks_hit would not be updated until the session goes idle.
After having more time to think about this, I believe it can be much simpler.
As soon as we enter an idle-in-transaction (aborted) state, we can simply
schedule an anytime update. This ensures that a flush is scheduled whenever
the fixed stats trigger one, which will likely be the most common reason
(e.g., I/O stats, WAL stats, etc.). To cover the cases where fixed stats
do not schedule a flush, we can also schedule one as soon as a transaction
goes idle.
In my mind, this makes this whole flushing scheduling behavior easy to reason
about, and if we introduce future anytime stats anywhere, we are not required
to schedule a flush for each individual field. The flush callback will of course
still need to decide what to flush anytime or at the transaction boundary.
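To make the idea concrete, here is a minimal, self-contained sketch. It is not
code from the patch set: DemoCounts, demo_on_session_idle(),
demo_process_scheduled_flush() and demo_flush() are hypothetical names, and the
split between "anytime" and transactional fields is only assumed to mirror the
relation counters discussed above. Going idle schedules one flush, and the
flush routine alone decides which fields may be pushed mid-transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-relation pending counters. */
typedef struct DemoCounts
{
	int64_t		numscans;			/* non-transactional: may flush anytime */
	int64_t		tuples_returned;	/* non-transactional: may flush anytime */
	int64_t		tuples_inserted;	/* transactional: only at xact boundary */
} DemoCounts;

/* Set when the session goes idle, consumed by the next flush attempt. */
static bool demo_anytime_flush_scheduled = false;

/* Assumed hook: called whenever the session enters an idle state. */
void
demo_on_session_idle(void)
{
	demo_anytime_flush_scheduled = true;
}

/* The flush routine decides what goes out now and what has to wait. */
void
demo_flush(DemoCounts *shared, DemoCounts *pending, bool anytime_only)
{
	assert(shared != NULL && pending != NULL);

	/* Non-transactional fields are flushed in both modes. */
	shared->numscans += pending->numscans;
	shared->tuples_returned += pending->tuples_returned;

	/* Clear what was flushed so a later full flush does not double-count. */
	pending->numscans = 0;
	pending->tuples_returned = 0;

	if (anytime_only)
		return;

	/* Transactional fields only go out at the transaction boundary. */
	shared->tuples_inserted += pending->tuples_inserted;
	pending->tuples_inserted = 0;
}

/* Run a scheduled flush, if any; returns true when a flush happened. */
bool
demo_process_scheduled_flush(DemoCounts *shared, DemoCounts *pending,
							 bool in_transaction)
{
	if (!demo_anytime_flush_scheduled)
		return false;

	demo_anytime_flush_scheduled = false;
	demo_flush(shared, pending, in_transaction);
	return true;
}
```

With this shape, individual counters never schedule anything themselves; only
the idle hook and the flush routine know about the anytime/boundary split.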
What do you think?
--
Sami Imseih
Amazon Web Services (AWS)
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-02-24 01:56 ` Michael Paquier <[email protected]>
1 sibling, 0 replies; 22+ messages in thread
From: Michael Paquier @ 2026-02-24 01:56 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Bertrand Drouvot <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Mon, Feb 23, 2026 at 05:47:22PM -0600, Sami Imseih wrote:
>> I think the logic for fixed stats and variable stats should be the same. If
>> not we could observe discrepancies: for example a long running select could
>> generate reads/hits IO visible in pg_stat_io but tuples_returned, tuples_fetched,
>> blocks_fetched or blocks_hit would not be updated until the session goes idle.
>
> After having more time to think about this, I believe it can be much simpler.
> As soon as we enter an idle-in-transaction (aborted) state, we can simply
> schedule an anytime update. This ensures that a flush is scheduled whenever
> the fixed stats trigger one, which will likely be the most common reason
> (e.g., I/O stats, WAL stats, etc.). To cover the cases where fixed stats
> do not schedule a flush, we can also schedule one as soon as a transaction
> goes idle.
>
> In my mind, this makes this whole flushing scheduling behavior easy to reason
> about, and if we introduce future anytime stats anywhere, we are not required
> to schedule a flush for each individual field. The flush callback will of course
> still need to decide what to flush anytime or at the transaction boundary.
>
> What do you think?
I cannot yet fully picture how a patch along these lines would be
shaped, but having a strategic flush of the stats when we are in an
idle-in-transaction state sounds like an interesting option here.
I think that this leans towards two initial pieces of infrastructure for
this patch set:
- The new stats kind option.
- A new pgstats API that is able to classify the flushes depending on
the property assigned to each stats kind, and make these happen on a
per-caller basis.
--
Michael
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-02-24 12:01 ` Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
1 sibling, 1 reply; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-24 12:01 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Mon, Feb 23, 2026 at 05:47:22PM -0600, Sami Imseih wrote:
> > > For variable-length statistics, perhaps we can do things a bit
> > > differently than what is currently proposed. 0005 requires
> > > a relation anytime stat update to call
> > > pgstat_schedule_anytime_update(). This is done this way because
> > > it allows long-running queries to update their stats every
> > > stats_flush_interval using a timeout.
> > >
> > > But maybe what we should be doing for variable-numbered stats is
> > > to schedule an anytime update whenever a "transaction goes idle".
> >
> > I think the logic for fixed stats and variable stats should be the same. If
> > not we could observe discrepancies: for example a long running select could
> > generate reads/hits IO visible in pg_stat_io but tuples_returned, tuples_fetched,
> > blocks_fetched or blocks_hit would not be updated until the session goes idle.
>
> After having more time to think about this, I believe it can be much simpler.
> As soon as we enter an idle-in-transaction (aborted) state, we can simply
> schedule an anytime update. This ensures that a flush is scheduled whenever
> the fixed stats trigger one, which will likely be the most common reason
> (e.g., I/O stats, WAL stats, etc.). To cover the cases where fixed stats
> do not schedule a flush, we can also schedule one as soon as a transaction
> goes idle.
>
> In my mind, this makes this whole flushing scheduling behavior easy to reason
> about, and if we introduce future anytime stats anywhere, we are not required
> to schedule a flush for each individual field. The flush callback will of course
> still need to decide what to flush anytime or at the transaction boundary.
>
> What do you think?
My understanding is that (correct me if I'm wrong):
- fixed stats would still be designed the way it is in v11
- variable stats would not need the pgstat_schedule_anytime_update() calls in
various places. The flush would be done/scheduled when the session goes idle.
Then I agree that that looks ok and that:
> This ensures that a flush is scheduled whenever
> the fixed stats trigger one, which will likely be the most common reason
> (e.g., I/O stats, WAL stats, etc.)
Though I don't think that addresses Michael's concern: "main worries are
mainly around 1), I guess, with the new SIGALRM handler requirements for all
auxiliary processes" in [1].
Regards,
[1]: https://postgr.es/m/aZznT84Ssh8PywcH%40paquier.xyz
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-03-16 06:26 ` Michael Paquier <[email protected]>
2026-03-16 09:20 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Michael Paquier @ 2026-03-16 06:26 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Tue, Feb 24, 2026 at 12:01:30PM +0000, Bertrand Drouvot wrote:
> Though I don't think that addresses Michael's concern: "main worries are
> mainly around 1), I guess, with the new SIGALRM handler requirements for all
> auxiliary processes" in [1].
FWIW, I am still concerned about that, and I have pondered what
we could do here. While reviewing the existing code, one thing that I
have noticed we could do is rely on the existing interface of
pgstat_report_stat() without changing the existing callers, and without
touching the flush callbacks at all. If we begin to require the
"force" mode when the routine is called inside a transaction block,
things seem to work pretty smoothly in combination with a stats kind
property that allows the stats data to be flushed if we are inside a
transaction while a report happens. And it is possible to enforce
checks inside pgstat_report_stat() as well.
So please find attached my shot at that:
- Introduction of a new system function called pg_stat_report(), based
on a procsignal that gives a way to signal backends for a stats
update, reusing the existing code where we only do flushes when idle
and not in a transaction.
- Property that tracks under which contexts the reports are allowed.
Here I have decided to keep things simple, as in only allowing IO and
WAL stats to be flushed if we are inside a transaction.
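As a rough illustration of how such a property can gate a report, here is a
small standalone sketch. It is not taken from the attached patch: DemoKindInfo,
its flush_in_xact field and demo_report_stat() are hypothetical names chosen
for this example, and the real pgstat_report_stat() does considerably more.
The sketch only assumes that an in-transaction report requires "force" and
that only kinds carrying the property are flushed in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a stats kind description. */
typedef struct DemoKindInfo
{
	const char *name;
	bool		flush_in_xact;	/* assumed property: flushable mid-transaction */
	int			pending;		/* locally accumulated, not yet flushed */
	int			flushed;		/* what other sessions would see */
} DemoKindInfo;

/*
 * Report pending stats.  Returns the number of kinds that were flushed.
 *
 * Inside a transaction block the caller has to pass force = true, and only
 * kinds marked with flush_in_xact are flushed; the others keep their
 * pending data until a report happens outside of a transaction.
 */
int
demo_report_stat(DemoKindInfo *kinds, size_t nkinds, bool in_xact, bool force)
{
	int			nflushed = 0;

	assert(kinds != NULL || nkinds == 0);

	/* A non-forced report inside a transaction block does nothing. */
	if (in_xact && !force)
		return 0;

	for (size_t i = 0; i < nkinds; i++)
	{
		/* Skip kinds that are not allowed to flush while in a transaction. */
		if (in_xact && !kinds[i].flush_in_xact)
			continue;

		if (kinds[i].pending == 0)
			continue;

		kinds[i].flushed += kinds[i].pending;
		kinds[i].pending = 0;
		nflushed++;
	}

	return nflushed;
}
```

In this model, marking only the IO and WAL kinds with the property means a
forced report inside a transaction leaves every other kind untouched.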
Using that, I have done a few tests with three backends:
- One with a long-running transaction.
- One that periodically triggers the reports.
- One that looks at IO and WAL stats.
And the third session is able to get refreshes for both of these stats
kinds, while the other stats remain the same.
Note that this is a WIP, which is check-world stable. One thing that
sticks in my mind now is that perhaps we should not allow the
function for auxiliary processes at all. A second thing is the
requirement of allowing partial flushes at the end of the report path,
which is OK because the variable-sized stats can have pending data.
Perhaps we should just have pgstat_flush_pending_entries() provide a
correct status in line with the property set in a stats kind when we
try a flush while in a transaction.
Thoughts or tomatoes?
--
Michael
From c5aeb083265efbd6041ed3868b669997d4430760 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Mon, 16 Mar 2026 15:12:24 +0900
Subject: [PATCH] Add support for transient stats updates
This introduces a function able to push stats updates, with a new stats
kind property to allow stats to be updated while in a transaction.
---
src/include/catalog/pg_proc.dat | 7 +++
src/include/miscadmin.h | 1 +
src/include/storage/procsignal.h | 1 +
src/include/utils/pgstat_internal.h | 16 ++++++
src/backend/storage/ipc/procsignal.c | 16 ++++++
src/backend/tcop/postgres.c | 11 +++-
src/backend/utils/activity/pgstat.c | 42 ++++++++++++--
src/backend/utils/adt/pgstatfuncs.c | 58 ++++++++++++++++++++
src/backend/utils/init/globals.c | 1 +
src/test/regress/expected/misc_functions.out | 52 ++++++++++++++++++
src/test/regress/sql/misc_functions.sql | 29 ++++++++++
doc/src/sgml/func/func-admin.sgml | 16 ++++++
12 files changed, 242 insertions(+), 8 deletions(-)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 361e2cfffebe..85869154657a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8692,6 +8692,13 @@
prosrc => 'pg_log_backend_memory_contexts',
proacl => '{POSTGRES=X}' },
+# request an update of statistics
+{ oid => '8789', descr => 'have the specified backend push a pgstats update',
+ proname => 'pg_stat_report', provolatile => 'v',
+ prorettype => 'bool', proargtypes => 'int4',
+ prosrc => 'pg_stat_report',
+ proacl => '{POSTGRES=X}' },
+
# non-persistent series generator
{ oid => '1066', descr => 'non-persistent series generator',
proname => 'generate_series', prorows => '1000',
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9b..a245cdd79d17 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -94,6 +94,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleInTransactionSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t TransactionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
+extern PGDLLIMPORT volatile sig_atomic_t ProcSignalStatsUpdatePending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a931..d2e60403ad52 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -34,6 +34,7 @@ typedef enum
PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */
PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */
PROCSIG_BARRIER, /* global barrier interrupt */
+ PROCSIG_STATS_UPDATE, /* pgstats update */
PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts */
PROCSIG_PARALLEL_APPLY_MESSAGE, /* Message from parallel apply workers */
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed5..fa4cc4fe9c60 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,17 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Contexts related to the report of the statistics, defined as properties
+ * of PgStat_KindInfo.report_context. These define when a stats report is
+ * allowed depending on the stats kind and the context where
+ * pgstat_report_stat() is called.
+ */
+
+/* report allowed while idle, outside a transaction (default) */
+#define PGSTAT_REPORT_IDLE 0x00
+/* report of stats data allowed within a transaction */
+#define PGSTAT_REPORT_TRANSACTION 0x01
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +262,11 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * Contexts allowed for the report of this stats kind data.
+ */
+ bits32 report_context;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b5..e1bff2185933 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -490,6 +490,19 @@ HandleProcSignalBarrierInterrupt(void)
/* latch will be set by procsignal_sigusr1_handler */
}
+/*
+ * Handle receipt of an interrupt indicating that a stats update has been
+ * requested. This routine only gets called when PROCSIG_STATS_UPDATE is
+ * sent.
+ */
+static void
+HandleProcSignalStatsUpdateInterrupt(void)
+{
+ InterruptPending = true;
+ ProcSignalStatsUpdatePending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Perform global barrier related interrupt checking.
*
@@ -694,6 +707,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_BARRIER))
HandleProcSignalBarrierInterrupt();
+ if (CheckProcSignal(PROCSIG_STATS_UPDATE))
+ HandleProcSignalStatsUpdateInterrupt();
+
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c41..ddd57dfea780 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3555,12 +3555,17 @@ ProcessInterrupts(void)
/*
* If there are pending stats updates and we currently are truly idle
- * (matching the conditions in PostgresMain(), report stats now.
+ * (matching the conditions in PostgresMain()), or if a stats update has
+ * been requested, report stats now.
*/
if (IdleStatsUpdateTimeoutPending &&
- DoingCommandRead && !IsTransactionOrTransactionBlock())
+ DoingCommandRead && !IsTransactionOrTransactionBlock() ||
+ ProcSignalStatsUpdatePending)
{
- IdleStatsUpdateTimeoutPending = false;
+ if (IdleStatsUpdateTimeoutPending)
+ IdleStatsUpdateTimeoutPending = false;
+ if (ProcSignalStatsUpdatePending)
+ ProcSignalStatsUpdatePending = false;
pgstat_report_stat(true);
}
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5ad..5521e96d0cae 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -436,6 +436,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .report_context = PGSTAT_REPORT_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -470,6 +471,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .report_context = PGSTAT_REPORT_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -698,8 +700,9 @@ pgstat_initialize(void)
* a timeout after which to call pgstat_report_stat(true), but are not
* required to do so.
*
- * Note that this is called only when not within a transaction, so it is fair
- * to use transaction stop time as an approximation of current time.
+ * Note that when this is called outside a transaction, we use
+ * the transaction stop time as an approximation of current time. "force"
+ * is required when this is called within a transaction.
*/
long
pgstat_report_stat(bool force)
@@ -709,9 +712,14 @@ pgstat_report_stat(bool force)
bool partial_flush;
TimestampTz now;
bool nowait;
+ bool is_xact = IsTransactionOrTransactionBlock();
pgstat_assert_is_up();
- Assert(!IsTransactionOrTransactionBlock());
+ /*
+ * "force" is required if this routine is called inside a transaction
+ * block.
+ */
+ Assert(!is_xact || force);
/* "absorb" the forced flush even if there's nothing to flush */
if (pgStatForceNextFlush)
@@ -789,6 +797,11 @@ pgstat_report_stat(bool force)
if (!kind_info->flush_static_cb)
continue;
+ /* Skip if this stats kind cannot be flushed in a transaction */
+ if (is_xact &&
+ (kind_info->report_context & PGSTAT_REPORT_TRANSACTION) == 0)
+ continue;
+
partial_flush |= kind_info->flush_static_cb(nowait);
}
}
@@ -801,8 +814,11 @@ pgstat_report_stat(bool force)
*/
if (partial_flush)
{
- /* force should have prevented us from getting here */
- Assert(!force);
+ /*
+ * force should have prevented us from getting here, unless we are
+ * inside a transaction, where partial flushes are accepted.
+ */
+ Assert(!force || is_xact);
/* remember since when stats have been pending */
if (pending_since == 0)
@@ -1351,6 +1367,7 @@ pgstat_flush_pending_entries(bool nowait)
{
bool have_pending = false;
dlist_node *cur = NULL;
+ bool is_xact = IsTransactionOrTransactionBlock();
/*
* Need to be a bit careful iterating over the list of pending entries.
@@ -1377,6 +1394,21 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip if this stats kind cannot be flushed while in a transaction */
+ if (is_xact &&
+ (kind_info->report_context & PGSTAT_REPORT_TRANSACTION) == 0)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index bad5642d9c90..b27a5eb58e18 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -28,6 +28,7 @@
#include "replication/logicallauncher.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/procsignal.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/timestamp.h"
@@ -2325,3 +2326,60 @@ pg_stat_have_stats(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(pgstat_have_entry(kind, dboid, objid));
}
+
+/*
+ * pg_stat_report
+ * Signal a backend or an auxiliary process to have it push an update
+ * of its statistics data.
+ *
+ * By default, only superusers are allowed to request a stats update, as
+ * the function's ACL in pg_proc.dat restricts EXECUTE to the bootstrap
+ * superuser (proacl => '{POSTGRES=X}'). Additional roles can be permitted
+ * with GRANT.
+ *
+ * On receipt of this signal, a backend or an auxiliary process sets the flag
+ * in the signal handler, which causes the next CHECK_FOR_INTERRUPTS()
+ * or process-specific interrupt handler to update their statistics.
+ */
+Datum
+pg_stat_report(PG_FUNCTION_ARGS)
+{
+ int pid = PG_GETARG_INT32(0);
+ PGPROC *proc;
+ ProcNumber procNumber = INVALID_PROC_NUMBER;
+
+ /*
+ * See if the process with given pid is a backend or an auxiliary process.
+ */
+ proc = BackendPidGetProc(pid);
+ if (proc == NULL)
+ proc = AuxiliaryPidGetProc(pid);
+
+ /*
+ * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+ * isn't valid; but by the time we reach kill(), a process for which we
+ * get a valid proc here might have terminated on its own. This is OK,
+ * as at shutdown processes flush their stats.
+ */
+ if (proc == NULL)
+ {
+ /*
+ * This is just a warning so a loop-through-resultset will not abort
+ * if one backend terminated on its own during the run.
+ */
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ procNumber = GetNumberFromPGProc(proc);
+ if (SendProcSignal(pid, PROCSIG_STATS_UPDATE, procNumber) < 0)
+ {
+ /* Again, just a warning to allow loops */
+ ereport(WARNING,
+ (errmsg("could not send signal to process %d: %m", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b3602..22960ee3b27b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -38,6 +38,7 @@ volatile sig_atomic_t IdleInTransactionSessionTimeoutPending = false;
volatile sig_atomic_t TransactionTimeoutPending = false;
volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
+volatile sig_atomic_t ProcSignalStatsUpdatePending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 6c03b1a79d75..68a0e1e02bcc 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -397,6 +397,58 @@ REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
FROM regress_log_memory;
DROP ROLE regress_log_memory;
--
+-- pg_stat_report
+--
+-- check execution
+SELECT pg_stat_report(pg_backend_pid());
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+SELECT pg_stat_report(pid) FROM pg_stat_activity
+ WHERE backend_type = 'checkpointer';
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+-- Check privileges
+CREATE ROLE regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- no
+ has_function_privilege
+------------------------
+ f
+(1 row)
+
+-- Fails
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+ERROR: permission denied for function pg_stat_report
+RESET ROLE;
+-- Access granted, then function works
+GRANT EXECUTE ON FUNCTION pg_stat_report(integer)
+ TO regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- yes
+ has_function_privilege
+------------------------
+ t
+(1 row)
+
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+RESET ROLE;
+REVOKE EXECUTE ON FUNCTION pg_stat_report(integer)
+ FROM regress_stat_report;
+DROP ROLE regress_stat_report;
+--
-- Test some built-in SRFs
--
-- The outputs of these are variable, so we can't just print their results
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 35b7983996c4..fe366904c9e8 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -154,6 +154,35 @@ REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
DROP ROLE regress_log_memory;
+--
+-- pg_stat_report
+--
+
+-- check execution
+SELECT pg_stat_report(pg_backend_pid());
+SELECT pg_stat_report(pid) FROM pg_stat_activity
+ WHERE backend_type = 'checkpointer';
+
+-- Check privileges
+CREATE ROLE regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- no
+-- Fails
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+RESET ROLE;
+-- Access granted, then function works
+GRANT EXECUTE ON FUNCTION pg_stat_report(integer)
+ TO regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- yes
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+RESET ROLE;
+REVOKE EXECUTE ON FUNCTION pg_stat_report(integer)
+ FROM regress_stat_report;
+DROP ROLE regress_stat_report;
+
--
-- Test some built-in SRFs
--
diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 210b1118bdf7..114202c4fc19 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -220,6 +220,22 @@
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_report</primary>
+ </indexterm>
+ <function>pg_stat_report</function> ( <parameter>pid</parameter> <type>integer</type> )
+ <returnvalue>boolean</returnvalue>
+ </para>
+ <para>
+ Requests to update the statistics computed by the backend with the
+ specified process ID. This function can send the request to
+ backends and auxiliary processes except logger, pushing an
+ update of the statistics data.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
--
2.53.0
Attachments:
[text/plain] 0001-Add-support-for-transient-stats-updates.patch (16.6K, 2-0001-Add-support-for-transient-stats-updates.patch)
download | inline diff:
From c5aeb083265efbd6041ed3868b669997d4430760 Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Mon, 16 Mar 2026 15:12:24 +0900
Subject: [PATCH] Add support for transient stats updates
This introduces a function able to push stats updates, with a new stats
kind property to allow stats to be updated while in a transaction.
---
src/include/catalog/pg_proc.dat | 7 +++
src/include/miscadmin.h | 1 +
src/include/storage/procsignal.h | 1 +
src/include/utils/pgstat_internal.h | 16 ++++++
src/backend/storage/ipc/procsignal.c | 16 ++++++
src/backend/tcop/postgres.c | 11 +++-
src/backend/utils/activity/pgstat.c | 42 ++++++++++++--
src/backend/utils/adt/pgstatfuncs.c | 58 ++++++++++++++++++++
src/backend/utils/init/globals.c | 1 +
src/test/regress/expected/misc_functions.out | 52 ++++++++++++++++++
src/test/regress/sql/misc_functions.sql | 29 ++++++++++
doc/src/sgml/func/func-admin.sgml | 16 ++++++
12 files changed, 242 insertions(+), 8 deletions(-)
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 361e2cfffebe..85869154657a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8692,6 +8692,13 @@
prosrc => 'pg_log_backend_memory_contexts',
proacl => '{POSTGRES=X}' },
+# request an update of statistics
+{ oid => '8789', descr => 'have the specified backend push a pgstats update',
+ proname => 'pg_stat_report', provolatile => 'v',
+ prorettype => 'bool', proargtypes => 'int4',
+ prosrc => 'pg_stat_report',
+ proacl => '{POSTGRES=X}' },
+
# non-persistent series generator
{ oid => '1066', descr => 'non-persistent series generator',
proname => 'generate_series', prorows => '1000',
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9b..a245cdd79d17 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -94,6 +94,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleInTransactionSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t TransactionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
+extern PGDLLIMPORT volatile sig_atomic_t ProcSignalStatsUpdatePending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 348fba53a931..d2e60403ad52 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -34,6 +34,7 @@ typedef enum
PROCSIG_PARALLEL_MESSAGE, /* message from cooperating parallel backend */
PROCSIG_WALSND_INIT_STOPPING, /* ask walsenders to prepare for shutdown */
PROCSIG_BARRIER, /* global barrier interrupt */
+ PROCSIG_STATS_UPDATE, /* pgstats update */
PROCSIG_LOG_MEMORY_CONTEXT, /* ask backend to log the memory contexts */
PROCSIG_PARALLEL_APPLY_MESSAGE, /* Message from parallel apply workers */
PROCSIG_RECOVERY_CONFLICT, /* backend is blocking recovery, check
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed5..fa4cc4fe9c60 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,17 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Contexts related to the report of the statistics, defined as properties
+ * of PgStat_KindInfo.report_context. These define when a stats report is
+ * allowed depending on the stats kind and the context where
+ * pgstat_report_stat() is called.
+ */
+
+/* report allowed while idle, outside a transaction (default) */
+#define PGSTAT_REPORT_IDLE 0x00
+/* report of stats data allowed within a transaction */
+#define PGSTAT_REPORT_TRANSACTION 0x01
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +262,11 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * Contexts allowed for the report of this stats kind data.
+ */
+ bits32 report_context;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7e017c8d53b5..e1bff2185933 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -490,6 +490,19 @@ HandleProcSignalBarrierInterrupt(void)
/* latch will be set by procsignal_sigusr1_handler */
}
+/*
+ * Handle receipt of an interrupt indicating that a stats update has been
+ * requested. This routine only gets called when PROCSIG_STATS_UPDATE is
+ * sent.
+ */
+static void
+HandleProcSignalStatsUpdateInterrupt(void)
+{
+ InterruptPending = true;
+ ProcSignalStatsUpdatePending = true;
+ /* latch will be set by procsignal_sigusr1_handler */
+}
+
/*
* Perform global barrier related interrupt checking.
*
@@ -694,6 +707,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_BARRIER))
HandleProcSignalBarrierInterrupt();
+ if (CheckProcSignal(PROCSIG_STATS_UPDATE))
+ HandleProcSignalStatsUpdateInterrupt();
+
if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT))
HandleLogMemoryContextInterrupt();
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index d01a09dd0c41..ddd57dfea780 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3555,12 +3555,17 @@ ProcessInterrupts(void)
/*
* If there are pending stats updates and we currently are truly idle
- * (matching the conditions in PostgresMain(), report stats now.
+ * (matching the conditions in PostgresMain(), or if a status update has
+ * been requested, report stats now.
*/
if (IdleStatsUpdateTimeoutPending &&
- DoingCommandRead && !IsTransactionOrTransactionBlock())
+ DoingCommandRead && !IsTransactionOrTransactionBlock() ||
+ ProcSignalStatsUpdatePending)
{
- IdleStatsUpdateTimeoutPending = false;
+ if (IdleStatsUpdateTimeoutPending)
+ IdleStatsUpdateTimeoutPending = false;
+ if (ProcSignalStatsUpdatePending)
+ ProcSignalStatsUpdatePending = false;
pgstat_report_stat(true);
}
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5ad..5521e96d0cae 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -436,6 +436,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .report_context = PGSTAT_REPORT_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -470,6 +471,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .report_context = PGSTAT_REPORT_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -698,8 +700,9 @@ pgstat_initialize(void)
* a timeout after which to call pgstat_report_stat(true), but are not
* required to do so.
*
- * Note that this is called only when not within a transaction, so it is fair
- * to use transaction stop time as an approximation of current time.
+ * Note that when this is called only when not within a transaction, we use
+ * the transaction stop time as an approximation of current time. "force"
+ * is required when this is called within a transaction.
*/
long
pgstat_report_stat(bool force)
@@ -709,9 +712,14 @@ pgstat_report_stat(bool force)
bool partial_flush;
TimestampTz now;
bool nowait;
+ bool is_xact = IsTransactionOrTransactionBlock();
pgstat_assert_is_up();
- Assert(!IsTransactionOrTransactionBlock());
+ /*
+ * "force" is required if this routine is called inside a transaction
+ * block.
+ */
+ Assert(!is_xact || force);
/* "absorb" the forced flush even if there's nothing to flush */
if (pgStatForceNextFlush)
@@ -789,6 +797,11 @@ pgstat_report_stat(bool force)
if (!kind_info->flush_static_cb)
continue;
+ /* Skip if this stats kind cannot be flushed in a transaction */
+ if (is_xact &&
+ (kind_info->report_context & PGSTAT_REPORT_TRANSACTION) == 0)
+ continue;
+
partial_flush |= kind_info->flush_static_cb(nowait);
}
}
@@ -801,8 +814,11 @@ pgstat_report_stat(bool force)
*/
if (partial_flush)
{
- /* force should have prevented us from getting here */
- Assert(!force);
+ /*
+ * force should have prevented us from getting here, and partial
+ * flushes are accepted inside a transaction.
+ */
+ Assert(!force || is_xact);
/* remember since when stats have been pending */
if (pending_since == 0)
@@ -1351,6 +1367,7 @@ pgstat_flush_pending_entries(bool nowait)
{
bool have_pending = false;
dlist_node *cur = NULL;
+ bool is_xact = IsTransactionOrTransactionBlock();
/*
* Need to be a bit careful iterating over the list of pending entries.
@@ -1377,6 +1394,21 @@ pgstat_flush_pending_entries(bool nowait)
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip if this stats kind cannot be flushed while in a transaction */
+ if (is_xact &&
+ (kind_info->report_context & PGSTAT_REPORT_TRANSACTION) == 0)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index bad5642d9c90..b27a5eb58e18 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -28,6 +28,7 @@
#include "replication/logicallauncher.h"
#include "storage/proc.h"
#include "storage/procarray.h"
+#include "storage/procsignal.h"
#include "utils/acl.h"
#include "utils/builtins.h"
#include "utils/timestamp.h"
@@ -2325,3 +2326,60 @@ pg_stat_have_stats(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(pgstat_have_entry(kind, dboid, objid));
}
+
+/*
+ * pg_stat_report
+ * Signal a backend or an auxiliary process to have it push an update
+ * of its statistics data.
+ *
+ * By default, only superusers are allowed to signal to log the memory
+ * contexts because allowing any users to issue this request at an unbounded
+ * rate would cause lots of log messages and which can lead to denial of
+ * service. Additional roles can be permitted with GRANT.
+ *
+ * On receipt of this signal, a backend or an auxiliary process sets the flag
+ * in the signal handler, which causes the next CHECK_FOR_INTERRUPTS()
+ * or process-specific interrupt handler to update their statistics.
+ */
+Datum
+pg_stat_report(PG_FUNCTION_ARGS)
+{
+ int pid = PG_GETARG_INT32(0);
+ PGPROC *proc;
+ ProcNumber procNumber = INVALID_PROC_NUMBER;
+
+ /*
+ * See if the process with given pid is a backend or an auxiliary process.
+ */
+ proc = BackendPidGetProc(pid);
+ if (proc == NULL)
+ proc = AuxiliaryPidGetProc(pid);
+
+ /*
+ * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+ * isn't valid; but by the time we reach kill(), a process for which we
+ * get a valid proc here might have terminated on its own. This is OK,
+ * as at shutdown processes flush their stats.
+ */
+ if (proc == NULL)
+ {
+ /*
+ * This is just a warning so a loop-through-resultset will not abort
+ * if one backend terminated on its own during the run.
+ */
+ ereport(WARNING,
+ (errmsg("PID %d is not a PostgreSQL server process", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ procNumber = GetNumberFromPGProc(proc);
+ if (SendProcSignal(pid, PROCSIG_STATS_UPDATE, procNumber) < 0)
+ {
+ /* Again, just a warning to allow loops */
+ ereport(WARNING,
+ (errmsg("could not send signal to process %d: %m", pid)));
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b3602..22960ee3b27b 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -38,6 +38,7 @@ volatile sig_atomic_t IdleInTransactionSessionTimeoutPending = false;
volatile sig_atomic_t TransactionTimeoutPending = false;
volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
+volatile sig_atomic_t ProcSignalStatsUpdatePending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 6c03b1a79d75..68a0e1e02bcc 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -397,6 +397,58 @@ REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
FROM regress_log_memory;
DROP ROLE regress_log_memory;
--
+-- pg_stat_report
+--
+-- check execution
+SELECT pg_stat_report(pg_backend_pid());
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+SELECT pg_stat_report(pid) FROM pg_stat_activity
+ WHERE backend_type = 'checkpointer';
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+-- Check privileges
+CREATE ROLE regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- no
+ has_function_privilege
+------------------------
+ f
+(1 row)
+
+-- Fails
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+ERROR: permission denied for function pg_stat_report
+RESET ROLE;
+-- Access granted, then function works
+GRANT EXECUTE ON FUNCTION pg_stat_report(integer)
+ TO regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- yes
+ has_function_privilege
+------------------------
+ t
+(1 row)
+
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+ pg_stat_report
+----------------
+ t
+(1 row)
+
+RESET ROLE;
+REVOKE EXECUTE ON FUNCTION pg_stat_report(integer)
+ FROM regress_stat_report;
+DROP ROLE regress_stat_report;
+--
-- Test some built-in SRFs
--
-- The outputs of these are variable, so we can't just print their results
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 35b7983996c4..fe366904c9e8 100644
--- a/src/test/regress/sql/misc_functions.sql
+++ b/src/test/regress/sql/misc_functions.sql
@@ -154,6 +154,35 @@ REVOKE EXECUTE ON FUNCTION pg_log_backend_memory_contexts(integer)
DROP ROLE regress_log_memory;
+--
+-- pg_stat_report
+--
+
+-- check execution
+SELECT pg_stat_report(pg_backend_pid());
+SELECT pg_stat_report(pid) FROM pg_stat_activity
+ WHERE backend_type = 'checkpointer';
+
+-- Check privileges
+CREATE ROLE regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- no
+-- Fails
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+RESET ROLE;
+-- Access granted, then function works
+GRANT EXECUTE ON FUNCTION pg_stat_report(integer)
+ TO regress_stat_report;
+SELECT has_function_privilege('regress_stat_report',
+ 'pg_stat_report(integer)', 'EXECUTE'); -- yes
+SET ROLE regress_stat_report;
+SELECT pg_stat_report(pg_backend_pid());
+RESET ROLE;
+REVOKE EXECUTE ON FUNCTION pg_stat_report(integer)
+ FROM regress_stat_report;
+DROP ROLE regress_stat_report;
+
--
-- Test some built-in SRFs
--
diff --git a/doc/src/sgml/func/func-admin.sgml b/doc/src/sgml/func/func-admin.sgml
index 210b1118bdf7..114202c4fc19 100644
--- a/doc/src/sgml/func/func-admin.sgml
+++ b/doc/src/sgml/func/func-admin.sgml
@@ -220,6 +220,22 @@
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_stat_report</primary>
+ </indexterm>
+ <function>pg_stat_report</function> ( <parameter>pid</parameter> <type>integer</type> )
+ <returnvalue>boolean</returnvalue>
+ </para>
+ <para>
+ Requests to update the statistics computed by the backend with the
+ specified process ID. This function can send the request to
+ backends and auxiliary processes except logger, pushing an
+ update of the statistics data.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
--
2.53.0
[application/pgp-signature] signature.asc (833B, 3-signature.asc)
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
@ 2026-03-16 09:20 ` Bertrand Drouvot <[email protected]>
2026-03-16 10:22 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Bertrand Drouvot @ 2026-03-16 09:20 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Mon, Mar 16, 2026 at 03:26:33PM +0900, Michael Paquier wrote:
> On Tue, Feb 24, 2026 at 12:01:30PM +0000, Bertrand Drouvot wrote:
> > Though I don't think that addresses Michael's concern: "main worries are
> > mainly around 1), I guess, with the new SIGALRM handler requirements for all
> > auxiliary processes" in [1].
>
> FWIW, I am still concerned about that, and I have pondered about what
> we could do here. While reviewing the existing code, one thing that I
> have noticed we could do is rely on the existing interface of
> pgstat_report_stat() without changing the existing callers, and not
> touching at all the flush callbacks. If we begin to require the
> "force" mode when the routine is called inside a transaction block,
> things seem to work pretty smoothly in combination with a stats kind
> property that allows the stats data to be flushed if we are inside a
> transaction while a report happens.
Yeah, "force" makes use of GetCurrentTimestamp() (and so we avoid a failed
assertion that we would get if using GetCurrentTransactionStopTimestamp()).
> So please find attached my shot at that:
Thanks!
> - Introduction of a new system function called pg_stat_report(), based
> on a procsignal that gives a way to signal backends for a stats
> update, reusing the existing code where we only do flushes when idle
> and not in a transaction.
> - Property that tracks under which contexts the reports are allowed.
> Here I have decided to stick with simple, as in only allowing IO and
> WAL stats to be flushed if we are inside a transaction.
>
> Using that, I have done a few tests with three backends:
> - One with a long-running transaction.
> - One that periodically triggers the reports.
> - One that looks at IO and WAL stat.
> And the third session is able to get refreshes for both of these stats
> kinds, while the other stats remain the same.
I did not look closely at the code but did some testing too. I confirm that
pg_stat_io and pg_stat_wal are updated when pg_stat_report(<backend_pid>) is
triggered. But the stats update is not visible if requested through
pg_stat_get_backend_io(<same_backend_pid>) or pg_stat_get_backend_wal(<same_backend_pid>).
I guess that PGSTAT_KIND_BACKEND should also get the PGSTAT_REPORT_TRANSACTION
report_context?
> Note that this is a WIP, which is check-world stable. One thing that
> sticks a bit in mind now is that perhaps we should not allow the
> function for auxiliary processes at all.
Why?
> A second thing is the
> requirement of allowing partial flushes at the end of the report path,
> which is OK because the variable-sized stats can have pending data.
Right.
> Perhaps we should just have pgstat_flush_pending_entries() provide a
> correct status in line with the property set in a stats kind when we
> try a flush while in a transaction.
The idea would be to avoid trying to flush stats that don't have pending
entries?
> Thoughts or tomatoes?
That looks "simpler" than the previous proposal, but who would be responsible for
calling pg_stat_report()? If that's the client's responsibility, that looks kind of weird
to me. If that's the core, how would that be scheduled? I think that the
end solution should prevent issues similar to the ones 039549d70f6 fixed, without
delegating to the client.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-03-16 09:20 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-03-16 10:22 ` Michael Paquier <[email protected]>
2026-03-16 23:42 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Michael Paquier @ 2026-03-16 10:22 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Mon, Mar 16, 2026 at 09:20:41AM +0000, Bertrand Drouvot wrote:
> I did not look closely at the code but did some testing too. I confirm that
> pg_stat_io and pg_stat_wal are updated when pg_stat_report(<backend_pid>) is
> triggered. But the stats update is not visible if requested through
> pg_stat_get_backend_io(<same_backend_pid>) or pg_stat_get_backend_wal(<same_backend_pid>).
> I guess that PGSTAT_KIND_BACKEND should also get the PGSTAT_REPORT_TRANSACTION
> report_context?
Yes, I guess that the transaction-level flag should be set as well for
the backend stats, due to the fact that WAL and IO stats have it set.
>> Note that this is a WIP, which is check-world stable. One thing that
>> sticks a bit in mind now is that perhaps we should not allow the
>> function for auxiliary processes at all.
>
> Why?
I was feeling that there would be an issue with that, and I did not
test it in detail yet. For a checkpointer with busy activity,
the IO stats could be interesting to see live.
>> Perhaps we should just have pgstat_flush_pending_entries() provide a
>> correct status in line with the property set in a stats kind when we
>> try a flush while in a transaction.
>
> The idea would be to avoid trying to flush stats that don't have pending
> entries?
I had to work around the assert at the end of pgstat_report_stat(), to
tell that under an xact the partial flushes were OK. I was wondering
about keeping the end assertion based solely on "force" intact, making
the flush routines return a status based on if we are in an xact.
> That looks "simpler" than the previous proposal, but who would be responsible for
> calling pg_stat_report()? If that's the client's responsibility, that looks kind of weird
> to me. If that's the core, how would that be scheduled? I think that the
> end solution should prevent issues similar to the ones 039549d70f6 fixed, without
> delegating to the client.
TBH, I don't like the requirement of setting SIGALRM in all the places
where we'd require it for the sake of this proposal, where
historically we have never done that, copying a mechanism that already
exists in the tree for procsigs, as the previous patch I posted proves
we could reuse. It's also not clear to me what a correct frequency of
the stat updates should be, and why it would make sense to force that
in a GUC; we want to have some information from long-running
transactions, where I fear that we will want modularity. A
user-settable GUC would fit with this picture, but the requirement of
planting the timeouts doesn't stick for me.
On top of that, I am not really convinced that there is a good reason
to remove the existing stat report calls we have already planted in
the tree for auxiliary processes, diverging from the stable branches
where these exist. With more than one release already out with them,
there is more benefit with potentially planting more strategic report
calls where they could matter (as added in v18 for WAL senders as one
example), when we find a requirement for them.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
^ permalink raw reply [nested|flat] 22+ messages in thread
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-03-16 09:20 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 10:22 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
@ 2026-03-16 23:42 ` Sami Imseih <[email protected]>
2026-03-25 03:16 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
0 siblings, 1 reply; 22+ messages in thread
From: Sami Imseih @ 2026-03-16 23:42 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Bertrand Drouvot <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
I have spent some time thinking about Michael's feedback and I want to
see if I can propose an alternate design that we can come to a consensus
on.
> I had to work around the assert at the end of pgstat_report_stat(), to
> tell that under an xact the partial flushes were OK. I was wondering
> about keeping the end assertion based solely on "force" intact, making
> the flush routines return a status based on if we are in an xact.
>
> > That looks "simpler" than the previous proposal, but who would be responsible for
> > calling pg_stat_report()? If that's the client's responsibility, that looks kind of weird
> > to me. If that's the core, how would that be scheduled? I think that the
> > end solution should prevent issues similar to the ones 039549d70f6 fixed, without
> > delegating to the client.
>
> TBH, I don't like the requirement of setting SIGALRM in all the places
> where we'd require it for the sake of this proposal, where
> historically we have never done that, copying a mechanism that already
> exists in the tree for procsigs, as the previous patch I posted proves
> we could reuse.
How about we add a timeout when a transaction goes idle?
Currently we have IDLE_STATS_UPDATE_TIMEOUT, which is the
timeout interval used when we have done more than one flush
within a second. This only works when the session is idle
(not in a transaction), and flushes everything.
However, as I mentioned earlier in this thread, we can also schedule
a flush that occurs when a transaction goes idle. This means when
a transaction is started (BEGIN/START TRANSACTION), the subsequent
idle-in-transaction state can schedule a timeout of 10 seconds based on
a new #define called PGSTAT_IDLE_TXN_INTERVAL.
The timeout is not a fixed repeating interval. It is set
on-demand each time the transaction enters idle-in-transaction
state: first after the BEGIN, then again after each subsequent
command completes. This is a trade-off: if a single command
runs for a long time, no flush happens until it finishes and
the transaction goes idle again.
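
The on-demand behaviour described above can be modelled with a small,
self-contained C sketch. This is not PostgreSQL code: IdleTxnTimer,
enter_idle_in_txn(), timer_expired() and leave_txn() are hypothetical names,
and time is a plain millisecond counter passed in by the caller. Only the
10-second interval and the name PGSTAT_IDLE_TXN_INTERVAL come from the
proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSTAT_IDLE_TXN_INTERVAL 10000	/* milliseconds, per the proposal */

/* Hypothetical stand-in for a one-shot timeout; not a PostgreSQL type. */
typedef struct IdleTxnTimer
{
	bool		active;
	int64_t		fire_at_ms;
} IdleTxnTimer;

/*
 * Called each time the transaction becomes idle (after BEGIN, and after
 * every later command). The timer is armed only if it is not already armed.
 */
static void
enter_idle_in_txn(IdleTxnTimer *t, int64_t now_ms)
{
	assert(t != NULL);
	if (!t->active)
	{
		t->active = true;
		t->fire_at_ms = now_ms + PGSTAT_IDLE_TXN_INTERVAL;
	}
}

/*
 * Returns true if the timer has expired by now_ms. The timer is one-shot:
 * it is not re-armed here, only by the next enter_idle_in_txn() call.
 */
static bool
timer_expired(IdleTxnTimer *t, int64_t now_ms)
{
	assert(t != NULL);
	if (t->active && now_ms >= t->fire_at_ms)
	{
		t->active = false;
		return true;
	}
	return false;
}

/* The transaction ended: the in-transaction timer is no longer needed. */
static void
leave_txn(IdleTxnTimer *t)
{
	assert(t != NULL);
	t->active = false;
}
```

In this model, once the timer has fired, nothing fires again until the
transaction next goes idle, which is the trade-off mentioned above for a
single long-running command.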
So, we still introduce a new timeout in this mechanism, but
it does not need to be registered all over the place.
So with this all the "safe to be flushed" stats will be flushed
mid-transaction at some point, and all stats will be flushed
at the end of a transaction.
With that said, the only way I can conceive of getting away from
a new timeout in this design is to track a timestamp whenever
we go into idle-in-transaction, but I am not too comfortable adding a
GetCurrentTimestamp() there.
-> if no action occurs at this point, we don't set another timeout
> It's also not clear to me what a correct frequency of
> the stat updates should be, and why it would make sense to force that
> in a GUC;
No GUC will be needed in what I am proposing. A 10s flush interval
should be sufficient.
> On top of that, I am not really convinced that there is a good reason
> to remove the existing stat report calls we have already planted in
> the tree for auxiliary processes, diverging from the stable branches
> where these exist.
The design being proposed keeps all the existing stat report calls intact.
So attached is a new proposal with tests and docs. In terms of
tests, I fell back to the strategy used by Bertrand [0] with the
INJECTION_POINT trick to reduce the flush interval. It does mean
we keep a pg_sleep in the test, but I think we should not try
to do anything different here.
Also, with this design we can call the flush modes FLUSH_IN_TRANSACTION
and FLUSH_AT_TXN_BOUNDARY. As in the earlier proposals, the
callbacks will still need to decide which stats to flush based on
whether they are called in-transaction or at a boundary.
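
To make the two flush modes concrete, here is a minimal, self-contained C
sketch of the filtering idea. It is a toy model and not the patch itself:
StatsKind, its pending/shared counters and flush_kinds() are hypothetical.
Only the names FLUSH_IN_TRANSACTION and FLUSH_AT_TXN_BOUNDARY, and the idea
of an "in_txn_only" flag, are taken from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum FlushMode
{
	FLUSH_IN_TRANSACTION,		/* safe to flush mid-transaction */
	FLUSH_AT_TXN_BOUNDARY		/* flush only at a transaction boundary */
} FlushMode;

/* Hypothetical, simplified stats kind: one pending and one shared counter. */
typedef struct StatsKind
{
	const char *name;
	FlushMode	flush_mode;
	long		pending;		/* accumulated locally, not yet visible */
	long		shared;			/* visible to other sessions */
} StatsKind;

/*
 * Flush pending counters into the shared ones. When in_txn_only is true,
 * kinds marked FLUSH_AT_TXN_BOUNDARY are skipped and stay pending.
 * Returns the number of kinds that still have pending data afterwards.
 */
static size_t
flush_kinds(StatsKind *kinds, size_t nkinds, bool in_txn_only)
{
	size_t		still_pending = 0;

	assert(kinds != NULL || nkinds == 0);

	for (size_t i = 0; i < nkinds; i++)
	{
		StatsKind  *k = &kinds[i];

		if (in_txn_only && k->flush_mode == FLUSH_AT_TXN_BOUNDARY)
		{
			if (k->pending != 0)
				still_pending++;
			continue;
		}

		k->shared += k->pending;
		k->pending = 0;
	}

	return still_pending;
}
```

Calling flush_kinds() with in_txn_only set to true corresponds to the
mid-transaction flush, and calling it with false corresponds to the flush
at the end of the transaction, where everything is pushed out.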
[0] https://www.postgresql.org/message-id/aZbDYMrOkeCyIubO%40ip-10-97-1-34.eu-west-3.compute.internal
--
Sami Imseih
Amazon Web Services (AWS)
Attachments:
[application/octet-stream] v12-0001-Add-periodic-in-transaction-stats-flushing.patch (54.2K, 2-v12-0001-Add-periodic-in-transaction-stats-flushing.patch)
download | inline diff:
From 5d75d93701bd268e7689791042405450297fede9 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 5 Jan 2026 09:41:39 +0000
Subject: [PATCH v12 1/1] Add periodic in-transaction stats flushing
Some statistics, such as relation scan counts, blocks fetched/hit,
WAL, IO, and SLRU stats, can safely be flushed to shared memory
mid-transaction. Others, like tuple insert/update/delete counts,
should only be flushed at transaction boundaries.
Introduce a flush_mode field (FLUSH_IN_TRANSACTION and
FLUSH_AT_TXN_BOUNDARY) for each stats kind. FLUSH_IN_TRANSACTION will
be flushed every PGSTAT_IDLE_TXN_INTERVAL (10s) while idle in transaction.
The timeout is set each time the transaction goes idle between
commands, not on a fixed repeating interval. It is disabled when
the session goes idle after the transaction ends.
---
doc/src/sgml/monitoring.sgml | 28 ++++-
src/backend/replication/walsender.c | 8 +-
src/backend/tcop/postgres.c | 43 +++++++
src/backend/utils/activity/pgstat.c | 109 ++++++++++++++---
src/backend/utils/activity/pgstat_backend.c | 6 +-
src/backend/utils/activity/pgstat_bgwriter.c | 2 +-
.../utils/activity/pgstat_checkpointer.c | 2 +-
src/backend/utils/activity/pgstat_database.c | 6 +-
src/backend/utils/activity/pgstat_function.c | 8 +-
src/backend/utils/activity/pgstat_io.c | 6 +-
src/backend/utils/activity/pgstat_relation.c | 98 ++++++++++++---
src/backend/utils/activity/pgstat_slru.c | 2 +-
.../utils/activity/pgstat_subscription.c | 8 +-
src/backend/utils/activity/pgstat_wal.c | 10 +-
src/backend/utils/init/globals.c | 1 +
src/backend/utils/init/postinit.c | 12 ++
src/include/miscadmin.h | 1 +
src/include/pgstat.h | 8 ++
src/include/utils/pgstat_internal.h | 58 +++++++--
src/include/utils/timeout.h | 1 +
.../test_custom_stats/t/001_custom_stats.pl | 59 +++++++++
.../test_custom_var_stats--1.0.sql | 10 ++
.../test_custom_stats/test_custom_var_stats.c | 114 +++++++++++++++++-
src/test/modules/test_misc/meson.build | 1 +
.../test_misc/t/011_in_transaction_stats.pl | 91 ++++++++++++++
src/tools/pgindent/typedefs.list | 1 +
26 files changed, 611 insertions(+), 82 deletions(-)
create mode 100644 src/test/modules/test_misc/t/011_in_transaction_stats.pl
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 462019a972c..dcdeb9aabec 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -235,15 +235,31 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
When using the cumulative statistics views and functions to monitor
collected data, it is important to realize that the information does not
update instantaneously. Each individual server process flushes out
- accumulated statistics to shared memory just before going idle, but not
- more frequently than once per <varname>PGSTAT_MIN_INTERVAL</varname>
- milliseconds (1 second unless altered while building the server); so a
- query or transaction still in progress does not affect the displayed totals
- and the displayed information lags behind actual activity. However,
- current-query information collected by <varname>track_activities</varname>
+ accumulated statistics to shared memory just before going idle or
+ periodically during transactions, depending on whether the statistics
+ should only be flushed at a transaction boundary, as described below.
+ Current-query information collected by <varname>track_activities</varname>
is always up-to-date.
</para>
+ <para>
+ When a server process is about to become idle, it flushes all pending
+ statistics to shared memory, but not more frequently than once per
+ <varname>PGSTAT_MIN_INTERVAL</varname> milliseconds (1 second unless
+ altered while building the server).
+ </para>
+
+ <para>
+ During explicit transaction blocks, statistics that can be flushed at any
+ time, such as relation scan counts, tuples returned and fetched, blocks
+ fetched and hit, as well as IO, WAL, SLRU, and backend statistics, are
+ periodically flushed to shared memory every
+ <varname>PGSTAT_IDLE_TXN_INTERVAL</varname> milliseconds (10 seconds
+ unless altered while building the server). This periodic flushing occurs
+ when the transaction goes idle between commands. It does not apply to
+ single statements running outside an explicit transaction block.
+ </para>
+
<para>
Another important point is that when a server process is asked to display
any of the accumulated statistics, accessed values are cached until the end
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 376ff46340d..24124d478e8 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1993,8 +1993,8 @@ WalSndWaitForWal(XLogRecPtr loc)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
@@ -3022,8 +3022,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
if (TimestampDifferenceExceeds(last_flush, now,
WALSENDER_STATS_FLUSH_INTERVAL))
{
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
last_flush = now;
}
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b3563113219..00436fee0de 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3565,6 +3565,12 @@ ProcessInterrupts(void)
pgstat_report_stat(true);
}
+ if (InTransactionStatsUpdateTimeoutPending)
+ {
+ InTransactionStatsUpdateTimeoutPending = false;
+ pgstat_report_in_transaction_stat(false);
+ }
+
if (ProcSignalBarrierPending)
ProcessProcSignalBarrier();
@@ -4621,6 +4627,21 @@ PostgresMain(const char *dbname, const char *username)
enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeout);
}
+
+ /*
+ * Flush in-transaction stats.
+ */
+ if (!get_timeout_active(IN_TRANSACTION_STATS_UPDATE_TIMEOUT))
+ {
+ int stats_interval = PGSTAT_IDLE_TXN_INTERVAL;
+
+#ifdef USE_INJECTION_POINTS
+ if (IS_INJECTION_POINT_ATTACHED("in-transaction-stats-short-interval"))
+ stats_interval = 1000;
+#endif
+ enable_timeout_after(IN_TRANSACTION_STATS_UPDATE_TIMEOUT,
+ stats_interval);
+ }
}
else if (IsTransactionOrTransactionBlock())
{
@@ -4635,6 +4656,21 @@ PostgresMain(const char *dbname, const char *username)
enable_timeout_after(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
IdleInTransactionSessionTimeout);
}
+
+ /*
+ * Flush in-transaction stats.
+ */
+ if (!get_timeout_active(IN_TRANSACTION_STATS_UPDATE_TIMEOUT))
+ {
+ int stats_interval = PGSTAT_IDLE_TXN_INTERVAL;
+
+#ifdef USE_INJECTION_POINTS
+ if (IS_INJECTION_POINT_ATTACHED("in-transaction-stats-short-interval"))
+ stats_interval = 1000;
+#endif
+ enable_timeout_after(IN_TRANSACTION_STATS_UPDATE_TIMEOUT,
+ stats_interval);
+ }
}
else
{
@@ -4650,6 +4686,13 @@ PostgresMain(const char *dbname, const char *username)
if (notifyInterruptPending)
ProcessNotifyInterrupt(false);
+ /*
+ * We are no longer in a transaction, so disable the
+ * in-transaction stats flush timeout if it was active.
+ */
+ if (get_timeout_active(IN_TRANSACTION_STATS_UPDATE_TIMEOUT))
+ disable_timeout(IN_TRANSACTION_STATS_UPDATE_TIMEOUT, false);
+
/*
* Check if we need to report stats. If pgstat_report_stat()
* decides it's too soon to flush out pending stats / lock
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 11bb71cad5a..70090565fee 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -187,7 +187,8 @@ static void pgstat_init_snapshot_fixed(void);
static void pgstat_reset_after_failure(void);
-static bool pgstat_flush_pending_entries(bool nowait);
+static bool pgstat_flush_pending_entries(bool nowait, bool in_txn_only);
+static bool pgstat_flush_fixed_stats(bool nowait, bool in_txn_only);
static void pgstat_prep_snapshot(void);
static void pgstat_build_snapshot(void);
@@ -288,6 +289,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_IN_TRANSACTION,
/* so pg_stat_database entries can be seen in all databases */
.accessed_across_databases = true,
@@ -305,6 +307,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_IN_TRANSACTION,
.shared_size = sizeof(PgStatShared_Relation),
.shared_data_off = offsetof(PgStatShared_Relation, stats),
@@ -321,6 +324,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
@@ -336,6 +340,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
.accessed_across_databases = true,
@@ -353,6 +358,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = true,
+ .flush_mode = FLUSH_AT_TXN_BOUNDARY,
/* so pg_stat_subscription_stats entries can be seen in all databases */
.accessed_across_databases = true,
@@ -370,6 +376,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = false,
.write_to_file = false,
+ .flush_mode = FLUSH_IN_TRANSACTION,
.accessed_across_databases = true,
@@ -436,6 +443,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_IN_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, io),
.shared_ctl_off = offsetof(PgStat_ShmemControl, io),
@@ -453,6 +461,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_IN_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, slru),
.shared_ctl_off = offsetof(PgStat_ShmemControl, slru),
@@ -470,6 +479,7 @@ static const PgStat_KindInfo pgstat_kind_builtin_infos[PGSTAT_KIND_BUILTIN_SIZE]
.fixed_amount = true,
.write_to_file = true,
+ .flush_mode = FLUSH_IN_TRANSACTION,
.snapshot_ctl_off = offsetof(PgStat_Snapshot, wal),
.shared_ctl_off = offsetof(PgStat_ShmemControl, wal),
@@ -775,23 +785,11 @@ pgstat_report_stat(bool force)
partial_flush = false;
/* flush of variable-numbered stats tracked in pending entries list */
- partial_flush |= pgstat_flush_pending_entries(nowait);
+ partial_flush |= pgstat_flush_pending_entries(nowait, false);
/* flush of other stats kinds */
if (pgstat_report_fixed)
- {
- for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
- {
- const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
-
- if (!kind_info)
- continue;
- if (!kind_info->flush_static_cb)
- continue;
-
- partial_flush |= kind_info->flush_static_cb(nowait);
- }
- }
+ partial_flush |= pgstat_flush_fixed_stats(nowait, false);
last_flush = now;
@@ -1293,7 +1291,8 @@ pgstat_prep_pending_entry(PgStat_Kind kind, Oid dboid, uint64 objid, bool *creat
if (entry_ref->pending == NULL)
{
- size_t entrysize = pgstat_get_kind_info(kind)->pending_size;
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+ size_t entrysize = kind_info->pending_size;
Assert(entrysize != (size_t) -1);
@@ -1345,9 +1344,15 @@ pgstat_delete_pending_entry(PgStat_EntryRef *entry_ref)
/*
* Flush out pending variable-numbered stats.
+ *
+ * If in_txn_only is true, only flushes FLUSH_IN_TRANSACTION entries. For entries
+ * that support it, the callback may flush only non-transactional fields.
+ * This is safe to call inside transactions.
+ *
+ * If in_txn_only is false, flushes all entries.
*/
static bool
-pgstat_flush_pending_entries(bool nowait)
+pgstat_flush_pending_entries(bool nowait, bool in_txn_only)
{
bool have_pending = false;
dlist_node *cur = NULL;
@@ -1372,13 +1377,29 @@ pgstat_flush_pending_entries(bool nowait)
PgStat_Kind kind = key.kind;
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
bool did_flush;
+ bool is_partial_flush = false;
dlist_node *next;
Assert(!kind_info->fixed_amount);
Assert(kind_info->flush_pending_cb != NULL);
+ /* Skip transactional stats if we're in in_txn_only mode */
+ if (in_txn_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ {
+ have_pending = true;
+
+ if (dlist_has_next(&pgStatPending, cur))
+ next = dlist_next_node(&pgStatPending, cur);
+ else
+ next = NULL;
+
+ cur = next;
+ continue;
+ }
+
/* flush the stats, if possible */
- did_flush = kind_info->flush_pending_cb(entry_ref, nowait);
+ did_flush = kind_info->flush_pending_cb(entry_ref, nowait,
+ in_txn_only, &is_partial_flush);
Assert(did_flush || nowait);
@@ -1388,8 +1409,8 @@ pgstat_flush_pending_entries(bool nowait)
else
next = NULL;
- /* if successfully flushed, remove entry */
- if (did_flush)
+ /* if successful non-partial flush, remove entry */
+ if (did_flush && !is_partial_flush)
pgstat_delete_pending_entry(entry_ref);
else
have_pending = true;
@@ -1402,6 +1423,33 @@ pgstat_flush_pending_entries(bool nowait)
return have_pending;
}
+/*
+ * Flush fixed-amount stats.
+ *
+ * If in_txn_only is true, only flushes FLUSH_IN_TRANSACTION stats (safe inside transactions).
+ * If in_txn_only is false, flushes all stats with flush_static_cb.
+ */
+static bool
+pgstat_flush_fixed_stats(bool nowait, bool in_txn_only)
+{
+ bool partial_flush = false;
+
+ for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+ {
+ const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+ if (!kind_info || !kind_info->flush_static_cb)
+ continue;
+
+ /* Skip transactional stats if we're in in_txn_only mode */
+ if (in_txn_only && kind_info->flush_mode == FLUSH_AT_TXN_BOUNDARY)
+ continue;
+
+ partial_flush |= kind_info->flush_static_cb(nowait, in_txn_only);
+ }
+
+ return partial_flush;
+}
/* ------------------------------------------------------------
* Helper / infrastructure functions
@@ -2119,3 +2167,24 @@ assign_stats_fetch_consistency(int newval, void *extra)
if (pgstat_fetch_consistency != newval)
force_stats_snapshot_clear = true;
}
+
+/*
+ * Flushes only FLUSH_IN_TRANSACTION stats using non-blocking locks. Transactional
+ * stats (FLUSH_AT_TXN_BOUNDARY) remain pending until transaction boundary.
+ * Safe to call inside transactions.
+ *
+ * This is called from a timeout handler every PGSTAT_IDLE_TXN_INTERVAL
+ * (10000ms) while idle in transaction, to ensure non-transactional stats
+ * (e.g., IO, backend) are flushed periodically even during long transactions.
+ */
+void
+pgstat_report_in_transaction_stat(bool force)
+{
+ bool nowait = !force;
+
+ pgstat_assert_is_up();
+
+ /* Flush stats outside of transaction boundary */
+ pgstat_flush_pending_entries(nowait, true);
+ pgstat_flush_fixed_stats(nowait, true);
+}
diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c
index f2f8d3ff75f..3be8b2705b1 100644
--- a/src/backend/utils/activity/pgstat_backend.c
+++ b/src/backend/utils/activity/pgstat_backend.c
@@ -268,7 +268,7 @@ pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref)
* if some statistics could not be flushed due to lock contention.
*/
bool
-pgstat_flush_backend(bool nowait, bits32 flags)
+pgstat_flush_backend(bool nowait, bits32 flags, bool in_txn_only)
{
PgStat_EntryRef *entry_ref;
bool has_pending_data = false;
@@ -311,9 +311,9 @@ pgstat_flush_backend(bool nowait, bits32 flags)
* If some stats could not be flushed due to lock contention, return true.
*/
bool
-pgstat_backend_flush_cb(bool nowait)
+pgstat_backend_flush_cb(bool nowait, bool in_txn_only)
{
- return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL);
+ return pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_ALL, in_txn_only);
}
/*
diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c
index ed2fd801189..1c5f0c3ec40 100644
--- a/src/backend/utils/activity/pgstat_bgwriter.c
+++ b/src/backend/utils/activity/pgstat_bgwriter.c
@@ -61,7 +61,7 @@ pgstat_report_bgwriter(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c
index 1f70194b7a7..2d89a082464 100644
--- a/src/backend/utils/activity/pgstat_checkpointer.c
+++ b/src/backend/utils/activity/pgstat_checkpointer.c
@@ -68,7 +68,7 @@ pgstat_report_checkpointer(void)
/*
* Report IO statistics
*/
- pgstat_flush_io(false);
+ pgstat_flush_io(false, true);
}
/*
diff --git a/src/backend/utils/activity/pgstat_database.c b/src/backend/utils/activity/pgstat_database.c
index 933dcb5cae5..72d757a6631 100644
--- a/src/backend/utils/activity/pgstat_database.c
+++ b/src/backend/utils/activity/pgstat_database.c
@@ -435,7 +435,8 @@ pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
{
PgStatShared_Database *sharedent;
PgStat_StatDBEntry *pendingent;
@@ -443,6 +444,9 @@ pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
pendingent = (PgStat_StatDBEntry *) entry_ref->pending;
sharedent = (PgStatShared_Database *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index e6b84283c6c..d51371d7967 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -190,14 +190,20 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
{
PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
+ Assert(!in_txn_only);
+
localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_io.c b/src/backend/utils/activity/pgstat_io.c
index 28de24538dc..ec541bf3f78 100644
--- a/src/backend/utils/activity/pgstat_io.c
+++ b/src/backend/utils/activity/pgstat_io.c
@@ -172,9 +172,9 @@ pgstat_fetch_stat_io(void)
* Simpler wrapper of pgstat_io_flush_cb()
*/
void
-pgstat_flush_io(bool nowait)
+pgstat_flush_io(bool nowait, bool in_txn_only)
{
- (void) pgstat_io_flush_cb(nowait);
+ (void) pgstat_io_flush_cb(nowait, in_txn_only);
}
/*
@@ -186,7 +186,7 @@ pgstat_flush_io(bool nowait)
* acquired. Otherwise, return false.
*/
bool
-pgstat_io_flush_cb(bool nowait)
+pgstat_io_flush_cb(bool nowait, bool in_txn_only)
{
LWLock *bktype_lock;
PgStat_BktypeIO *bktype_shstats;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index bc8c43b96aa..33359cf07b0 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,7 +47,19 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void flush_relation_in_transaction_stats(PgStat_StatTabEntry *tabentry,
+ PgStat_TableCounts *counts, bool in_txn_only);
+/*
+ * Update database statistics with non-transactional stats.
+ */
+#define UPDATE_DATABASE_IN_TRANSACTION_STATS(dbentry, counts) \
+ do { \
+ (dbentry)->tuples_returned += (counts)->tuples_returned; \
+ (dbentry)->tuples_fetched += (counts)->tuples_fetched; \
+ (dbentry)->blocks_fetched += (counts)->blocks_fetched; \
+ (dbentry)->blocks_hit += (counts)->blocks_hit; \
+ } while (0)
/*
* Copy stats between relations. This is used for things like REINDEX
@@ -267,8 +279,8 @@ pgstat_report_vacuum(Relation rel, PgStat_Counter livetuples,
* is done -- which will likely vacuum many relations -- or until the
* VACUUM command has processed all tables and committed.
*/
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -362,8 +374,8 @@ pgstat_report_analyze(Relation rel,
pgstat_unlock_entry(entry_ref);
/* see pgstat_report_vacuum() */
- pgstat_flush_io(false);
- (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(false, true);
+ (void) pgstat_flush_backend(false, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -802,6 +814,29 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
rec->tuples_inserted + rec->tuples_updated;
}
+/*
+ * Helper function to flush non-transactional statistics.
+ */
+static void
+flush_relation_in_transaction_stats(PgStat_StatTabEntry *tabentry, PgStat_TableCounts *counts,
+ bool in_txn_only)
+{
+ TimestampTz t;
+
+ tabentry->numscans += counts->numscans;
+ if (counts->numscans)
+ {
+ t = in_txn_only ? GetCurrentTimestamp() : GetCurrentTransactionStopTimestamp();
+ if (t > tabentry->lastscan)
+ tabentry->lastscan = t;
+ }
+
+ tabentry->tuples_returned += counts->tuples_returned;
+ tabentry->tuples_fetched += counts->tuples_fetched;
+ tabentry->blocks_fetched += counts->blocks_fetched;
+ tabentry->blocks_hit += counts->blocks_hit;
+}
+
/*
* Flush out pending stats for the entry
*
@@ -810,9 +845,14 @@ pgstat_twophase_postabort(FullTransactionId fxid, uint16 info,
*
* Some of the stats are copied to the corresponding pending database stats
* entry when successfully flushing.
+ *
+ * If in_txn_only is true, only non-transactional fields are flushed
+ * (numscans, tuples_returned, tuples_fetched, blocks_fetched, blocks_hit).
+ * Transactional fields remain pending until the transaction boundary.
*/
bool
-pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
{
Oid dboid;
PgStat_TableStatus *lstats; /* pending stats entry */
@@ -824,6 +867,9 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
lstats = (PgStat_TableStatus *) entry_ref->pending;
shtabstats = (PgStatShared_Relation *) entry_ref->shared_stats;
+ /* this is a partial flush if in in_txn_only mode */
+ *is_partial = in_txn_only;
+
/*
* Ignore entries that didn't accumulate any actual counts, such as
* indexes that were opened by the planner but not used.
@@ -835,19 +881,36 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
- /* add the values to the shared entry. */
tabentry = &shtabstats->stats;
- tabentry->numscans += lstats->counts.numscans;
- if (lstats->counts.numscans)
+ if (in_txn_only)
{
- TimestampTz t = GetCurrentTransactionStopTimestamp();
- if (t > tabentry->lastscan)
- tabentry->lastscan = t;
+ /* Flush non-transactional statistics */
+ flush_relation_in_transaction_stats(tabentry, &lstats->counts, true);
+
+ pgstat_unlock_entry(entry_ref);
+
+ /* Also update the corresponding fields in database stats */
+ dbentry = pgstat_prep_database_pending(dboid);
+ UPDATE_DATABASE_IN_TRANSACTION_STATS(dbentry, &lstats->counts);
+
+ /*
+ * Clear the flushed fields from pending stats to prevent
+ * double-counting when we flush all fields at transaction boundary.
+ */
+ lstats->counts.numscans = 0;
+ lstats->counts.tuples_returned = 0;
+ lstats->counts.tuples_fetched = 0;
+ lstats->counts.blocks_fetched = 0;
+ lstats->counts.blocks_hit = 0;
+
+ return true;
}
- tabentry->tuples_returned += lstats->counts.tuples_returned;
- tabentry->tuples_fetched += lstats->counts.tuples_fetched;
+
+ /* Flush non-transactional statistics */
+ flush_relation_in_transaction_stats(tabentry, &lstats->counts, false);
+
tabentry->tuples_inserted += lstats->counts.tuples_inserted;
tabentry->tuples_updated += lstats->counts.tuples_updated;
tabentry->tuples_deleted += lstats->counts.tuples_deleted;
@@ -877,9 +940,6 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
*/
tabentry->ins_since_vacuum += lstats->counts.tuples_inserted;
- tabentry->blocks_fetched += lstats->counts.blocks_fetched;
- tabentry->blocks_hit += lstats->counts.blocks_hit;
-
/* Clamp live_tuples in case of negative delta_live_tuples */
tabentry->live_tuples = Max(tabentry->live_tuples, 0);
/* Likewise for dead_tuples */
@@ -889,13 +949,11 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
/* The entry was successfully flushed, add the same to database stats */
dbentry = pgstat_prep_database_pending(dboid);
- dbentry->tuples_returned += lstats->counts.tuples_returned;
- dbentry->tuples_fetched += lstats->counts.tuples_fetched;
+ UPDATE_DATABASE_IN_TRANSACTION_STATS(dbentry, &lstats->counts);
+
dbentry->tuples_inserted += lstats->counts.tuples_inserted;
dbentry->tuples_updated += lstats->counts.tuples_updated;
dbentry->tuples_deleted += lstats->counts.tuples_deleted;
- dbentry->blocks_fetched += lstats->counts.blocks_fetched;
- dbentry->blocks_hit += lstats->counts.blocks_hit;
return true;
}
diff --git a/src/backend/utils/activity/pgstat_slru.c b/src/backend/utils/activity/pgstat_slru.c
index 2190f388eae..d8dc7be11ef 100644
--- a/src/backend/utils/activity/pgstat_slru.c
+++ b/src/backend/utils/activity/pgstat_slru.c
@@ -139,7 +139,7 @@ pgstat_get_slru_index(const char *name)
* acquired. Otherwise return false.
*/
bool
-pgstat_slru_flush_cb(bool nowait)
+pgstat_slru_flush_cb(bool nowait, bool in_txn_only)
{
PgStatShared_SLRU *stats_shmem = &pgStatLocal.shmem->slru;
int i;
diff --git a/src/backend/utils/activity/pgstat_subscription.c b/src/backend/utils/activity/pgstat_subscription.c
index 3277cf88a4e..4f289683d33 100644
--- a/src/backend/utils/activity/pgstat_subscription.c
+++ b/src/backend/utils/activity/pgstat_subscription.c
@@ -117,14 +117,20 @@ pgstat_fetch_stat_subscription(Oid subid)
* false without flushing the entry. Otherwise returns true.
*/
bool
-pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
+pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
{
PgStat_BackendSubEntry *localent;
PgStatShared_Subscription *shsubent;
+ Assert(!in_txn_only);
+
localent = (PgStat_BackendSubEntry *) entry_ref->pending;
shsubent = (PgStatShared_Subscription *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
/* localent always has non-zero content */
if (!pgstat_lock_entry(entry_ref, nowait))
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 183e0a7a97b..bf796ad655e 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -51,12 +51,12 @@ pgstat_report_wal(bool force)
nowait = !force;
/* flush wal stats */
- (void) pgstat_wal_flush_cb(nowait);
- pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL);
+ (void) pgstat_wal_flush_cb(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL, true);
/* flush IO stats */
- pgstat_flush_io(nowait);
- (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO);
+ pgstat_flush_io(nowait, true);
+ (void) pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_IO, true);
}
/*
@@ -88,7 +88,7 @@ pgstat_wal_have_pending(void)
* acquired. Otherwise return false.
*/
bool
-pgstat_wal_flush_cb(bool nowait)
+pgstat_wal_flush_cb(bool nowait, bool in_txn_only)
{
PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
WalUsage wal_usage_diff = {0};
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 36ad708b360..bcce54367d5 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -40,6 +40,7 @@ volatile sig_atomic_t IdleSessionTimeoutPending = false;
volatile sig_atomic_t ProcSignalBarrierPending = false;
volatile sig_atomic_t LogMemoryContextPending = false;
volatile sig_atomic_t IdleStatsUpdateTimeoutPending = false;
+volatile sig_atomic_t InTransactionStatsUpdateTimeoutPending = false;
volatile uint32 InterruptHoldoffCount = 0;
volatile uint32 QueryCancelHoldoffCount = 0;
volatile uint32 CritSectionCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 26118661f07..27ecf0df16e 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -65,6 +65,7 @@
#include "utils/injection_point.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h"
+#include "utils/pgstat_internal.h"
#include "utils/portal.h"
#include "utils/ps_status.h"
#include "utils/snapmgr.h"
@@ -89,6 +90,7 @@ static void IdleInTransactionSessionTimeoutHandler(void);
static void TransactionTimeoutHandler(void);
static void IdleSessionTimeoutHandler(void);
static void IdleStatsUpdateTimeoutHandler(void);
+static void InTransactionStatsUpdateTimeoutHandler(void);
static void ClientCheckTimeoutHandler(void);
static bool ThereIsAtLeastOneRole(void);
static void process_startup_options(Port *port, bool am_superuser);
@@ -774,6 +776,8 @@ InitPostgres(const char *in_dbname, Oid dboid,
RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
RegisterTimeout(IDLE_STATS_UPDATE_TIMEOUT,
IdleStatsUpdateTimeoutHandler);
+ RegisterTimeout(IN_TRANSACTION_STATS_UPDATE_TIMEOUT,
+ InTransactionStatsUpdateTimeoutHandler);
}
/*
@@ -1433,6 +1437,14 @@ IdleStatsUpdateTimeoutHandler(void)
SetLatch(MyLatch);
}
+static void
+InTransactionStatsUpdateTimeoutHandler(void)
+{
+ InTransactionStatsUpdateTimeoutPending = true;
+ InterruptPending = true;
+ SetLatch(MyLatch);
+}
+
static void
ClientCheckTimeoutHandler(void)
{
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index f16f35659b9..7051e4b6f3f 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -96,6 +96,7 @@ extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
extern PGDLLIMPORT volatile sig_atomic_t IdleStatsUpdateTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t InTransactionStatsUpdateTimeoutPending;
extern PGDLLIMPORT volatile sig_atomic_t CheckClientConnectionPending;
extern PGDLLIMPORT volatile sig_atomic_t ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 216b93492ba..8571eb1afd7 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -38,6 +38,13 @@ typedef struct RelationData *Relation;
/* Default directory to store temporary statistics data in */
#define PG_STAT_TMP_DIR "pg_stat_tmp"
+/*
+ * Interval in milliseconds for flushing FLUSH_IN_TRANSACTION stats to shared
+ * memory. An explicit transaction always enters idle-in-transaction state
+ * between commands, which is when this timeout is enabled.
+ */
+#define PGSTAT_IDLE_TXN_INTERVAL 10000
+
/* Values for track_functions GUC variable --- order is significant! */
typedef enum TrackFunctionsLevel
{
@@ -536,6 +543,7 @@ extern void pgstat_initialize(void);
/* Functions called from backends */
extern long pgstat_report_stat(bool force);
+extern void pgstat_report_in_transaction_stat(bool force);
extern void pgstat_force_next_flush(void);
extern void pgstat_reset_counters(void);
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 9b8fbae00ed..e9f4dc85528 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -224,6 +224,18 @@ typedef struct PgStat_SubXactStatus
PgStat_TableXactStatus *first; /* head of list for this subxact */
} PgStat_SubXactStatus;
+/*
+ * Flush mode for statistics kinds.
+ *
+ * FLUSH_AT_TXN_BOUNDARY must come first so that it is the default value for
+ * kinds that do not set flush_mode explicitly.
+ */
+typedef enum PgStat_FlushMode
+{
+ FLUSH_AT_TXN_BOUNDARY, /* All fields can only be flushed at
+ * transaction boundary */
+ FLUSH_IN_TRANSACTION, /* All fields can be flushed in a transaction */
+} PgStat_FlushMode;
/*
* Metadata for a specific kind of statistics.
@@ -251,6 +263,16 @@ typedef struct PgStat_KindInfo
*/
bool track_entry_count:1;
+ /*
+ * When to flush stats. See PgStat_FlushMode for more details.
+ *
+ * This member only has meaning for statistics kinds that accumulate
+ * pending stats and use flush callbacks. For kinds that write directly to
+ * shared memory (e.g., archiver, bgwriter, checkpointer), this member has
+ * no effect.
+ */
+ PgStat_FlushMode flush_mode;
+
/*
* The size of an entry in the shared stats hash table (pointed to by
* PgStatShared_HashEntry->body). For fixed-numbered statistics, this is
@@ -297,8 +319,13 @@ typedef struct PgStat_KindInfo
* For variable-numbered stats: flush pending stats. Required if pending
* data is used. See flush_static_cb when dealing with stats data that
* that cannot use PgStat_EntryRef->pending.
+ *
+ * The in_txn_only parameter indicates whether this is an in-transaction
+ * flush. The is_partial output parameter is set by the callback to
+ * indicate whether only some of the pending fields were flushed.
*/
- bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait);
+ bool (*flush_pending_cb) (PgStat_EntryRef *sr, bool nowait,
+ bool in_txn_only, bool *is_partial);
/*
* For variable-numbered stats: delete pending stats. Optional.
@@ -366,8 +393,11 @@ typedef struct PgStat_KindInfo
*
* "pgstat_report_fixed" needs to be set to trigger the flush of pending
* stats.
+ *
+ * The in_txn_only parameter indicates whether this is an in-transaction
+ * flush.
*/
- bool (*flush_static_cb) (bool nowait);
+ bool (*flush_static_cb) (bool nowait, bool in_txn_only);
/*
* For fixed-numbered statistics: Reset All.
@@ -696,8 +726,8 @@ extern void pgstat_archiver_snapshot_cb(void);
#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */
#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL)
-extern bool pgstat_flush_backend(bool nowait, bits32 flags);
-extern bool pgstat_backend_flush_cb(bool nowait);
+extern bool pgstat_flush_backend(bool nowait, bits32 flags, bool in_txn_only);
+extern bool pgstat_backend_flush_cb(bool nowait, bool in_txn_only);
extern void pgstat_backend_reset_timestamp_cb(PgStatShared_Common *header,
TimestampTz ts);
@@ -729,7 +759,8 @@ extern void AtEOXact_PgStat_Database(bool isCommit, bool parallel);
extern PgStat_StatDBEntry *pgstat_prep_database_pending(Oid dboid);
extern void pgstat_reset_database_timestamp(Oid dboid, TimestampTz ts);
-extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_database_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial);
extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -737,7 +768,8 @@ extern void pgstat_database_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_function.c
*/
-extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial);
extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -745,9 +777,9 @@ extern void pgstat_function_reset_timestamp_cb(PgStatShared_Common *header, Time
* Functions in pgstat_io.c
*/
-extern void pgstat_flush_io(bool nowait);
+extern void pgstat_flush_io(bool nowait, bool in_txn_only);
-extern bool pgstat_io_flush_cb(bool nowait);
+extern bool pgstat_io_flush_cb(bool nowait, bool in_txn_only);
extern void pgstat_io_init_shmem_cb(void *stats);
extern void pgstat_io_reset_all_cb(TimestampTz ts);
extern void pgstat_io_snapshot_cb(void);
@@ -762,7 +794,8 @@ extern void AtEOSubXact_PgStat_Relations(PgStat_SubXactStatus *xact_state, bool
extern void AtPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
extern void PostPrepare_PgStat_Relations(PgStat_SubXactStatus *xact_state);
-extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial);
extern void pgstat_relation_delete_pending_cb(PgStat_EntryRef *entry_ref);
extern void pgstat_relation_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
@@ -809,7 +842,7 @@ extern PgStatShared_Common *pgstat_init_entry(PgStat_Kind kind,
* Functions in pgstat_slru.c
*/
-extern bool pgstat_slru_flush_cb(bool nowait);
+extern bool pgstat_slru_flush_cb(bool nowait, bool in_txn_only);
extern void pgstat_slru_init_shmem_cb(void *stats);
extern void pgstat_slru_reset_all_cb(TimestampTz ts);
extern void pgstat_slru_snapshot_cb(void);
@@ -820,7 +853,7 @@ extern void pgstat_slru_snapshot_cb(void);
*/
extern void pgstat_wal_init_backend_cb(void);
-extern bool pgstat_wal_flush_cb(bool nowait);
+extern bool pgstat_wal_flush_cb(bool nowait, bool in_txn_only);
extern void pgstat_wal_init_shmem_cb(void *stats);
extern void pgstat_wal_reset_all_cb(TimestampTz ts);
extern void pgstat_wal_snapshot_cb(void);
@@ -830,7 +863,8 @@ extern void pgstat_wal_snapshot_cb(void);
* Functions in pgstat_subscription.c
*/
-extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait);
+extern bool pgstat_subscription_flush_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial);
extern void pgstat_subscription_reset_timestamp_cb(PgStatShared_Common *header, TimestampTz ts);
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 0965b590b34..68fe814bde6 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -34,6 +34,7 @@ typedef enum TimeoutId
TRANSACTION_TIMEOUT,
IDLE_SESSION_TIMEOUT,
IDLE_STATS_UPDATE_TIMEOUT,
+ IN_TRANSACTION_STATS_UPDATE_TIMEOUT,
CLIENT_CONNECTION_CHECK_TIMEOUT,
STARTUP_PROGRESS_TIMEOUT,
/* First user-definable timeout reason */
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index 9e6a7a38577..d3afbff37c5 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -156,5 +156,64 @@ $result = $node->safe_psql('postgres',
);
is($result, "0", "report of fixed-sized after manual reset");
+# Test FLUSH_IN_TRANSACTION custom stats kind.
+# The intxn kind is registered alongside the regular kind by test_custom_var_stats.
+
+# Basic update and report.
+$node->safe_psql('postgres',
+ q(SELECT test_custom_var_intxn_update('intxn_entry1')));
+$node->safe_psql('postgres',
+ q(SELECT test_custom_var_intxn_update('intxn_entry1')));
+$node->safe_psql('postgres',
+ q(SELECT test_custom_var_intxn_update('intxn_entry1')));
+
+$result = $node->safe_psql('postgres',
+ q(SELECT test_custom_var_intxn_report('intxn_entry1')));
+is($result, "3", "intxn stats report after updates");
+
+# Verify intxn stats are not persisted (write_to_file = false).
+$node->stop();
+$node->start();
+
+$result = $node->safe_psql('postgres',
+ q(SELECT test_custom_var_intxn_report('intxn_entry1')));
+is($result, "", "intxn stats lost after clean restart (not persisted)");
+
+# Test in-transaction flushing with injection point.
+SKIP:
+{
+ skip "injection points not supported", 1
+ unless (($ENV{enable_injection_points} // '') eq 'yes'
+ && $node->check_extension('injection_points'));
+
+ $node->safe_psql('postgres', 'CREATE EXTENSION IF NOT EXISTS injection_points;');
+ $node->safe_psql('postgres',
+ "SELECT injection_points_attach('in-transaction-stats-short-interval', 'error');");
+
+ $node->append_conf('postgresql.conf', 'stats_fetch_consistency = none');
+ $node->reload;
+
+ my $intxn_before = $node->safe_psql('postgres',
+ "SELECT COALESCE(test_custom_var_intxn_report('intxn_flush'), 0);");
+
+ $result = $node->safe_psql('postgres', q{
+BEGIN;
+SELECT test_custom_var_intxn_update('intxn_flush');
+SELECT test_custom_var_intxn_update('intxn_flush');
+SELECT test_custom_var_intxn_update('intxn_flush');
+SELECT pg_sleep(2);
+SELECT COALESCE(test_custom_var_intxn_report('intxn_flush'), 0);
+});
+
+ my @lines = split(/\n/, $result);
+ my $intxn_mid_txn = $lines[-1];
+
+ ok($intxn_mid_txn > $intxn_before,
+ "custom intxn stats flushed mid-transaction (before: $intxn_before, mid-txn: $intxn_mid_txn)");
+
+ $node->safe_psql('postgres',
+ "SELECT injection_points_detach('in-transaction-stats-short-interval');");
+}
+
# Test completed successfully
done_testing();
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index 5ed8cfc2dcf..03650cbc414 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -24,3 +24,13 @@ CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_var_intxn_update(IN name TEXT)
+RETURNS void
+AS 'MODULE_PATHNAME', 'test_custom_var_intxn_update'
+LANGUAGE C STRICT PARALLEL UNSAFE;
+
+CREATE FUNCTION test_custom_var_intxn_report(IN name TEXT)
+RETURNS bigint
+AS 'MODULE_PATHNAME', 'test_custom_var_intxn_report'
+LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index 2ef0e903745..e87d47a28ab 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -37,6 +37,11 @@ PG_MODULE_MAGIC_EXT(
*/
#define PGSTAT_KIND_TEST_CUSTOM_VAR_STATS 25
+/*
+ * Kind ID for test_custom_var_intxn_stats statistics (FLUSH_IN_TRANSACTION).
+ */
+#define PGSTAT_KIND_TEST_CUSTOM_VAR_INTXN_STATS 27
+
/* File paths for auxiliary data serialization */
#define TEST_CUSTOM_AUX_DATA_DESC "pg_stat/test_custom_var_stats_desc.stats"
@@ -45,6 +50,8 @@ PG_MODULE_MAGIC_EXT(
*/
#define PGSTAT_CUSTOM_VAR_STATS_IDX(name) hash_bytes_extended((const unsigned char *) name, strlen(name), 0)
+#define PGSTAT_CUSTOM_VAR_INTXN_IDX(name) hash_bytes_extended((const unsigned char *) name, strlen(name), 0)
+
/*--------------------------------------------------------------------------
* Type definitions
*--------------------------------------------------------------------------
@@ -56,6 +63,12 @@ typedef struct PgStat_StatCustomVarEntry
PgStat_Counter numcalls; /* times statistic was incremented */
} PgStat_StatCustomVarEntry;
+/* Pending entry for FLUSH_IN_TRANSACTION kind (same layout) */
+typedef struct PgStat_CustomVarInTxnEntry
+{
+ PgStat_Counter numcalls;
+} PgStat_CustomVarInTxnEntry;
+
/* Shared memory statistics entry visible to all backends */
typedef struct PgStatShared_CustomVarEntry
{
@@ -64,6 +77,13 @@ typedef struct PgStatShared_CustomVarEntry
dsa_pointer description; /* pointer to description string in DSA */
} PgStatShared_CustomVarEntry;
+/* Shared memory entry for FLUSH_IN_TRANSACTION kind */
+typedef struct PgStatShared_CustomVarInTxnEntry
+{
+ PgStatShared_Common header;
+ PgStat_CustomVarInTxnEntry stats;
+} PgStatShared_CustomVarInTxnEntry;
+
/*--------------------------------------------------------------------------
* Global Variables
*--------------------------------------------------------------------------
@@ -85,7 +105,8 @@ static dsa_area *custom_stats_description_dsa = NULL;
/* Flush callback: merge pending stats into shared memory */
static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
- bool nowait);
+ bool nowait, bool in_txn_only,
+ bool *is_partial);
/* Serialization callback: write auxiliary entry data */
static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
@@ -100,6 +121,11 @@ static bool test_custom_stats_var_from_serialized_data(const PgStat_HashKey *key
/* Finish callback: end of statistics file operations */
static void test_custom_stats_var_finish(PgStat_StatsFileOp status);
+/* Flush callback for FLUSH_IN_TRANSACTION kind */
+static bool test_custom_var_intxn_flush_pending_cb(PgStat_EntryRef *entry_ref,
+ bool nowait, bool in_txn_only,
+ bool *is_partial);
+
/*--------------------------------------------------------------------------
* Custom kind configuration
*--------------------------------------------------------------------------
@@ -121,6 +147,19 @@ static const PgStat_KindInfo custom_stats = {
.finish = test_custom_stats_var_finish,
};
+static const PgStat_KindInfo custom_intxn_stats = {
+ .name = "test_custom_var_intxn_stats",
+ .fixed_amount = false,
+ .write_to_file = false, /* no persistence needed for test */
+ .flush_mode = FLUSH_IN_TRANSACTION,
+ .accessed_across_databases = true,
+ .shared_size = sizeof(PgStatShared_CustomVarInTxnEntry),
+ .shared_data_off = offsetof(PgStatShared_CustomVarInTxnEntry, stats),
+ .shared_data_len = sizeof(PgStat_CustomVarInTxnEntry),
+ .pending_size = sizeof(PgStat_CustomVarInTxnEntry),
+ .flush_pending_cb = test_custom_var_intxn_flush_pending_cb,
+};
+
/*--------------------------------------------------------------------------
* Module initialization
*--------------------------------------------------------------------------
@@ -133,8 +172,9 @@ _PG_init(void)
if (!process_shared_preload_libraries_in_progress)
return;
- /* Register custom statistics kind */
+ /* Register custom statistics kinds */
pgstat_register_kind(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, &custom_stats);
+ pgstat_register_kind(PGSTAT_KIND_TEST_CUSTOM_VAR_INTXN_STATS, &custom_intxn_stats);
}
/*--------------------------------------------------------------------------
@@ -152,7 +192,8 @@ _PG_init(void)
* Returns false only if nowait=true and lock acquisition fails.
*/
static bool
-test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
+test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
{
PgStat_StatCustomVarEntry *pending_entry;
PgStatShared_CustomVarEntry *shared_entry;
@@ -160,6 +201,9 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
pending_entry = (PgStat_StatCustomVarEntry *) entry_ref->pending;
shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+ /* this is not a partial flush */
+ *is_partial = false;
+
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
@@ -691,3 +735,67 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
SRF_RETURN_DONE(funcctx);
}
+
+/*--------------------------------------------------------------------------
+ * FLUSH_IN_TRANSACTION kind: callbacks and SQL functions
+ *--------------------------------------------------------------------------
+ */
+
+static bool
+test_custom_var_intxn_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait,
+ bool in_txn_only, bool *is_partial)
+{
+ PgStat_CustomVarInTxnEntry *pending;
+ PgStatShared_CustomVarInTxnEntry *shared;
+
+ pending = (PgStat_CustomVarInTxnEntry *) entry_ref->pending;
+ shared = (PgStatShared_CustomVarInTxnEntry *) entry_ref->shared_stats;
+
+ *is_partial = false;
+
+ if (!pgstat_lock_entry(entry_ref, nowait))
+ return false;
+
+ shared->stats.numcalls += pending->numcalls;
+
+ pgstat_unlock_entry(entry_ref);
+
+ return true;
+}
+
+PG_FUNCTION_INFO_V1(test_custom_var_intxn_update);
+Datum
+test_custom_var_intxn_update(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_EntryRef *entry_ref;
+ PgStat_CustomVarInTxnEntry *pending;
+
+ entry_ref = pgstat_prep_pending_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_INTXN_STATS,
+ InvalidOid,
+ PGSTAT_CUSTOM_VAR_INTXN_IDX(stat_name),
+ NULL);
+
+ pending = (PgStat_CustomVarInTxnEntry *) entry_ref->pending;
+ pending->numcalls++;
+
+ PG_RETURN_VOID();
+}
+
+PG_FUNCTION_INFO_V1(test_custom_var_intxn_report);
+Datum
+test_custom_var_intxn_report(PG_FUNCTION_ARGS)
+{
+ char *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+ PgStat_CustomVarInTxnEntry *stats;
+
+ stats = (PgStat_CustomVarInTxnEntry *)
+ pgstat_fetch_entry(PGSTAT_KIND_TEST_CUSTOM_VAR_INTXN_STATS,
+ InvalidOid,
+ PGSTAT_CUSTOM_VAR_INTXN_IDX(stat_name));
+
+ if (!stats)
+ PG_RETURN_NULL();
+
+ PG_RETURN_INT64(stats->numcalls);
+}
diff --git a/src/test/modules/test_misc/meson.build b/src/test/modules/test_misc/meson.build
index 6e8db1621a7..70248360d21 100644
--- a/src/test/modules/test_misc/meson.build
+++ b/src/test/modules/test_misc/meson.build
@@ -19,6 +19,7 @@ tests += {
't/008_replslot_single_user.pl',
't/009_log_temp_files.pl',
't/010_index_concurrently_upsert.pl',
+ 't/011_in_transaction_stats.pl',
],
# The injection points are cluster-wide, so disable installcheck
'runningcheck': false,
diff --git a/src/test/modules/test_misc/t/011_in_transaction_stats.pl b/src/test/modules/test_misc/t/011_in_transaction_stats.pl
new file mode 100644
index 00000000000..cf1bde24a5d
--- /dev/null
+++ b/src/test/modules/test_misc/t/011_in_transaction_stats.pl
@@ -0,0 +1,91 @@
+
+# Copyright (c) 2024-2026, PostgreSQL Global Development Group
+
+# Test in-transaction stats flushing mechanism.
+#
+# This test verifies that FLUSH_IN_TRANSACTION stats are periodically flushed
+# to shared memory while a transaction is idle between commands. Uses an
+# injection point to reduce the flush interval from 10 seconds to 1 second
+# for faster testing.
+#
+# We test one representative of each stats kind: relation stats (seq_scan)
+# for variable-numbered stats, and WAL for fixed-sized stats.
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+ plan skip_all => 'Injection points not supported by this build';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init();
+$node->append_conf('postgresql.conf', 'stats_fetch_consistency = none');
+$node->start;
+
+if (!$node->check_extension('injection_points'))
+{
+ plan skip_all => 'Extension injection_points not installed';
+}
+
+$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Attach injection point to reduce the in-transaction stats flush interval
+# to 1 second.
+$node->safe_psql('postgres',
+ "SELECT injection_points_attach('in-transaction-stats-short-interval', 'error');");
+
+# Create test table
+$node->safe_psql('postgres',
+ 'CREATE TABLE test_in_txn_stats(a int) WITH (autovacuum_enabled = off);');
+$node->safe_psql('postgres',
+ 'INSERT INTO test_in_txn_stats SELECT generate_series(1, 1000);');
+
+# Force flush and get baseline stats
+$node->safe_psql('postgres', 'SELECT pg_stat_force_next_flush();');
+my $seq_scan_before = $node->safe_psql('postgres',
+ "SELECT seq_scan FROM pg_stat_user_tables WHERE relname = 'test_in_txn_stats';");
+
+# Test that seq_scan stats (variable-numbered) are flushed mid-transaction
+# via the in-transaction periodic flush mechanism.
+my $result = $node->safe_psql('postgres', q{
+BEGIN;
+SELECT COUNT(*) FROM test_in_txn_stats;
+SELECT COUNT(*) FROM test_in_txn_stats;
+SELECT pg_sleep(2);
+SELECT seq_scan FROM pg_stat_user_tables WHERE relname = 'test_in_txn_stats';
+});
+
+my @lines = split(/\n/, $result);
+my $seq_scan_mid_txn = $lines[-1];
+
+ok($seq_scan_mid_txn > $seq_scan_before,
+ "seq_scan stats flushed during transaction (before: $seq_scan_before, mid-txn: $seq_scan_mid_txn)");
+
+# Test WAL stats (fixed-sized) flushing during transaction.
+$node->safe_psql('postgres', 'SELECT pg_stat_reset_shared(\'wal\');');
+my $wal_records_before = $node->safe_psql('postgres',
+ "SELECT wal_records FROM pg_stat_wal;");
+
+$result = $node->safe_psql('postgres', q{
+BEGIN;
+INSERT INTO test_in_txn_stats SELECT generate_series(1, 1000);
+SELECT pg_sleep(2);
+SELECT wal_records FROM pg_stat_wal;
+});
+
+@lines = split(/\n/, $result);
+my $wal_records_mid_txn = $lines[-1];
+
+ok($wal_records_mid_txn > $wal_records_before,
+ "WAL stats flushed during transaction (before: $wal_records_before, mid-txn: $wal_records_mid_txn)");
+
+# Cleanup
+$node->safe_psql('postgres', 'DROP TABLE test_in_txn_stats;');
+
+done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..c9cd6087abc 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2290,6 +2290,7 @@ PgStat_Counter
PgStat_EntryRef
PgStat_EntryRefHashEntry
PgStat_FetchConsistency
+PgStat_FlushMode
PgStat_FunctionCallUsage
PgStat_FunctionCounts
PgStat_HashKey
--
2.47.3
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-20 15:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-23 08:14 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-24 12:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-03-16 09:20 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-03-16 10:22 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-03-16 23:42 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-03-25 03:16 ` Michael Paquier <[email protected]>
0 siblings, 0 replies; 22+ messages in thread
From: Michael Paquier @ 2026-03-25 03:16 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Bertrand Drouvot <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Mon, Mar 16, 2026 at 06:42:44PM -0500, Sami Imseih wrote:
> So attached is a new proposal with tests and docs. In terms of
> test I fell back to the strategy used by Bertrand [0] with the
As far as I am reading the patch, it seems to me that this would
accelerate the frequency of the stats flushes when we have a
transaction with many short queries, but it does not help much in the
case of an analytical single query because the flush of the stats
would just happen once after the timeout set for each query?
Let's imagine for example a query that takes hours to run, where we'd
want to get fresh IO and WAL stats (backend included) on a periodic
basis by processing interrupts while going through the executor.
Thinking more about this problem, I'd really like to think that a
client-side API would provide a more flexible interface for the
analytical case.
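A toy model of the reasoning above, as a sketch only. It assumes, as the question does, that a flush can only happen at a command boundary once the interval has elapsed, and compares that with a flush that can also happen while a command is running. The function names and the model itself are hypothetical; this is not code from the patch.

```c
#include <assert.h>

/*
 * Toy model, not PostgreSQL code.  Times are in seconds.  A flush becomes
 * due once "interval" seconds have passed since the previous one.
 */

/* Flushes can only happen at a command boundary (assumption of this model). */
int
flushes_at_command_boundaries(const int *durations, int ncommands, int interval)
{
	int			now = 0;
	int			last_flush = 0;
	int			flushes = 0;

	assert(interval > 0);
	for (int i = 0; i < ncommands; i++)
	{
		now += durations[i];	/* the command runs to completion */
		if (now - last_flush >= interval)
		{
			flushes++;
			last_flush = now;
		}
	}
	return flushes;
}

/* Flushes can also happen while a command runs, e.g. at interrupt checks. */
int
flushes_at_interrupt_checks(const int *durations, int ncommands, int interval)
{
	int			total = 0;

	assert(interval > 0);
	for (int i = 0; i < ncommands; i++)
		total += durations[i];
	return total / interval;
}
```

In this model, one 3600-second query with a 10-second interval gets a single flush at command boundaries but 360 at interrupt checks, while 3600 one-second queries get 360 flushes either way.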
--
Michael
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-23 23:48 ` Michael Paquier <[email protected]>
2026-02-24 13:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
1 sibling, 1 reply; 22+ messages in thread
From: Michael Paquier @ 2026-02-23 23:48 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Thu, Feb 19, 2026 at 08:01:36AM +0000, Bertrand Drouvot wrote:
> On Thu, Feb 19, 2026 at 12:58:12PM +0900, Michael Paquier wrote:
>> 2) The timeout requirement itself, relying on a timeout threshold
>> controlled by a backend-side configuration.
>
> What are your concerns with this?
I am concerned about the three additional points/requirements:
1) The need for all processes who want to flush non-transactional
stats to set up timeouts, unconditionally, which is what the patch
shows with the new InitializeTimeouts() calls added for example for
auxiliary processes. This forces the use of SIGALRM in these
processes, with a new handler, and feels like too heavy a requirement
for the sake of some stats flushes. This requires an extra
unconditional RegisterTimeout(), as well.
2) The need for all the stats to call pgstat_schedule_anytime_update()
in strategical places. This is less of a burden compared to 1), but
this leads to more complications in these code paths with the coding
requirements, especially for custom stats kinds.
3) Enforcing a flush timeout unconditionally. One thing that slightly
concerns me here is that it seems easier to misuse compared to a
client-based facility, where a too-aggressive setup could put more
stress on the server-side pgstats. It is true that the client side
could be misused too.
My main worry is around 1), I guess, with the new SIGALRM
handler requirements for all auxiliary processes. Using a procsignal
path would allow us to rely on a solution that has the same
flexibility, combined with strategic additional flush calls that we
could spread depending on requirements we want to enforce in some
processes, like in the WAL sender, or perhaps the checkpointer. That
seems more careful in the long-run, and this can rely on the interrupt
processing for the job. The addition of the property to track if a
stats kind is OK to flush outside a transaction boundary is also
critical to have, of course. I am sold on the point of the
design about this new property tracked in the stats kind meta-data.
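To make the "signal sets a flag, interrupt processing does the work" shape concrete, here is a minimal standalone sketch. It is not PostgreSQL code and does not use procsignal.c: every name is hypothetical, SIGUSR1 merely stands in for the request, and the two counters stand in for pending and shared stats.

```c
#include <assert.h>
#include <signal.h>

/* Standalone model with hypothetical names; none of this is PostgreSQL API. */

static volatile sig_atomic_t flush_requested = 0;
static int	pending_count = 0;	/* stats accumulated locally */
static int	shared_count = 0;	/* stats visible to everybody */

/* Signal handler: only sets a flag, which is async-signal-safe. */
static void
handle_flush_request(int signo)
{
	(void) signo;
	flush_requested = 1;
}

void
install_flush_request_handler(void)
{
	signal(SIGUSR1, handle_flush_request);
}

/* Accumulate one event in the local, not yet flushed, counter. */
void
record_event(void)
{
	pending_count++;
}

/*
 * Called from points where flushing is safe, in the spirit of an
 * interrupt-processing check.  Returns 1 if a flush was done.
 */
int
process_flush_request(void)
{
	if (!flush_requested)
		return 0;
	flush_requested = 0;
	assert(pending_count >= 0);
	shared_count += pending_count;
	pending_count = 0;
	return 1;
}

int
shared_total(void)
{
	return shared_count;
}
```

The point of the shape is that the process doing the work never needs a timer of its own: the request comes from outside, and the flush happens at the next safe point the code already passes through.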
>> With that in mind, wouldn't it be simpler if we introduced an API that
>> could be used from client applications instead, in a model similar
>> to what we do for procsignal.c/h?
>
> That's another angle to look at it but I think that giving this responsibility to
> the clients would not solve the concerns we had in [1] (that led to 039549d70f6
> and to this thread). It seems to me that a solution/design that does not allow
> us to "revert" 039549d70f6 does not suit our needs. Thoughts?
039549d70f6 goes in line with the "client" perspective, where I would
like to think that strategic flush calls are more flexible.
> Yeah, after our off-list discussion yesterday, I tried to implement the same
> trick that f1e251be80a has done with injection points (nice trick by the way!),
> but that led to:
In this case, avoiding an injpoint allocation in a critical section
would be a two-step process:
- INJECTION_POINT_LOAD(), before the critical section, to warm up the
cache and do all the allocations.
- INJECTION_POINT_CACHED() with IS_INJECTION_POINT_ATTACHED()
(optional), to run the point, in the critical section.
This tactic is already in use in the tree. You would have to use a
more complicated scheme to make sure that the wait machinery is
initialized when using it in a critical section, but that can be
worked around as well. See my "state-of-the-art" work of
15f68cebdcec, which, well, happens to work. Not really the best work
ever with two points, but it works for this purpose. :D
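The two-step process can be modeled outside of PostgreSQL as follows. This is a sketch with hypothetical stand-ins, not the injection point API: point_load() plays the role of INJECTION_POINT_LOAD(), point_run_cached() the role of INJECTION_POINT_CACHED(), and a depth counter models the critical section in which allocations are not allowed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Standalone model with hypothetical names; none of this is PostgreSQL API. */

static int	crit_section_depth = 0;	/* > 0 while inside a critical section */
static char *cached_name = NULL;	/* name loaded by point_load() */
static int	cached_runs = 0;		/* how many times the cached point ran */

void
start_crit_section(void)
{
	crit_section_depth++;
}

void
end_crit_section(void)
{
	assert(crit_section_depth > 0);
	crit_section_depth--;
}

/* Step 1: do the allocation up front; forbidden inside a critical section. */
void
point_load(const char *name)
{
	size_t		len = strlen(name) + 1;

	assert(crit_section_depth == 0);
	free(cached_name);
	cached_name = malloc(len);
	if (cached_name == NULL)
		abort();
	memcpy(cached_name, name, len);
}

/* Step 2: run from the cache only; allocates nothing, so safe anywhere. */
int
point_run_cached(const char *name)
{
	if (cached_name == NULL || strcmp(cached_name, name) != 0)
		return 0;				/* not loaded: do nothing */
	cached_runs++;
	return 1;
}

int
point_run_count(void)
{
	return cached_runs;
}
```

A caller would invoke point_load() first, then enter the critical section and call point_run_cached() there; a point that was never loaded is simply skipped.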
--
Michael
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:48 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
@ 2026-02-24 13:55 ` Bertrand Drouvot <[email protected]>
2026-02-24 16:32 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-26 03:33 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
0 siblings, 2 replies; 22+ messages in thread
From: Bertrand Drouvot @ 2026-02-24 13:55 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
Hi,
On Tue, Feb 24, 2026 at 08:48:31AM +0900, Michael Paquier wrote:
> On Thu, Feb 19, 2026 at 08:01:36AM +0000, Bertrand Drouvot wrote:
> > On Thu, Feb 19, 2026 at 12:58:12PM +0900, Michael Paquier wrote:
> >> 2) The timeout requirement itself, relying on a timeout threshold
> >> controlled by a backend-side configuration.
> >
> > What are your concerns with this?
>
> I am concerned about the three additional points/requirements:
> 1) The need for all processes who want to flush non-transactional
> stats to set up timeouts, unconditionally, which is what the patch
> shows with the new InitializeTimeouts() calls added for example for
> auxiliary processes. This forces the use of SIGALRM in these
> processes,
Right but they all already call pqsignal(SIGALRM, SIG_IGN), so I'm not sure
I get the point.
> This requires an extra unconditional RegisterTimeout(), as well.
Yes, I'm not clear why that's an issue though.
> 2) The need for all the stats to call pgstat_schedule_anytime_update()
> in strategical places. This is less of a burden compared to 1), but
> this leads to more complications in these code paths with the coding
> requirements, especially for custom stats kinds.
I think that's solved with Sami's proposal for variable stats kind (to flush or
schedule when the session is idle).
> 3) Enforcing a flush timeout unconditionally.
What do you mean? It's done only if there are things to flush.
> My main worry is around 1), I guess, with the new SIGALRM
> handler requirements for all auxiliary processes. Using a procsignal
> path would allow us to rely on a solution that has the same
> flexibility, combined with strategic additional flush calls that we
> could spread depending on requirements we want to enforce in some
> processes, like in the WAL sender, or perhaps the checkpointer.
I see, but we know they'll have to flush IO or WAL stats at some point so
enabling the timeout for those looks ok.
> The addition of the property to track if a
> stats kind is OK to flush outside a transaction boundary is also
> critical to have, of course. I am sold on the point of the
> design about this new property tracked in the stats kind meta-data.
Great!
> 039549d70f6 goes in line with the "client" perspective, where I would
> like to think that strategic flush calls are more flexible.
>
> > Yeah, after our off-list discussion yesterday, I tried to implement the same
> > trick that f1e251be80a has done with injection points (nice trick by the way!),
> > but that led to:
>
> In this case, avoiding an injpoint allocation in a critical section
> would be a two-step process:
> - INJECTION_POINT_LOAD(), before the critical section, to warm up the
> cache and do all the allocations.
> - INJECTION_POINT_CACHED() with IS_INJECTION_POINT_ATTACHED()
> (optional), to run the point, in the critical section.
Oh okay, thanks for the explanation!
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:48 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-24 13:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-24 16:32 ` Sami Imseih <[email protected]>
1 sibling, 0 replies; 22+ messages in thread
From: Sami Imseih @ 2026-02-24 16:32 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Michael Paquier <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
> > 2) The need for all the stats to call pgstat_schedule_anytime_update()
> > in strategical places. This is less of a burden compared to 1), but
> > this leads to more complications in these code paths with the coding
> > requirements, especially for custom stats kinds.
>
> I think that's solved with Sami's proposal for variable stats kind (to flush or
> schedule when the session is idle).
Just a clarification: it is "to schedule a flush when the transaction
becomes idle".
PGSTAT_IDLE_INTERVAL already deals with the case where the session is idle.
--
Sami Imseih
Amazon Web Services (AWS)
* Re: Flush some statistics within running transactions
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-19 08:01 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
2026-02-23 23:48 ` Re: Flush some statistics within running transactions Michael Paquier <[email protected]>
2026-02-24 13:55 ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-02-26 03:33 ` Michael Paquier <[email protected]>
1 sibling, 0 replies; 22+ messages in thread
From: Michael Paquier @ 2026-02-26 03:33 UTC (permalink / raw)
To: Bertrand Drouvot <[email protected]>; +Cc: Sami Imseih <[email protected]>; Fujii Masao <[email protected]>; [email protected]; Zsolt Parragi <[email protected]>
On Tue, Feb 24, 2026 at 01:55:48PM +0000, Bertrand Drouvot wrote:
> On Tue, Feb 24, 2026 at 08:48:31AM +0900, Michael Paquier wrote:
>> I am concerned about the three additional points/requirements:
>> 1) The need for all processes who want to flush non-transactional
>> stats to set up timeouts, unconditionally, which is what the patch
>> shows with the new InitializeTimeouts() calls added for example for
>> auxiliary processes. This forces the use of SIGALRM in these
>> processes,
>
> Right but they all already call pqsignal(SIGALRM, SIG_IGN), so I'm not sure
> I get the point.
This design requires enabling a new signal with a signal handler in a
lot of processes that did not do that. Enabling timeouts in a bunch
of new processes, while claiming it is fine to do, sounds like
something we should rather be careful about. Are you sure, for example,
that some of the checkpointer code would not buzz on that?
At the end, this approach seems too heavy-handed to me. I am also not
entirely convinced that enforcing that unconditionally is the right
thing to do in some cases, either. For the bgwriter, as one example,
we have a WAL report call happening in its main loop, which is quite
good in terms of information frequency obtained.
Wouldn't some case-by-case strategic "anytime" API flush calls make
more sense for some subsystems, rather than relying on a timeout? It
seems rather easy to misuse this design in some bgworker contexts, and
SIGALRM could also be used for a different purpose, which we would
block entirely in future versions if we move on with this design.
--
Michael
Thread overview: 22+ messages
2026-02-17 19:18 Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
2026-02-18 05:40 ` Bertrand Drouvot <[email protected]>
2026-02-18 11:37 ` Jakub Wartak <[email protected]>
2026-02-19 07:06 ` Bertrand Drouvot <[email protected]>
2026-02-19 03:58 ` Michael Paquier <[email protected]>
2026-02-19 08:01 ` Bertrand Drouvot <[email protected]>
2026-02-19 22:08 ` Sami Imseih <[email protected]>
2026-02-20 15:55 ` Bertrand Drouvot <[email protected]>
2026-02-23 02:12 ` Sami Imseih <[email protected]>
2026-02-23 08:14 ` Bertrand Drouvot <[email protected]>
2026-02-23 23:47 ` Sami Imseih <[email protected]>
2026-02-24 01:56 ` Michael Paquier <[email protected]>
2026-02-24 12:01 ` Bertrand Drouvot <[email protected]>
2026-03-16 06:26 ` Michael Paquier <[email protected]>
2026-03-16 09:20 ` Bertrand Drouvot <[email protected]>
2026-03-16 10:22 ` Michael Paquier <[email protected]>
2026-03-16 23:42 ` Sami Imseih <[email protected]>
2026-03-25 03:16 ` Michael Paquier <[email protected]>
2026-02-23 23:48 ` Michael Paquier <[email protected]>
2026-02-24 13:55 ` Bertrand Drouvot <[email protected]>
2026-02-24 16:32 ` Sami Imseih <[email protected]>
2026-02-26 03:33 ` Michael Paquier <[email protected]>