Flush some statistics within running transactions

public inbox for [email protected]  

* Flush some statistics within running transactions
@ 2026-01-12 11:03 Bertrand Drouvot <[email protected]>
  2026-01-15 03:54 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Bertrand Drouvot @ 2026-01-12 11:03 UTC (permalink / raw)
  To: [email protected]

Hi hackers,

Long running transactions can accumulate significant statistics (WAL, IO, ...)
that remain unflushed until the transaction ends. This delays visibility of
resource usage in monitoring views like pg_stat_io and pg_stat_wal.

This patch series introduces the ability to $SUBJECT (suggested in [1]) to:

- improve monitoring of long running transactions
- avoid missing places where we should flush statistics (like the one fixed in
039549d70f6)

The patch series is made of 3 sub-patches:

0001: Add pgstat_report_anytime_stat() for periodic stats flushing

It introduces pgstat_report_anytime_stat(), which flushes non-transactional
statistics even inside active transactions. A new timeout handler fires every
second to call this function, so these stats become visible without waiting
for transaction completion.

Implementation details:

- Add PgStat_FlushBehavior enum to classify stats kinds:
  * FLUSH_ANYTIME: Stats that can always be flushed (WAL, IO, ...)
  * FLUSH_AT_TXN_BOUNDARY: Stats requiring transaction boundaries

- Modify pgstat_flush_pending_entries() and pgstat_flush_fixed_stats() to accept
a boolean anytime_only parameter:
   * When false: flushes all stats (existing behavior)
   * When true: flushes only FLUSH_ANYTIME stats and skips FLUSH_AT_TXN_BOUNDARY
     stats

- Register ANYTIME_STATS_UPDATE_TIMEOUT that fires every 1 second, calling
pgstat_report_anytime_stat(false)
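
The classification above can be pictured with a minimal standalone sketch. It
is not the patch's code: only PgStat_FlushBehavior, FLUSH_ANYTIME,
FLUSH_AT_TXN_BOUNDARY and the anytime_only parameter come from this mail, while
DemoKind and flush_pending_demo() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enum and value names as described in this mail. */
typedef enum PgStat_FlushBehavior
{
    FLUSH_ANYTIME,          /* may be flushed inside a running transaction */
    FLUSH_AT_TXN_BOUNDARY   /* must wait for the transaction boundary */
} PgStat_FlushBehavior;

/* Hypothetical stand-in for a stats kind with backend-local pending counts. */
typedef struct DemoKind
{
    const char *name;
    PgStat_FlushBehavior flush_behavior;
    long        pending;    /* accumulated locally, not yet visible */
    long        shared;     /* what a monitoring view would report */
} DemoKind;

/*
 * Move pending counts into the shared totals.
 *   anytime_only == false: flush every kind (the existing behavior).
 *   anytime_only == true:  flush FLUSH_ANYTIME kinds only.
 * Returns the number of kinds that still hold pending data.
 */
size_t
flush_pending_demo(DemoKind *kinds, size_t nkinds, bool anytime_only)
{
    size_t      still_pending = 0;

    assert(kinds != NULL || nkinds == 0);

    for (size_t i = 0; i < nkinds; i++)
    {
        DemoKind   *k = &kinds[i];

        if (anytime_only && k->flush_behavior != FLUSH_ANYTIME)
        {
            /* Skipped: stays pending until the transaction boundary. */
            if (k->pending != 0)
                still_pending++;
            continue;
        }

        k->shared += k->pending;
        k->pending = 0;
    }

    return still_pending;
}
```

Calling flush_pending_demo() with anytime_only set to true corresponds to the
periodic call, and with false to the flush at the transaction boundary.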

Remarks:

- The force parameter in pgstat_report_anytime_stat() is currently unused (always
called with force=false) but reserved for future use cases requiring immediate flushing.

The 1 second flush interval is currently hardcoded, but we could imagine
increasing it or making it configurable. I ran some benchmarks and did not
observe any noticeable performance regression, even with a large number of
pending entries.
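
One way to picture the periodic flush is the hedged standalone sketch below. It
assumes the handler only records that a flush is due and that the flush itself
runs at the next safe point; whether the patch defers the work this way is not
stated in this mail. All names (anytime_flush_pending, demo_timeout_handler(),
demo_report_anytime_stat(), demo_process_pending_flush()) are hypothetical, and
the flush is reduced to a counter.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flag: set when the periodic timeout has fired. */
volatile sig_atomic_t anytime_flush_pending = 0;

/* Hypothetical counter standing in for the real flush work. */
int         demo_flush_calls = 0;

/* Timeout handler: do as little as possible, just note that a flush is due. */
void
demo_timeout_handler(void)
{
    anytime_flush_pending = 1;
}

/* Stand-in for pgstat_report_anytime_stat(force); force is unused here too. */
void
demo_report_anytime_stat(bool force)
{
    (void) force;
    demo_flush_calls++;
}

/*
 * Called at a safe point in the main loop: run the flush if one is due.
 * Returns true if a flush was performed.
 */
bool
demo_process_pending_flush(void)
{
    if (!anytime_flush_pending)
        return false;

    /* Clear the flag first so a timeout firing later is not lost. */
    anytime_flush_pending = 0;
    demo_report_anytime_stat(false);
    assert(demo_flush_calls > 0);
    return true;
}
```

With this shape, several timeouts firing before the next safe point collapse
into a single flush, which bounds the extra work to at most one flush per
interval.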

0002: Remove useless calls to flush some stats

Now that some stats can be flushed outside of transaction boundaries, remove
useless calls to flush some stats. Those calls were in place because
before 0001 stats were flushed only at transaction boundaries.

Remarks:

- it reverts 039549d70f6 (it just keeps its tests)
- it can't be done for checkpointer and bgworker for example because they don't
have a flush callback to call
- it can't be done for auxiliary processes (walsummarizer for example) because
they currently do not register the new timeout handler
- we may want to improve the current behavior to "fix" the 2 above

0003: Add FLUSH_MIXED support and implement it for RELATION stats

This patch extends the non-transactional stats infrastructure to support
statistics kinds with mixed transaction behavior: some fields are transactional
(e.g., tuple inserts/updates/deletes) while others are non-transactional (e.g.,
sequential scans, blocks read, ...).

It introduces FLUSH_MIXED as a third flush behavior type, alongside FLUSH_ANYTIME
and FLUSH_AT_TXN_BOUNDARY. For FLUSH_MIXED kinds, a new flush_anytime_cb callback
enables partial flushing of only the non-transactional fields during running
transactions.

Some tests are also added.

Implementation details:

- Add FLUSH_MIXED to PgStat_FlushBehavior enum
- Add flush_anytime_cb to PgStat_KindInfo for partial flushing callback
- Update pgstat_flush_pending_entries() to call flush_anytime_cb for
  FLUSH_MIXED entries when in anytime_only mode
- Keep FLUSH_MIXED entries in the pending list after partial flush, as
  transactional fields still need to be flushed at transaction boundary
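
The handling of FLUSH_MIXED entries described above can be sketched as follows.
This is a standalone illustration under stated assumptions, not the patch's
implementation: the enum values and the flush_anytime_cb idea come from this
mail, while DemoEntry, its fields and the demo_* functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum PgStat_FlushBehavior
{
    FLUSH_ANYTIME,
    FLUSH_AT_TXN_BOUNDARY,
    FLUSH_MIXED
} PgStat_FlushBehavior;

/* Hypothetical pending entry holding both kinds of counters. */
typedef struct DemoEntry
{
    PgStat_FlushBehavior behavior;
    long        nontxn_pending; /* e.g. scans, blocks read */
    long        txn_pending;    /* e.g. tuples inserted */
    long        shared_total;   /* what monitoring would see */
    bool        in_pending_list;
} DemoEntry;

/* Partial flush: only the non-transactional part becomes visible. */
void
demo_flush_anytime_cb(DemoEntry *e)
{
    e->shared_total += e->nontxn_pending;
    e->nontxn_pending = 0;
}

/* Full flush, as done at the transaction boundary. */
void
demo_flush_cb(DemoEntry *e)
{
    e->shared_total += e->nontxn_pending + e->txn_pending;
    e->nontxn_pending = 0;
    e->txn_pending = 0;
}

void
demo_flush_pending_entries(DemoEntry *entries, size_t n, bool anytime_only)
{
    for (size_t i = 0; i < n; i++)
    {
        DemoEntry  *e = &entries[i];

        if (!e->in_pending_list)
            continue;

        if (!anytime_only || e->behavior == FLUSH_ANYTIME)
        {
            /* An anytime kind must not carry transactional data. */
            assert(e->behavior != FLUSH_ANYTIME || e->txn_pending == 0);
            demo_flush_cb(e);
            e->in_pending_list = false;
        }
        else if (e->behavior == FLUSH_MIXED)
        {
            /* Partial flush; the entry stays in the pending list. */
            demo_flush_anytime_cb(e);
        }
        /* FLUSH_AT_TXN_BOUNDARY entries are left untouched. */
    }
}
```

The point of keeping the FLUSH_MIXED entry pending is that its transactional
counters are only flushed by the later call with anytime_only set to false.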

RELATION stats are making use of FLUSH_MIXED:

- Change RELATION from FLUSH_AT_TXN_BOUNDARY to FLUSH_MIXED
- Implement pgstat_relation_flush_anytime_cb() to flush only read-related
  stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
  blocks_hit
- Clear these fields after flushing to prevent double counting when
  pgstat_relation_flush_cb() runs at transaction commit
- Transactional stats (tuples_inserted, tuples_updated, tuples_deleted,
  live_tuples, dead_tuples) remain pending until transaction boundary
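
Why the read-related fields must be cleared after the partial flush can be
shown with a small standalone sketch. The field names follow the list above
(live_tuples and dead_tuples are left out for brevity); DemoTableCounts and the
two demo_* functions are hypothetical and only mimic the two callbacks.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of per-table counters. */
typedef struct DemoTableCounts
{
    /* read-related, flushed at any time */
    long        numscans;
    long        tuples_returned;
    long        tuples_fetched;
    long        blocks_fetched;
    long        blocks_hit;
    /* transactional, flushed at the transaction boundary only */
    long        tuples_inserted;
    long        tuples_updated;
    long        tuples_deleted;
} DemoTableCounts;

/* Partial flush: add the read-related fields, then clear them locally. */
void
demo_relation_flush_anytime(DemoTableCounts *shared, DemoTableCounts *local)
{
    assert(shared != local);

    shared->numscans += local->numscans;
    shared->tuples_returned += local->tuples_returned;
    shared->tuples_fetched += local->tuples_fetched;
    shared->blocks_fetched += local->blocks_fetched;
    shared->blocks_hit += local->blocks_hit;

    /* Without this reset, the full flush below would count them twice. */
    local->numscans = 0;
    local->tuples_returned = 0;
    local->tuples_fetched = 0;
    local->blocks_fetched = 0;
    local->blocks_hit = 0;
}

/* Full flush at the transaction boundary: add everything that is left. */
void
demo_relation_flush(DemoTableCounts *shared, DemoTableCounts *local)
{
    assert(shared != local);

    shared->numscans += local->numscans;
    shared->tuples_returned += local->tuples_returned;
    shared->tuples_fetched += local->tuples_fetched;
    shared->blocks_fetched += local->blocks_fetched;
    shared->blocks_hit += local->blocks_hit;
    shared->tuples_inserted += local->tuples_inserted;
    shared->tuples_updated += local->tuples_updated;
    shared->tuples_deleted += local->tuples_deleted;

    memset(local, 0, sizeof(*local));
}
```

A scan that happens between the partial flush and the commit is still counted
once, because it accumulates into the freshly cleared local field.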

Remark:

We could also imagine adding a new flush_anytime_static_cb() callback for
future FLUSH_MIXED fixed amount stats.

[1]: https://postgr.es/m/erpzwxoptqhuptdrtehqydzjapvroumkhh7lc6poclbhe7jk7l%40l3yfsq5q4pw7

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

* Re: Flush some statistics within running transactions
  2026-01-12 11:03 Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-01-15 03:54 ` Sami Imseih <[email protected]>
  2026-01-15 12:18   ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Sami Imseih @ 2026-01-15 03:54 UTC (permalink / raw)
  To: Bertrand Drouvot <[email protected]>; +Cc: [email protected]

Hi,

Thanks for these patches!

I took a quick look at the patches and I have some general comments.

> Long running transactions can accumulate significant statistics (WAL, IO, ...)
> that remain unflushed until the transaction ends. This delays visibility of
> resource usage in monitoring views like pg_stat_io and pg_stat_wal.

+1. I do think this is a good idea. Long-running transactions cause accumulated
stats to appear as spikes in monitoring tools rather than as gradual activity.
This would help level out, though not eliminate, those artificial spikes.

> The 1 second flush interval is currently hardcoded but we could imagine increase
> it or make it configurable.

Someone may want to turn this off as well. I think a GUC will be needed.

> RELATION stats are making use of FLUSH_MIXED:

> stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
> blocks_hit

I’m concerned that fields being temporarily out of sync might impact monitoring
calculations, if the formula is dealing with fields that have different flush
strategies. That said, minor discrepancies are usually tolerable for monitoring
data analysis.

For the numscans, should we not also update the scan timestamp?

--
Sami Imseih
Amazon Web Services (AWS)

* Re: Flush some statistics within running transactions
  2026-01-12 11:03 Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
  2026-01-15 03:54 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
@ 2026-01-15 12:18   ` Bertrand Drouvot <[email protected]>
  2026-01-15 17:25     ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Bertrand Drouvot @ 2026-01-15 12:18 UTC (permalink / raw)
  To: Sami Imseih <[email protected]>; +Cc: [email protected]

Hi,

On Wed, Jan 14, 2026 at 09:54:17PM -0600, Sami Imseih wrote:
> I took a quick look at the patches and I have some general comments.

Thanks!

> 
> > Long running transactions can accumulate significant statistics (WAL, IO, ...)
> > that remain unflushed until the transaction ends. This delays visibility of
> > resource usage in monitoring views like pg_stat_io and pg_stat_wal.
> 
> +1. I do think this is a good idea. Long-running transactions cause accumulated
> stats to appear as spikes in monitoring tools rather than as gradual activity.
> This would help level out, though not eliminate, those artificial spikes.

Yeah.

> > The 1 second flush interval is currently hardcoded but we could imagine increase
> > it or make it configurable.
> 
> Someone may want to turn this off as well. I think a GUC will be needed.

I gave this more thought and I wonder if this should be configurable at all.
I mean, we don't do it for PGSTAT_MIN_INTERVAL, PGSTAT_MAX_INTERVAL and
PGSTAT_IDLE_INTERVAL. We could imagine making it configurable if it produced a
noticeable performance impact, but that's not what I observed.

> > RELATION stats are making use of FLUSH_MIXED:
> 
> > stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
> > blocks_hit
> 
> I’m concerned that fields being temporarily out of sync might impact monitoring
> calculations, if the formula is dealing with fields that have
> different flush strategies.

That's a good point. Maybe we should document each field's flush strategy?

> That said, minor discrepancies are usually tolerable for monitoring
> data analysis.
> 
> For the numscans, should we not also update the scan timestamp?

The problem is that we could not call GetCurrentTransactionStopTimestamp(), so 
we would need to call GetCurrentTimestamp() instead. I'm not sure that calling
GetCurrentTimestamp() every second would be a real issue though, and if it is
maybe we could increase this 1s value.

That said, I agree that having seq_scan updated but not last_seq_scan is not
that great.

Maybe we should keep this in mind and see what to do depending on where this
thread is going (I mean if the current proposed design has to be changed).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

* Re: Flush some statistics within running transactions
  2026-01-12 11:03 Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
  2026-01-15 03:54 ` Re: Flush some statistics within running transactions Sami Imseih <[email protected]>
  2026-01-15 12:18   ` Re: Flush some statistics within running transactions Bertrand Drouvot <[email protected]>
@ 2026-01-15 17:25     ` Sami Imseih <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Sami Imseih @ 2026-01-15 17:25 UTC (permalink / raw)
  To: Bertrand Drouvot <[email protected]>; +Cc: [email protected]

> > > The 1 second flush interval is currently hardcoded but we could imagine increase
> > > it or make it configurable.
> >
> > Someone may want to turn this off as well. I think a GUC will be needed.
>
> I gave this more thoughts and I wonder if this should be configurable at all.
> I mean, we don't do it for PGSTAT_MIN_INTERVAL, PGSTAT_MAX_INTERVAL and
> PGSTAT_IDLE_INTERVAL. We could imagine make it configurable if it produces
> noticeable performance impact but that's not what I observed.

Is there a reason we need a new constant (PGSTAT_ANYTIME_FLUSH_INTERVAL)
for anytime flushes and can't rely on the existing PGSTAT_MIN_INTERVAL?

Also, how did you benchmark? I am less concerned about long-running
transactions and background processes, and more about short, high-concurrency
transactions seeing additional overhead due to the additional flushing. Is the
latter a concern?

> > > stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
> > > blocks_hit
> >
> > I’m concerned that fields being temporarily out of sync might impact monitoring
> > calculations, if the formula is dealing with fields that have
> > different flush strategies.
>
> That's a good point. Maybe we should document the fields flush strategy?

Yeah, we will need to document this.

> > That said, minor discrepancies are usually tolerable for monitoring
> > data analysis.
> >
> > For the numscans, should we not also update the scan timestamp?
>
> The problem is that we could not call GetCurrentTransactionStopTimestamp(), so
> we would need to call GetCurrentTimestamp() instead. I'm not sure that calling
> GetCurrentTimestamp() every second would be a real issue though, and if it is
> maybe we could increase this 1s value.

> That said I agree that having seq_scan being updated and not last_seq_scan is not
> that great.

With v3, I checked by running seq scans in a long-running transaction,
and I observed both of these values being updated at the same time. I think
this is OK.

# pgstat_relation_flush_anytime_cb
```
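tabentry->numscans += lstats->counts.numscans;
if (lstats->counts.numscans)
{
    /* Mid-transaction flush: use the current time for lastscan. */
    TimestampTz t = GetCurrentTimestamp();

    /* Only ever move lastscan forward. */
    if (t > tabentry->lastscan)
        tabentry->lastscan = t;
}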
```
and

# pgstat_relation_flush_cb
```
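if (lstats->counts.numscans)
{
    /* Transaction-boundary flush: use the transaction stop time. */
    TimestampTz t = GetCurrentTransactionStopTimestamp();

    /* Only ever move lastscan forward. */
    if (t > tabentry->lastscan)
        tabentry->lastscan = t;
}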
```

--
Sami Imseih
Amazon Web Services (AWS)