public inbox for [email protected]  
From: Sami Imseih <[email protected]>
To: Bertrand Drouvot <[email protected]>
Cc: [email protected]
Subject: Re: Flush some statistics within running transactions
Date: Thu, 15 Jan 2026 11:25:18 -0600
Message-ID: <CAA5RZ0t6j0VYuUpxZ8JLq-ERoUriZ0rK=+8PCUtjRirmSmCx7A@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <aWTVEycKj7Qh/[email protected]>
	<CAA5RZ0vgN8nf7yPrk_uNt-ABFtRnRuz7-9CM=qSzUSwj=EEUKw@mail.gmail.com>
	<[email protected]>

> > > The 1 second flush interval is currently hardcoded but we could imagine increase
> > > it or make it configurable.
> >
> > Someone may want to turn this off as well. I think a GUC will be needed.
>
> I gave this more thoughts and I wonder if this should be configurable at all.
> I mean, we don't do it for PGSTAT_MIN_INTERVAL, PGSTAT_MAX_INTERVAL and
> PGSTAT_IDLE_INTERVAL. We could imagine make it configurable if it produces
> noticeable performance impact but that's not what I observed.

Is there a reason we need a new constant (PGSTAT_ANYTIME_FLUSH_INTERVAL)
for anytime flushes and can't rely on the existing PGSTAT_MIN_INTERVAL?

Also, how did you benchmark? I am less concerned about long-running
transactions and background processes, and more about short, high-concurrency
transactions seeing additional overhead from the extra flushing. Is the
latter a concern?

> > > stats: numscans, tuples_returned, tuples_fetched, blocks_fetched,
> > > blocks_hit
> >
> > I’m concerned that fields being temporarily out of sync might impact monitoring
> > calculations, if the formula is dealing with fields that have
> > different flush strategies.
>
> That's a good point. Maybe we should document the fields flush strategy?

Yeah, we will need to document this.

> > That said, minor discrepancies are usually tolerable for monitoring
> > data analysis.
> >
> > For the numscans, should we not also update the scan timestamp?
>
> The problem is that we could not call GetCurrentTransactionStopTimestamp(), so
> we would need to call GetCurrentTimestamp() instead. I'm not sure that calling
> GetCurrentTimestamp() every second would be a real issue though, and if it is
> maybe we could increase this 1s value.

> That said I agree that having seq_scan being updated and not last_seq_scan is not
> that great.

With v3, I checked by running seq scans in a long-running transaction,
and I observed both of these values being updated at the same time. I think
this is OK.

# pgstat_relation_flush_anytime_cb
```
tabentry->numscans += lstats->counts.numscans;
if (lstats->counts.numscans)
{
    TimestampTz t = GetCurrentTimestamp();

    if (t > tabentry->lastscan)
        tabentry->lastscan = t;
}
```
and

# pgstat_relation_flush_cb
```
if (lstats->counts.numscans)
{
    TimestampTz t = GetCurrentTransactionStopTimestamp();

    if (t > tabentry->lastscan)
        tabentry->lastscan = t;
}
```

--
Sami Imseih
Amazon Web Services (AWS)





