Re: Flush some statistics within running transactions


From: Sami Imseih <[email protected]>
To: Michael Paquier <[email protected]>
Cc: Bertrand Drouvot <[email protected]>
Cc: [email protected]
Cc: Zsolt Parragi <[email protected]>
Subject: Re: Flush some statistics within running transactions
Date: Wed, 21 Jan 2026 19:41:30 -0600
Message-ID: <CAA5RZ0sQxAu6bU8wTgvs+aTSvhBcziH9jCJ27aS1hzKsm2kmTQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <aWTVEycKj7Qh/[email protected]>
	<CAA5RZ0vgN8nf7yPrk_uNt-ABFtRnRuz7-9CM=qSzUSwj=EEUKw@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0t6j0VYuUpxZ8JLq-ERoUriZ0rK=+8PCUtjRirmSmCx7A@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0s9j3x5UPpgaQdDyGA=MNjVEkT73SL7LMoVEKUwiZrVqA@mail.gmail.com>
	<[email protected]>
	<CAA5RZ0s6FkEHFdgKf8J6vueZGwsH+08LvV0YPBXa4Dw_8QgtTw@mail.gmail.com>
	<[email protected]>
	<[email protected]>

> > No, 0003 also changes the flush mode for the database KIND. All the fields that
> > I mentioned are inherited from relations stats and are flushed only at transaction
> > boundaries (so they don't appear in pg_stat_database until the transaction
> > finishes). Does that make sense? (if the database kind is not switched to
> > flush any time then none would appear while the transaction is in progress, even
> > the ones inherited from relations stats).
> >
> > PFA v3, also taking care of Zsolt's comment (thanks!) done up-thread.
>
> While reading through 0001, I got to question on which properties
> and/or assumptions of a stats kind one has to rely on to decide to
> what flush_mode should be set.  To put it simpler, why don't we just
> do a periodic pgstat_report_stat(false) call that would flush all the
> stats for all stats kinds based on the new timeout registered,
> expanding a bit the flush we currently do when idle in
> ProcessInterrupts()?

There are some important cases in which we would want to
distinguish between a "transaction boundary" flush vs an
"anytime" flush.

One example is xact_commit/xact_rollback. I would want those
fields to stay in sync with tuples_inserted/updated/deleted
so that calculations such as the number of inserts per commit
remain accurate.
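
To make the skew concrete, here is a minimal sketch in C. It is not taken
from the patch: the helper names and the numbers are hypothetical, and it
only models what a reader of the counters would see if tuple counts were
flushed mid-transaction while xact_commit still waited for the
transaction boundary.

```c
#include <stdint.h>

/*
 * Hypothetical helper: tuple inserts visible to a stats reader.
 * "committed" comes from finished transactions, "in_progress" from a
 * transaction that is still open.  If tuple counters were flushed
 * "anytime", the in-progress inserts would already be visible.
 */
static int64_t
visible_tup_inserted(int64_t committed, int64_t in_progress,
                     int tuples_flushed_anytime)
{
    return tuples_flushed_anytime ? committed + in_progress : committed;
}

/* Ratio a monitoring query might compute from the visible counters. */
static double
inserts_per_commit(int64_t tup_inserted, int64_t xact_commit)
{
    if (xact_commit == 0)
        return 0.0;             /* avoid division by zero */
    return (double) tup_inserted / (double) xact_commit;
}
```

With 10 commits of 100 rows each and one open transaction that has
inserted 5000 rows so far, the ratio reads 100 when both counters are
flushed at the boundary, and 600 when only the tuple counter is flushed
early.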

Another one would be n_mod_since_analyze. That should
only be updated after a commit (and not after a rollback). Otherwise,
it may throw the autoanalyze threshold calculations way off. The same
goes for n_dead_tup and autovacuum.
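
As a rough illustration, the sketch below applies the documented
autoanalyze trigger (base threshold plus scale factor times reltuples,
with the default values of 50 and 0.1). The function and macro names are
made up for this example, and it uses double for simplicity; it is not
the autovacuum code itself.

```c
#include <stdint.h>

/* Default autovacuum_analyze_threshold and autovacuum_analyze_scale_factor. */
#define ANALYZE_BASE_THRESHOLD  50
#define ANALYZE_SCALE_FACTOR    0.1

/*
 * Hypothetical helper: returns 1 if the number of modifications since the
 * last analyze exceeds the autoanalyze threshold for a table of the given
 * size, 0 otherwise.
 */
static int
needs_autoanalyze(double reltuples, int64_t n_mod_since_analyze)
{
    double threshold = ANALYZE_BASE_THRESHOLD +
                       ANALYZE_SCALE_FACTOR * reltuples;

    return (double) n_mod_since_analyze > threshold;
}
```

For a table with 1,000,000 rows the threshold is about 100,050. If an
open transaction that has modified 200,000 rows were counted early,
needs_autoanalyze() would return 1 even if that transaction later rolls
back, whereas counting only at commit leaves the counter untouched.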

> I am also not convinced that we have to be that aggressive with these
> extra flushes.  The target is long-running analytical queries, that
> could take minutes or even hours.  Using the same value as
> PGSTAT_IDLE_INTERVAL (10s),

IIUC, PGSTAT_IDLE_INTERVAL flushes an idle backend every 10 seconds,
so this value only applies outside of a transaction.

> A 1s vs 10s report interval does not really matter for long analytical queries.

Sure, Bertrand mentioned early in the thread that the anytime flushes
could be made configurable. Perhaps that is a good idea: we could
default to something large, like 10s intervals for anytime flushes, but
allow the user to configure more frequent flushes (although I would
think that 1 sec is the minimum we should allow).
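
A minimal sketch of what I have in mind, assuming a millisecond-based
setting. The names and the clamping behavior are hypothetical and not
from the patch; a real GUC would more likely reject out-of-range values
through its min/max bounds rather than clamp them.

```c
#include <stdint.h>

/* Hypothetical bounds for an "anytime flush" interval, in milliseconds. */
#define ANYTIME_FLUSH_MIN_MS      1000      /* 1s floor */
#define ANYTIME_FLUSH_DEFAULT_MS  10000     /* 10s default */

/*
 * Resolve a user-requested interval: non-positive means "use the default",
 * anything below the floor is raised to the floor.
 */
static int
anytime_flush_interval_ms(int requested_ms)
{
    if (requested_ms <= 0)
        return ANYTIME_FLUSH_DEFAULT_MS;
    if (requested_ms < ANYTIME_FLUSH_MIN_MS)
        return ANYTIME_FLUSH_MIN_MS;
    return requested_ms;
}

/* Returns 1 once at least interval_ms has elapsed since the last flush. */
static int
anytime_flush_due(int64_t last_flush_ms, int64_t now_ms, int interval_ms)
{
    return (now_ms - last_flush_ms) >= interval_ms;
}
```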

--
Sami Imseih
Amazon Web Services (AWS)
