public inbox for [email protected]  
help / color / mirror / Atom feed
From: Sami Imseih <[email protected]>
To: Pavlo Golub <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH] Add last_executed timestamp to pg_stat_statements
Date: Wed, 10 Dec 2025 11:30:19 -0600
Message-ID: <CAA5RZ0uyvnNuHRj=73jZhocnLxE07oM-AeSLS=FUuuZ_hy7fKA@mail.gmail.com> (raw)
In-Reply-To: <CAK7ymc+FxoVswo1ok_xDW-xPG-ZEZ8SAqCUkJ7WF04=0aQDvVQ@mail.gmail.com>
References: <CAK7ymc+FxoVswo1ok_xDW-xPG-ZEZ8SAqCUkJ7WF04=0aQDvVQ@mail.gmail.com>

Hi,

Thanks for raising this. I did not look at the patch, but I have some high
level comments.

> I would like to propose adding a last_executed timestamptz column to
> pg_stat_statements. This column records when each tracked statement
> was most recently executed.

I do think there is value in adding a last_executed timestamp. I actually think
last_executed should be the time the query started timestamp, so we should
actually create an entry at ExecutorStart, along with calls_started and
calls_completed. This is great for tracking cancelled queries.
The issue is the extra overhead of tracking the query on
EcecutorStart, but that
should be less of an issue once we move pg_stat_statements to the cumulative
statistics system, which will be possible once we get some prerequisite
work to make this happen [0].

Another concern is the width of the current view. I think before we add
any new attribute, pg_stat_statements fields should be split.
This was discussed in [1].

> The motivation comes from real world experience with monitoring tools
> like pgwatch that poll pg_stat_statements regularly. Currently, these
> tools must fetch and store statistics for all statements, even those
> that haven't executed recently. This creates significant storage
> overhead. For a database with around 3400 statements polled every 3
> minutes, storing full query text requires roughly 2.5 MB per snapshot.
> Over two weeks, this accumulates to about 17 GB. Even without query
> text, storage reaches 10 GB.
>
> With a last_executed timestamptz, monitoring tools can simply filter
> statements by "last_executed > NOW() - polling_interval" to fetch only
> statements that have been executed since the last poll. This
> eliminates the need for complex workarounds that some tools currently
> use to identify changed statements
> (https://github.com/cybertec-postgresql/pgwatch/blob/759df3a149cbbe973165547186068aa7b5332f9d/interna...).

Can pg_stat_statements.stats_since help here?

for example "where stats_since > last_poll_timestamp" ?

The client does have to track the last_poll_timestamp in that
case.

[0] https://www.postgresql.org/message-id/flat/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail....
[1] https://www.postgresql.org/message-id/03f82e6f-66a3-4c4d-935c-ea4d93871dc1%40gmail.com

--
Sami Imseih
Amazon Web Services (AWS)





view thread (4+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: [PATCH] Add last_executed timestamp to pg_stat_statements
  In-Reply-To: <CAA5RZ0uyvnNuHRj=73jZhocnLxE07oM-AeSLS=FUuuZ_hy7fKA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox