public inbox for [email protected]
help / color / mirror / Atom feedFrom: Sami Imseih <[email protected]>
To: Christoph Berg <[email protected]>
Cc: Pavlo Golub <[email protected]>
Cc: Bertrand Drouvot <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [PATCH] Add last_executed timestamp to pg_stat_statements
Date: Mon, 9 Feb 2026 10:51:30 -0600
Message-ID: <CAA5RZ0sxPWP2xm8fxhscE+cUqC2VSFi9UZ9882BdGZ0MbGQNUA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAA5RZ0vMW+0_6FYtFEaN-NAnk1sT2ucCUDJe0-xKwZriROV--A@mail.gmail.com>
<[email protected]>
<[email protected]>
> > How about
> > TimestampTz stmt_end = TimestampTzPlusMilliseconds(
> > GetCurrentStatementStartTimestamp(),
> > (int64) total_time
> > );
> > We have total_time as an argument already! No kernel calls, sweet and easy!
>
> Cool idea!
This calculation could be wrong for very common cases in extended
query protocol,
Here is a script to test with:
```
select pg_stat_statements_reset();
BEGIN;
select now() as now, clock_timestamp() as clock_timestamp,
pg_sleep($1) \bind 10 \g
\! sleep 10
SELECT now() as now, clock_timestamp() as clock_timestamp, $1 \bind 1 \g
END;
select stats_last_updated, total_exec_time, substr(query, 1, 150) as
query from pg_stat_statements;
````
With v3 applied, notice the output is calculating a stats_last_updated
that is beyond the current time
```
pg_stat_statements_reset
-------------------------------
2026-02-09 16:13:35.188849+00
(1 row)
BEGIN
now | clock_timestamp | pg_sleep
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:35.189397+00 |
(1 row)
now | clock_timestamp | ?column?
------------------------------+-------------------------------+----------
2026-02-09 16:13:35.18911+00 | 2026-02-09 16:13:55.193443+00 | 1
(1 row)
COMMIT
stats_last_updated | total_exec_time |
query
-------------------------------+-----------------+-------------------------------------------------------------------------
2026-02-09 16:13:55.19367+00 | 0.007401 | SELECT now() as
now, clock_timestamp() as clock_timestamp, $1
2026-02-09 16:13:55.193664+00 | 0.00103 | END
2026-02-09 16:13:35.189111+00 | 0.00098 | BEGIN
2026-02-09 16:13:35.188584+00 | 0.090183 | select
pg_stat_statements_reset()
2026-02-09 16:14:05.194134+00 | 10000.751122 | select now() as
now, clock_timestamp() as clock_timestamp, pg_sleep($1)
(5 rows)
```
This happens because in the case of extended query protocol,
ExecutorEnd is called
at the next query. This has been discussed in [1] [2].
So, for this to work, we will likely need to store the query start
time in the queryDesc; actually
queryDesc->totaltime, and set the query start time at ExecutorStart,
during InstrAlloc.
> > I think it's better because last_execution_start is already a known
> > timestamp in pg_stat_activity.query_start and some tool that finds a
> > long running query in pg_stat_activity, knowing the
> > query_start they could then go look it up in pg_stat_statements.
>
> That only works if a) the query was not yet overwritten in
> pg_stat_activity and b) neither in pg_stat_statements. Optimizing for
> that use case seems pretty narrow.
>
> > What I'm really getting at is separating these fields will open up
> > more use cases, IMO.
Maybe this is a bad use case. But I felt separating these 2 fields will
be more flexible.
> Generally, I think pgss should have cumulative statistics, and less
> about individual executions, so I'm not really sure what practical
> problem "last start" and "last runtime" would solve. The
> last_stats_update column we are talking about here is different in the
> sense that it's not about an individual execution, but infrastructure
> for retrieving the stats sensibly.
Sure, generally, pg_stat_statements is for cumulative stats, but we also
do have computed stats such as max/min/stddev, etc. But, it's not without
precedent that we track timestamps of the last time some operation occurred.
We do that in views that have a purpose of tracking cumulative data, because
these timestamps are useful. See pg_stat_all_tables.last_seq_scan or
last_autovacuum
as an example.
Maybe having the last runtime column is not that valuable if we can
correctly calculate
the last execution time. AlsoI will be a strong -1 calling this field
"stats_last_updated"
instead of "last_execution_time".
[1] https://www.postgresql.org/message-id/[email protected]
[2] https://www.postgresql.org/message-id/[email protected]...
--
Sami Imseih
Amazon Web Services (AWS)
view thread (8+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: [PATCH] Add last_executed timestamp to pg_stat_statements
In-Reply-To: <CAA5RZ0sxPWP2xm8fxhscE+cUqC2VSFi9UZ9882BdGZ0MbGQNUA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox