Re: query_id: jumble names of temp tables for better pg_stat_statement UX

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Lukas Fittl <[email protected]>
To: Alexander Kukushkin <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Sami Imseih <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Christoph Berg <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: ma lz <[email protected]>
Subject: Re: query_id: jumble names of temp tables for better pg_stat_statement UX
Date: Wed, 16 Jul 2025 22:06:04 -0700
Message-ID: <CAP53PkyHVm6boP5=PahN1Xq7+D+aUZy+zg4QhXXpTCecsJnm6w@mail.gmail.com> (raw)
In-Reply-To: <CAFh8B=m5BPeo7vTYDSw2KV=CmjeATWGw-CHCZmJ+3ZKm7kVM1w@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<CAFh8B==vog16UGpigF5jukg0yZW+MHsvvKG0QBs7gV6cHgqanA@mail.gmail.com>
	<[email protected]>
	<CAFh8B=m5BPeo7vTYDSw2KV=CmjeATWGw-CHCZmJ+3ZKm7kVM1w@mail.gmail.com>

On Tue, Jul 15, 2025 at 11:20 PM Alexander Kukushkin <[email protected]>
wrote:

> However, we regularly hear from many different customers that they *don't
> control queries* sent by application or *can't modify these queries*.
> Such kinds of workloads are also not that uncommon and this change makes
> it impossible to monitor them.
>

For the workloads you are thinking of, are these "one customer per schema"
multi-tenant workloads, or something else?

I mentioned this earlier in the discussion (when supporting the change that
was done), but the main challenge I've seen is that for "one customer per
schema" workloads, pg_stat_statements just doesn't work today, unless you
have only a handful of customers on a server.

Once you have anything close to 100 or more customer schemas on a server,
the churn on the entries makes pg_stat_statements unusable (even with a
high max), especially with the current way the query text file works, since
you can't reliably read from pg_stat_statements anymore without incurring a
read from a 100MB+ query text file.

So I agree this change reduces the visibility into which of the schemas had
a slow query, but it at least allows reliably using pg_stat_statements to
narrow down which query / part of an application is problematic. To get
specifics on the schema, one could then use other means (e.g.
log_min_duration_statement, auto_explain, etc) to get exact details,
grepping the logfile for the query ID retrieved from pg_stat_statements.

Thanks,
Lukas

-- 
Lukas Fittl

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: query_id: jumble names of temp tables for better pg_stat_statement UX
  In-Reply-To: <CAP53PkyHVm6boP5=PahN1Xq7+D+aUZy+zg4QhXXpTCecsJnm6w@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox