public inbox for [email protected]
help / color / mirror / Atom feedFrom: Michael Paquier <[email protected]>
To: Tomas Vondra <[email protected]>
Cc: Joe Conway <[email protected]>
Cc: Gurjeet Singh <[email protected]>
Cc: Masahiko Sawada <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: RFC: pg_stat_logmsg
Date: Wed, 17 Jul 2024 08:08:48 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<CAD21AoAyPVYZ_6VAQCOanvHTL3CRSqTDCFrtvaYVGwOQbrnxgQ@mail.gmail.com>
<CABwTF4VzMJaTLrpYtBj4n9r73TFjffzqpPFeUmJPrkY6cmHEAg@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
On Wed, Jul 17, 2024 at 12:14:36AM +0200, Tomas Vondra wrote:
> I noticed this patch hasn't moved since September 2023, so I wonder
> what's the main blocker / what is needed to move this?
+ /* Location of permanent stats file (valid when database is shut down) */
+ #define PGLM_DUMP_FILE PGSTAT_STAT_PERMANENT_DIRECTORY
"/pg_stat_logmsg.stat
Perhaps this does not count as a valid reason, but does it really make
sense to implement things this way, knowing that this could be changed
to rely on a potential pluggable pgstats? I mean this one I've
proposed:
https://www.postgresql.org/message-id/Zmqm9j5EO0I4W8dx%40paquier.xyz
One potential implementation is to stick that to be fixed-numbered,
because there is a maximum cap to the number of entries proposed by
the patch, while keeping the whole in memory.
+ logmsg_store(ErrorData *edata, Size *logmsg_offset,
+ int *logmsg_len, int *gc_count)
The patch shares a lot of perks with pg_stat_statements that don't
scale well. I'm wondering if it is a good idea to duplicate these
properties in a second, different, module, like the external file can
could be written out on a periodic basis depending on the workload.
I am not saying that the other thread is a magic solution for
everything (not looked yet at how this would plug with the cap in
entries that pg_stat_statements wants), just one option on the table.
> As for the code, I wonder if the instability of line numbers could be a
> problem - these can change (a little bit) between minor releases, so
> after an upgrade we'll read the dump file with line numbers from the old
> release, and then start adding entries with new line numbers. Do we need
> to handle this in some way?
Indeed. Perhaps a PostgreSQL version number assigned to each entry to
know from which binary an entry comes from? This would cost a couple
of extra bytes for each entry still that would be the best information
possible to match that with a correct code tree. If it comes to that,
even getting down to a commit SHA1 could be useful to provide the
lowest level of granularity. Another thing would be to give up on the
line number, stick to the uniqueness in the stats entries with the
errcode and the file name, but that won't help for things like
tablecmds.c.
> This might be partially solved by eviction of entries from the old
> release - we apply decay, so after a while their usage will be 0. But
> what if there's no pressure for space, we'll not actually evict them.
> And it'll be confusing to have a mix of old/new line numbers.
Once we know that these stats not going to be relevant anymore as of a
minor upgrade flow, resetting them could be the move that makes the
most sense, leaving the reset to the provider doing the upgrades,
while taking a snapshot of the past data before the reset? I find the
whole problem tricky to define, TBH.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: RFC: pg_stat_logmsg
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox