Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1qFRAJ-0000Tf-76 for pgsql-hackers@arkaria.postgresql.org; Sat, 01 Jul 2023 03:20:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1qFRAH-0007HD-To for pgsql-hackers@arkaria.postgresql.org; Sat, 01 Jul 2023 03:20:49 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1qFRAH-0007H4-JJ for pgsql-hackers@lists.postgresql.org; Sat, 01 Jul 2023 03:20:49 +0000 Received: from mail-yw1-x1132.google.com ([2607:f8b0:4864:20::1132]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1qFRAE-001Ufw-AF for pgsql-hackers@postgresql.org; Sat, 01 Jul 2023 03:20:48 +0000 Received: by mail-yw1-x1132.google.com with SMTP id 00721157ae682-577497ec6c6so13787227b3.2 for ; Fri, 30 Jun 2023 20:20:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688181645; x=1690773645; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=/ZaOJm9gJ1UUK8l0NvUPgVXNsckcTJSv26Ey0TnSWYc=; b=XmBOyc5RPOVczlMlWuM2AiQdANaJCXU1gO1j3E/EwYSPNNDAyXp/7zPzojaiHdnzDi 32x+TkDZ+fIsNd2IdXwYC03cy7kwi9c2Ze/8qtDpk6hJL231UOrTwb0DfqRvQUfnHeQ2 Itn8DUtAIKhAkNxQAbl9Uw4JQpVFRTR2BBihkHRwx8E/i+S7xEBTvs9LgJG5aa4pfgPw wtDnBerrHLIocraUYHEJ3S78yagdJzgzuegyB3PI3LuBhRgaIytFQ/GzUGrtFrVxDad0 9o+4RtnnHLZwLmR36tUAyeYhqQ0NdnXf3enZTcYjOeLog7PkFqa8brbqMc3mzbuiwLxC 1h5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688181645; x=1690773645; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=/ZaOJm9gJ1UUK8l0NvUPgVXNsckcTJSv26Ey0TnSWYc=; b=MrJ/cya7u7A4cdgmLCkJYI3aR9Y2XslhX7wAf7742Dx8aJT+ou4dmSwfvehsTGFa+7 FqbRl+Z/Lref9+x3/NBssXeJuJ3WlR+sY8oOJUdOq3hvvC8nh/2kOVxXnK7I336S4hy1 JE0y+Q+eLEg2imPFFOUTvjFo3wXhcyEoSHx/Ux1575mpeUgmt/lyukIATcATDCmnOyqq n9Im+serHENV/v0HZzZnp2eFzA3Q/nKoz5jwVQ5CZneYjP0DZvLwUcx2odZgjEdL3Vbk hSglFb/Pc0s8pbS4ZzBs0BYkeQ8sRzyjtJmMTYOz6fXrcKQVedwSqGTU85e4P9PhcX44 Ve4A== X-Gm-Message-State: ABy/qLatFiQW7F/ikgCDVkCUdAnVly8vjCZsgd9hirK5FOVTL2nJ1l78 OEx2PrxrEwX7rAVENIIuD/MdTlGltfgf/uHUFgDC/Csn X-Google-Smtp-Source: APBJJlEoXQO4KAYjgX6mFv/YPCzu0JXX08NycuUg5kbOD94HuVObnIA18WZQC07B8H+8kgI5b/R9cTaPYuDQxdG5GOA= X-Received: by 2002:a0d:d74e:0:b0:578:5e60:dcc9 with SMTP id z75-20020a0dd74e000000b005785e60dcc9mr1478344ywd.10.1688181644821; Fri, 30 Jun 2023 20:20:44 -0700 (PDT) MIME-Version: 1.0 References: <89742024-d51a-c66b-90b9-67f837072cd2@joeconway.com> In-Reply-To: <89742024-d51a-c66b-90b9-67f837072cd2@joeconway.com> From: Pavel Stehule Date: Sat, 1 Jul 2023 05:20:08 +0200 Message-ID: Subject: Re: RFC: pg_stat_logmsg To: Joe Conway Cc: PostgreSQL-development Content-Type: multipart/alternative; boundary="000000000000418ec305ff646b00" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000418ec305ff646b00 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi so 1. 7. 2023 v 1:57 odes=C3=ADlatel Joe Conway napsal= : > Greetings, > > Attached please find a tarball (rather than a patch) for a proposed new > contrib extension, pg_stat_logmsg. > > The basic idea is to mirror how pg_stat_statements works, except the > logged messages keyed by filename, lineno, and elevel are saved with a > aggregate count. The format string is displayed (similar to a query > jumble) for context, along with function name and sqlerrcode. > > I threw this together rather quickly over the past couple of days > between meetings, so not claiming that it is committable (and lacks > documentation and regression tests as well), but I would love to get > feedback on: > > 1/ the general concept > 2/ the pg_stat_statement-like implementation > 3/ contrib vs core vs external project > > Some samples and data: > > `make installcheck` with the extension loaded: > 8<------------------ > # All 215 tests passed. > > > real 2m24.854s > user 0m0.086s > sys 0m0.283s > 8<------------------ > > `make installcheck` without the extension loaded: > 8<------------------ > > # All 215 tests passed. > > real 2m26.765s > user 0m0.076s > sys 0m0.293s > 8<------------------ > > Sample output after running make installcheck a couple times (plus a few > manually generated ERRORs): > > 8<------------------ > test=3D# select sum(count) from pg_stat_logmsg where elevel > 20; > sum > ------- > 10554 > (1 row) > > test=3D# \x > Expanded display is on. > test=3D# select * from pg_stat_logmsg where elevel > 20 order by count de= sc; > -[ RECORD 1 ]------------------------------- > filename | aclchk.c > lineno | 2811 > elevel | 21 > funcname | aclcheck_error > sqlerrcode | 42501 > message | permission denied for schema %s > count | 578 > -[ RECORD 2 ]------------------------------- > filename | scan.l > lineno | 1241 > elevel | 21 > funcname | scanner_yyerror > sqlerrcode | 42601 > message | %s at or near "%s" > count | 265 > ... > > test=3D# select * from pg_stat_logmsg where elevel > 20 and sqlerrcode = =3D > 'XX000'; > -[ RECORD 1 ]--------------------------------------- > filename | tid.c > lineno | 352 > elevel | 21 > funcname | currtid_for_view > sqlerrcode | XX000 > message | ctid isn't of type TID > count | 2 > -[ RECORD 2 ]--------------------------------------- > filename | pg_locale.c > lineno | 2493 > elevel | 21 > funcname | pg_ucol_open > sqlerrcode | XX000 > message | could not open collator for locale "%s": %s > count | 2 > ... > > 8<------------------ > > Part of the thinking is that people with fleets of postgres instances > can use this to scan for various errors that they care about. > Additionally it would be useful to look for "should not happen" errors. > > I will register this in the July CF and will appreciate feedback. > This can be a very interesting feature. I like it. Regards Pavel > Thanks! > > -- > Joe Conway > PostgreSQL Contributors Team > RDS Open Source Databases > Amazon Web Services: https://aws.amazon.com --000000000000418ec305ff646b00 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi

so 1. 7. 2023 v=C2=A01:57 odes=C3=ADlatel Joe= Conway <mail@joeconway.com>= ; napsal:
Greeti= ngs,

Attached please find a tarball (rather than a patch) for a proposed new contrib extension, pg_stat_logmsg.

The basic idea is to mirror how pg_stat_statements works, except the
logged messages keyed by filename, lineno, and elevel are saved with a
aggregate count. The format string is displayed (similar to a query
jumble) for context, along with function name and sqlerrcode.

I threw this together rather quickly over the past couple of days
between meetings, so not claiming that it is committable (and lacks
documentation and regression tests as well), but I would love to get
feedback on:

1/ the general concept
2/ the pg_stat_statement-like implementation
3/ contrib vs core vs external project

Some samples and data:

`make installcheck` with the extension loaded:
8<------------------
# All 215 tests passed.


real=C2=A0 =C2=A0 2m24.854s
user=C2=A0 =C2=A0 0m0.086s
sys=C2=A0 =C2=A0 =C2=A00m0.283s
8<------------------

`make installcheck` without the extension loaded:
8<------------------

# All 215 tests passed.

real=C2=A0 =C2=A0 2m26.765s
user=C2=A0 =C2=A0 0m0.076s
sys=C2=A0 =C2=A0 =C2=A00m0.293s
8<------------------

Sample output after running make installcheck a couple times (plus a few manually generated ERRORs):

8<------------------
test=3D# select sum(count) from pg_stat_logmsg where elevel > 20;
=C2=A0 =C2=A0sum
-------
=C2=A0 10554
(1 row)

test=3D# \x
Expanded display is on.
test=3D# select * from pg_stat_logmsg where elevel > 20 order by count d= esc;
-[ RECORD 1 ]-------------------------------
filename=C2=A0 =C2=A0| aclchk.c
lineno=C2=A0 =C2=A0 =C2=A0| 2811
elevel=C2=A0 =C2=A0 =C2=A0| 21
funcname=C2=A0 =C2=A0| aclcheck_error
sqlerrcode | 42501
message=C2=A0 =C2=A0 | permission denied for schema %s
count=C2=A0 =C2=A0 =C2=A0 | 578
-[ RECORD 2 ]-------------------------------
filename=C2=A0 =C2=A0| scan.l
lineno=C2=A0 =C2=A0 =C2=A0| 1241
elevel=C2=A0 =C2=A0 =C2=A0| 21
funcname=C2=A0 =C2=A0| scanner_yyerror
sqlerrcode | 42601
message=C2=A0 =C2=A0 | %s at or near "%s"
count=C2=A0 =C2=A0 =C2=A0 | 265
...

test=3D# select * from pg_stat_logmsg where elevel > 20 and sqlerrcode = =3D
'XX000';
-[ RECORD 1 ]---------------------------------------
filename=C2=A0 =C2=A0| tid.c
lineno=C2=A0 =C2=A0 =C2=A0| 352
elevel=C2=A0 =C2=A0 =C2=A0| 21
funcname=C2=A0 =C2=A0| currtid_for_view
sqlerrcode | XX000
message=C2=A0 =C2=A0 | ctid isn't of type TID
count=C2=A0 =C2=A0 =C2=A0 | 2
-[ RECORD 2 ]---------------------------------------
filename=C2=A0 =C2=A0| pg_locale.c
lineno=C2=A0 =C2=A0 =C2=A0| 2493
elevel=C2=A0 =C2=A0 =C2=A0| 21
funcname=C2=A0 =C2=A0| pg_ucol_open
sqlerrcode | XX000
message=C2=A0 =C2=A0 | could not open collator for locale "%s": %= s
count=C2=A0 =C2=A0 =C2=A0 | 2
...

8<------------------

Part of the thinking is that people with fleets of postgres instances
can use this to scan for various errors that they care about.
Additionally it would be useful to look for "should not happen" e= rrors.

I will register this in the July CF and will appreciate feedback.

This can be a very interesting feature. I like i= t.

Regards

Pavel


Thanks!

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
--000000000000418ec305ff646b00--