public inbox for [email protected]  
help / color / mirror / Atom feed
RFC: pg_stat_logmsg
3+ messages / 2 participants
[nested] [flat]

* RFC: pg_stat_logmsg
@ 2023-06-30 23:57 Joe Conway <[email protected]>
  2023-07-01 03:20 ` Re: RFC: pg_stat_logmsg Pavel Stehule <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Joe Conway @ 2023-06-30 23:57 UTC (permalink / raw)
  To: pgsql-hackers

Greetings,

Attached please find a tarball (rather than a patch) for a proposed new 
contrib extension, pg_stat_logmsg.

The basic idea is to mirror how pg_stat_statements works, except the 
logged messages keyed by filename, lineno, and elevel are saved with a 
aggregate count. The format string is displayed (similar to a query 
jumble) for context, along with function name and sqlerrcode.

I threw this together rather quickly over the past couple of days 
between meetings, so not claiming that it is committable (and lacks 
documentation and regression tests as well), but I would love to get 
feedback on:

1/ the general concept
2/ the pg_stat_statement-like implementation
3/ contrib vs core vs external project

Some samples and data:

`make installcheck` with the extension loaded:
8<------------------
# All 215 tests passed.


real    2m24.854s
user    0m0.086s
sys     0m0.283s
8<------------------

`make installcheck` without the extension loaded:
8<------------------

# All 215 tests passed.

real    2m26.765s
user    0m0.076s
sys     0m0.293s
8<------------------

Sample output after running make installcheck a couple times (plus a few 
manually generated ERRORs):

8<------------------
test=# select sum(count) from pg_stat_logmsg where elevel > 20;
   sum
-------
  10554
(1 row)

test=# \x
Expanded display is on.
test=# select * from pg_stat_logmsg where elevel > 20 order by count desc;
-[ RECORD 1 ]-------------------------------
filename   | aclchk.c
lineno     | 2811
elevel     | 21
funcname   | aclcheck_error
sqlerrcode | 42501
message    | permission denied for schema %s
count      | 578
-[ RECORD 2 ]-------------------------------
filename   | scan.l
lineno     | 1241
elevel     | 21
funcname   | scanner_yyerror
sqlerrcode | 42601
message    | %s at or near "%s"
count      | 265
...

test=# select * from pg_stat_logmsg where elevel > 20 and sqlerrcode = 
'XX000';
-[ RECORD 1 ]---------------------------------------
filename   | tid.c
lineno     | 352
elevel     | 21
funcname   | currtid_for_view
sqlerrcode | XX000
message    | ctid isn't of type TID
count      | 2
-[ RECORD 2 ]---------------------------------------
filename   | pg_locale.c
lineno     | 2493
elevel     | 21
funcname   | pg_ucol_open
sqlerrcode | XX000
message    | could not open collator for locale "%s": %s
count      | 2
...

8<------------------

Part of the thinking is that people with fleets of postgres instances 
can use this to scan for various errors that they care about. 
Additionally it would be useful to look for "should not happen" errors.

I will register this in the July CF and will appreciate feedback.

Thanks!

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

  [application/x-compressed-tar] pg_stat_logmsg-000.tgz (14.9K, 2-pg_stat_logmsg-000.tgz)
  download

^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: RFC: pg_stat_logmsg
  2023-06-30 23:57 RFC: pg_stat_logmsg Joe Conway <[email protected]>
@ 2023-07-01 03:20 ` Pavel Stehule <[email protected]>
  2023-07-01 19:52   ` Re: RFC: pg_stat_logmsg Joe Conway <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Pavel Stehule @ 2023-07-01 03:20 UTC (permalink / raw)
  To: Joe Conway <[email protected]>; +Cc: pgsql-hackers

Hi

so 1. 7. 2023 v 1:57 odesílatel Joe Conway <[email protected]> napsal:

> Greetings,
>
> Attached please find a tarball (rather than a patch) for a proposed new
> contrib extension, pg_stat_logmsg.
>
> The basic idea is to mirror how pg_stat_statements works, except the
> logged messages keyed by filename, lineno, and elevel are saved with a
> aggregate count. The format string is displayed (similar to a query
> jumble) for context, along with function name and sqlerrcode.
>
> I threw this together rather quickly over the past couple of days
> between meetings, so not claiming that it is committable (and lacks
> documentation and regression tests as well), but I would love to get
> feedback on:
>
> 1/ the general concept
> 2/ the pg_stat_statement-like implementation
> 3/ contrib vs core vs external project
>
> Some samples and data:
>
> `make installcheck` with the extension loaded:
> 8<------------------
> # All 215 tests passed.
>
>
> real    2m24.854s
> user    0m0.086s
> sys     0m0.283s
> 8<------------------
>
> `make installcheck` without the extension loaded:
> 8<------------------
>
> # All 215 tests passed.
>
> real    2m26.765s
> user    0m0.076s
> sys     0m0.293s
> 8<------------------
>
> Sample output after running make installcheck a couple times (plus a few
> manually generated ERRORs):
>
> 8<------------------
> test=# select sum(count) from pg_stat_logmsg where elevel > 20;
>    sum
> -------
>   10554
> (1 row)
>
> test=# \x
> Expanded display is on.
> test=# select * from pg_stat_logmsg where elevel > 20 order by count desc;
> -[ RECORD 1 ]-------------------------------
> filename   | aclchk.c
> lineno     | 2811
> elevel     | 21
> funcname   | aclcheck_error
> sqlerrcode | 42501
> message    | permission denied for schema %s
> count      | 578
> -[ RECORD 2 ]-------------------------------
> filename   | scan.l
> lineno     | 1241
> elevel     | 21
> funcname   | scanner_yyerror
> sqlerrcode | 42601
> message    | %s at or near "%s"
> count      | 265
> ...
>
> test=# select * from pg_stat_logmsg where elevel > 20 and sqlerrcode =
> 'XX000';
> -[ RECORD 1 ]---------------------------------------
> filename   | tid.c
> lineno     | 352
> elevel     | 21
> funcname   | currtid_for_view
> sqlerrcode | XX000
> message    | ctid isn't of type TID
> count      | 2
> -[ RECORD 2 ]---------------------------------------
> filename   | pg_locale.c
> lineno     | 2493
> elevel     | 21
> funcname   | pg_ucol_open
> sqlerrcode | XX000
> message    | could not open collator for locale "%s": %s
> count      | 2
> ...
>
> 8<------------------
>
> Part of the thinking is that people with fleets of postgres instances
> can use this to scan for various errors that they care about.
> Additionally it would be useful to look for "should not happen" errors.
>
> I will register this in the July CF and will appreciate feedback.
>

This can be a very interesting feature. I like it.

Regards

Pavel


> Thanks!
>
> --
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com


^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: RFC: pg_stat_logmsg
  2023-06-30 23:57 RFC: pg_stat_logmsg Joe Conway <[email protected]>
  2023-07-01 03:20 ` Re: RFC: pg_stat_logmsg Pavel Stehule <[email protected]>
@ 2023-07-01 19:52   ` Joe Conway <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Joe Conway @ 2023-07-01 19:52 UTC (permalink / raw)
  To: Pavel Stehule <[email protected]>; +Cc: pgsql-hackers

On 6/30/23 23:20, Pavel Stehule wrote:
> so 1. 7. 2023 v 1:57 odesílatel Joe Conway <[email protected] 
> <mailto:[email protected]>> napsal:
>     Part of the thinking is that people with fleets of postgres instances
>     can use this to scan for various errors that they care about.
>     Additionally it would be useful to look for "should not happen" errors.
> 
>     I will register this in the July CF and will appreciate feedback.
> 
> This can be a very interesting feature. I like it.

Thanks!

FWIW, I just modified it to provide the localized text of the elevel 
rather than the internal number. I also localized the message format string:

8<------------------------------
psql (16beta2)
Type "help" for help.

test=# \x
Expanded display is on.
test=# select * from pg_stat_logmsg where elevel = 'ERROR' and 
sqlerrcode = 'XX000' and count > 1;
-[ RECORD 1 ]---------------------------------------------
filename   | tablecmds.c
lineno     | 10908
elevel     | ERROR
funcname   | ATExecAlterConstraint
sqlerrcode | XX000
message    | cannot alter constraint "%s" on relation "%s"
count      | 2
-[ RECORD 2 ]---------------------------------------------
filename   | user.c
lineno     | 2130
elevel     | ERROR
funcname   | check_role_membership_authorization
sqlerrcode | XX000
message    | role "%s" cannot have explicit members
count      | 2

test=# set lc_messages ='sv_SE.UTF8';
SET
test=# select * from pg_stat_logmsg where elevel = 'FEL' and sqlerrcode 
= 'XX000' and count > 1;
-[ RECORD 1 ]---------------------------------------------
filename   | tablecmds.c
lineno     | 10908
elevel     | FEL
funcname   | ATExecAlterConstraint
sqlerrcode | XX000
message    | kan inte ändra villkoret "%s" i relation "%s"
count      | 2
-[ RECORD 2 ]---------------------------------------------
filename   | user.c
lineno     | 2130
elevel     | FEL
funcname   | check_role_membership_authorization
sqlerrcode | XX000
message    | rollen "%s" kan inte ha explicita medlemmar
count      | 2
8<------------------------------

New tarball attached.

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Attachments:

  [application/x-compressed-tar] pg_stat_logmsg-001.tgz (14.8K, 2-pg_stat_logmsg-001.tgz)
  download

^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2023-07-01 19:52 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2023-06-30 23:57 RFC: pg_stat_logmsg Joe Conway <[email protected]>
2023-07-01 03:20 ` Pavel Stehule <[email protected]>
2023-07-01 19:52   ` Joe Conway <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox