public inbox for [email protected]
help / color / mirror / Atom feedFrom: 신성준 <[email protected]>
To: [email protected]
Cc: Kirk Wolak <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Andreas Karlsson <[email protected]>
Cc: Nikolay Samokhvalov <[email protected]>
Subject: Re: Add wait events for server logging destination writes
Date: Sun, 31 May 2026 19:42:41 +0900
Message-ID: <CACdN0M4ENFV0Hg4zbBkP+ScRHdeN9NdYx9QV79h5BM_eN9YxHA@mail.gmail.com> (raw)
In-Reply-To: <CACdN0M78U+GvpqA7oey-GA7fFSYM636aDp6H9FVvCztv9zXxSA@mail.gmail.com>
References: <CACdN0M78U+GvpqA7oey-GA7fFSYM636aDp6H9FVvCztv9zXxSA@mail.gmail.com>
Hi,
cfbot caught a build failure on v1, in the SanityCheck task on Linux
and Windows: elog.c uses pgstat_report_wait_start()/end() and the
WAIT_EVENT_* constants but didn't include utils/wait_event.h. It only
built here because of an accidental transitive include on my machine;
on the CI images the declarations weren't visible.
v2 fixes that by adding the missing #include "utils/wait_event.h" to
elog.c, folded into 0001 so that patch builds on its own. No other
changes; the wait events and the reported write paths are the same as
in v1.
v2-0001 adds the two events and covers the write(2) paths.
v2-0002 covers the Windows WriteConsoleW() path, split out as before.
Applies cleanly on current master; full build passes locally.
Thanks,
Seongjun Shin
2026년 5월 31일 (일) 오후 5:50, 신성준 <[email protected]>님이 작성:
>
> Hi hackers,
>
> The write(2) calls that flush server log output aren't covered by wait
> events. When a backend logs something, the writes go out in:
>
> - write_pipe_chunks(): write(2) to the syslogger pipe
> - write_console(): write(2) to stderr (WriteConsoleW() on Windows)
>
> If one of those blocks -- syslogger pipe full, slow console, slow log
> device -- pg_stat_activity just shows wait_event = NULL until it
> returns. Since NULL usually reads as "on CPU", a backend stuck writing
> logs looks like it's doing work, so logging-related stalls are easy to
> miss.
>
> Attached is a short series that adds two WaitEventIO events and reports
> them around those writes:
>
> IO / SysloggerWrite - write(2) to the syslogger pipe
> IO / StderrWrite - write(2) to stderr, and WriteConsoleW()
>
> 0001 adds the events and covers the write(2) paths. 0002 does the
> Windows WriteConsoleW() path, split out since it's platform-specific.
>
> It only wraps the leaf write call and uses the existing
> pgstat_report_wait_start()/end() helpers, so it stays allocation-free
> and safe to call from inside the error-reporting path.
>
> I did a quick before/after to make sure the events show up: 8 backends
> each emitting large RAISE LOG lines, sampling wait_event from
> pg_stat_activity every 50 ms for 20 s.
>
> - logging_collector = on (syslogger pipe):
> master: NULL 100.0% (2184/2184)
> patched: IO/SysloggerWrite 99.1% (2204/2224), NULL 0.9%
>
> - logging_collector = off (stderr):
> master: NULL 100.0% (2144/2144)
> patched: IO/StderrWrite 90.7% (1952/2152), NULL 9.3%
>
> On master that wait time is just invisible; with the patch it lands on
> the new events. I can send the scripts and raw samples if anyone wants
> to reproduce it.
>
> Applies on current master. A couple of things I'm unsure about and
> would appreciate input on: whether the event names fit the surrounding
> conventions, and whether splitting the Windows path into its own patch
> is the right call.
>
> Thanks,
> Seongjun Shin
Attachments:
[application/octet-stream] v2-0001-Add-wait-events-for-server-logging-destination-wr.patch (3.8K, 2-v2-0001-Add-wait-events-for-server-logging-destination-wr.patch)
download | inline diff:
From e800c1de7de40a6de42d23742e899223567935ef Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:45:23 +0900
Subject: [PATCH v2 1/2] Add wait events for server logging destination writes
When a backend writes to the syslogger pipe in write_pipe_chunks() or
to stderr in write_console(), the underlying write(2) can block once
the pipe buffer fills up or the output device is slow. These blocking
syscalls were not instrumented, so pg_stat_activity reported
wait_event IS NULL during that time. Many monitoring tools interpret
NULL as on-CPU work, which made heavy-logging stalls hard to
attribute.
Add two new WaitEventIO events and report them around the relevant
write(2) calls:
IO / SysloggerWrite - write(2) to the syslogger pipe inside
write_pipe_chunks().
IO / StderrWrite - write(2) to stderr inside write_console().
The instrumentation is limited to the leaf write call. It uses only
the existing pgstat_report_wait_start()/end() inline helpers, which
are allocation-free and safe to call before MyProc is set up, so this
remains safe to invoke from within error reporting paths.
---
src/backend/utils/activity/wait_event_names.txt | 2 ++
src/backend/utils/error/elog.c | 7 +++++++
2 files changed, 9 insertions(+)
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5537a2d2530..ce33807c3fe 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -253,6 +253,8 @@ SLRU_WRITE "Waiting for a write of an SLRU page."
SNAPBUILD_READ "Waiting for a read of a serialized historical catalog snapshot."
SNAPBUILD_SYNC "Waiting for a serialized historical catalog snapshot to reach durable storage."
SNAPBUILD_WRITE "Waiting for a write of a serialized historical catalog snapshot."
+STDERR_WRITE "Waiting for a write to the server's standard error stream."
+SYSLOGGER_WRITE "Waiting for a write to the syslogger pipe."
TIMELINE_HISTORY_FILE_SYNC "Waiting for a timeline history file received via streaming replication to reach durable storage."
TIMELINE_HISTORY_FILE_WRITE "Waiting for a write of a timeline history file received via streaming replication."
TIMELINE_HISTORY_READ "Waiting for a read of a timeline history file."
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index aa530d3685e..d4c41fa1263 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -84,6 +84,7 @@
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/varlena.h"
+#include "utils/wait_event.h"
/* In this module, access gettext() via err_gettext() */
@@ -2662,7 +2663,9 @@ write_console(const char *line, int len)
* We ignore any error from write() here. We have no useful way to report
* it ... certainly whining on stderr isn't likely to be productive.
*/
+ pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
rc = write(fileno(stderr), line, len);
+ pgstat_report_wait_end();
(void) rc;
}
@@ -3503,7 +3506,9 @@ write_pipe_chunks(char *data, int len, int dest)
/* no need to set PIPE_PROTO_IS_LAST yet */
p.proto.len = PIPE_MAX_PAYLOAD;
memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD);
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
rc = write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD);
+ pgstat_report_wait_end();
(void) rc;
data += PIPE_MAX_PAYLOAD;
len -= PIPE_MAX_PAYLOAD;
@@ -3513,7 +3518,9 @@ write_pipe_chunks(char *data, int len, int dest)
p.proto.flags |= PIPE_PROTO_IS_LAST;
p.proto.len = len;
memcpy(p.proto.data, data, len);
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
rc = write(fd, &p, PIPE_HEADER_SIZE + len);
+ pgstat_report_wait_end();
(void) rc;
}
--
2.50.1 (Apple Git-155)
[application/octet-stream] v2-0002-Report-StderrWrite-wait-event-around-WriteConsole.patch (1.3K, 3-v2-0002-Report-StderrWrite-wait-event-around-WriteConsole.patch)
download | inline diff:
From 821352a11872392547c0ac6790878ef02d0284b4 Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:52:11 +0900
Subject: [PATCH v2 2/2] Report StderrWrite wait event around WriteConsoleW()
on Windows
On Windows, write_console() emits log messages to the console with
WriteConsoleW() rather than write(2). Like the write(2) path, this
call can block on a slow console, leaving wait_event IS NULL in
pg_stat_activity.
Wrap WriteConsoleW() with the StderrWrite wait event introduced in
the previous patch, so the Windows console path is instrumented
consistently with the stderr write(2) path.
---
src/backend/utils/error/elog.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d4c41fa1263..9ab75b99f5c 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -2635,11 +2635,14 @@ write_console(const char *line, int len)
DWORD written;
stdHandle = GetStdHandle(STD_ERROR_HANDLE);
+ pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
if (WriteConsoleW(stdHandle, utf16, utf16len, &written, NULL))
{
+ pgstat_report_wait_end();
pfree(utf16);
return;
}
+ pgstat_report_wait_end();
/*
* In case WriteConsoleW() failed, fall back to writing the
--
2.50.1 (Apple Git-155)
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Add wait events for server logging destination writes
In-Reply-To: <CACdN0M4ENFV0Hg4zbBkP+ScRHdeN9NdYx9QV79h5BM_eN9YxHA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox