public inbox for [email protected]  
help / color / mirror / Atom feed
From: 신성준 <[email protected]>
To: [email protected]
Cc: Kirk Wolak <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Andreas Karlsson <[email protected]>
Cc: Nikolay Samokhvalov <[email protected]>
Subject: Re: Add wait events for server logging destination writes
Date: Sun, 31 May 2026 19:42:41 +0900
Message-ID: <CACdN0M4ENFV0Hg4zbBkP+ScRHdeN9NdYx9QV79h5BM_eN9YxHA@mail.gmail.com> (raw)
In-Reply-To: <CACdN0M78U+GvpqA7oey-GA7fFSYM636aDp6H9FVvCztv9zXxSA@mail.gmail.com>
References: <CACdN0M78U+GvpqA7oey-GA7fFSYM636aDp6H9FVvCztv9zXxSA@mail.gmail.com>

Hi,

cfbot caught a build failure on v1, in the SanityCheck task on Linux
and Windows: elog.c uses pgstat_report_wait_start()/end() and the
WAIT_EVENT_* constants but didn't include utils/wait_event.h. It only
built here because of an accidental transitive include on my machine;
on the CI images the declarations weren't visible.

v2 fixes that by adding the missing #include "utils/wait_event.h" to
elog.c, folded into 0001 so that patch builds on its own. No other
changes; the wait events and the reported write paths are the same as
in v1.

v2-0001 adds the two events and covers the write(2) paths.
v2-0002 covers the Windows WriteConsoleW() path, split out as before.

Applies cleanly on current master; full build passes locally.

Thanks,
Seongjun Shin

2026년 5월 31일 (일) 오후 5:50, 신성준 <[email protected]>님이 작성:
>
> Hi hackers,
>
> The write(2) calls that flush server log output aren't covered by wait
> events. When a backend logs something, the writes go out in:
>
>   - write_pipe_chunks(): write(2) to the syslogger pipe
>   - write_console(): write(2) to stderr (WriteConsoleW() on Windows)
>
> If one of those blocks -- syslogger pipe full, slow console, slow log
> device -- pg_stat_activity just shows wait_event = NULL until it
> returns. Since NULL usually reads as "on CPU", a backend stuck writing
> logs looks like it's doing work, so logging-related stalls are easy to
> miss.
>
> Attached is a short series that adds two WaitEventIO events and reports
> them around those writes:
>
>   IO / SysloggerWrite - write(2) to the syslogger pipe
>   IO / StderrWrite - write(2) to stderr, and WriteConsoleW()
>
> 0001 adds the events and covers the write(2) paths. 0002 does the
> Windows WriteConsoleW() path, split out since it's platform-specific.
>
> It only wraps the leaf write call and uses the existing
> pgstat_report_wait_start()/end() helpers, so it stays allocation-free
> and safe to call from inside the error-reporting path.
>
> I did a quick before/after to make sure the events show up: 8 backends
> each emitting large RAISE LOG lines, sampling wait_event from
> pg_stat_activity every 50 ms for 20 s.
>
>   - logging_collector = on (syslogger pipe):
>     master:  NULL                100.0%  (2184/2184)
>     patched: IO/SysloggerWrite     99.1%  (2204/2224), NULL 0.9%
>
>   - logging_collector = off (stderr):
>     master:  NULL                100.0%  (2144/2144)
>     patched: IO/StderrWrite        90.7%  (1952/2152), NULL 9.3%
>
> On master that wait time is just invisible; with the patch it lands on
> the new events. I can send the scripts and raw samples if anyone wants
> to reproduce it.
>
> Applies on current master. A couple of things I'm unsure about and
> would appreciate input on: whether the event names fit the surrounding
> conventions, and whether splitting the Windows path into its own patch
> is the right call.
>
> Thanks,
> Seongjun Shin


Attachments:

  [application/octet-stream] v2-0001-Add-wait-events-for-server-logging-destination-wr.patch (3.8K, 2-v2-0001-Add-wait-events-for-server-logging-destination-wr.patch)
  download | inline diff:
From e800c1de7de40a6de42d23742e899223567935ef Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:45:23 +0900
Subject: [PATCH v2 1/2] Add wait events for server logging destination writes

When a backend writes to the syslogger pipe in write_pipe_chunks() or
to stderr in write_console(), the underlying write(2) can block once
the pipe buffer fills up or the output device is slow.  These blocking
syscalls were not instrumented, so pg_stat_activity reported
wait_event IS NULL during that time.  Many monitoring tools interpret
NULL as on-CPU work, which made heavy-logging stalls hard to
attribute.

Add two new WaitEventIO events and report them around the relevant
write(2) calls:

  IO / SysloggerWrite - write(2) to the syslogger pipe inside
                        write_pipe_chunks().
  IO / StderrWrite    - write(2) to stderr inside write_console().

The instrumentation is limited to the leaf write call.  It uses only
the existing pgstat_report_wait_start()/end() inline helpers, which
are allocation-free and safe to call before MyProc is set up, so this
remains safe to invoke from within error reporting paths.
---
 src/backend/utils/activity/wait_event_names.txt | 2 ++
 src/backend/utils/error/elog.c                  | 7 +++++++
 2 files changed, 9 insertions(+)

diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5537a2d2530..ce33807c3fe 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -253,6 +253,8 @@ SLRU_WRITE	"Waiting for a write of an SLRU page."
 SNAPBUILD_READ	"Waiting for a read of a serialized historical catalog snapshot."
 SNAPBUILD_SYNC	"Waiting for a serialized historical catalog snapshot to reach durable storage."
 SNAPBUILD_WRITE	"Waiting for a write of a serialized historical catalog snapshot."
+STDERR_WRITE	"Waiting for a write to the server's standard error stream."
+SYSLOGGER_WRITE	"Waiting for a write to the syslogger pipe."
 TIMELINE_HISTORY_FILE_SYNC	"Waiting for a timeline history file received via streaming replication to reach durable storage."
 TIMELINE_HISTORY_FILE_WRITE	"Waiting for a write of a timeline history file received via streaming replication."
 TIMELINE_HISTORY_READ	"Waiting for a read of a timeline history file."
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index aa530d3685e..d4c41fa1263 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -84,6 +84,7 @@
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/varlena.h"
+#include "utils/wait_event.h"
 
 
 /* In this module, access gettext() via err_gettext() */
@@ -2662,7 +2663,9 @@ write_console(const char *line, int len)
 	 * We ignore any error from write() here.  We have no useful way to report
 	 * it ... certainly whining on stderr isn't likely to be productive.
 	 */
+	pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
 	rc = write(fileno(stderr), line, len);
+	pgstat_report_wait_end();
 	(void) rc;
 }
 
@@ -3503,7 +3506,9 @@ write_pipe_chunks(char *data, int len, int dest)
 		/* no need to set PIPE_PROTO_IS_LAST yet */
 		p.proto.len = PIPE_MAX_PAYLOAD;
 		memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD);
+		pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
 		rc = write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD);
+		pgstat_report_wait_end();
 		(void) rc;
 		data += PIPE_MAX_PAYLOAD;
 		len -= PIPE_MAX_PAYLOAD;
@@ -3513,7 +3518,9 @@ write_pipe_chunks(char *data, int len, int dest)
 	p.proto.flags |= PIPE_PROTO_IS_LAST;
 	p.proto.len = len;
 	memcpy(p.proto.data, data, len);
+	pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
 	rc = write(fd, &p, PIPE_HEADER_SIZE + len);
+	pgstat_report_wait_end();
 	(void) rc;
 }
 
-- 
2.50.1 (Apple Git-155)



  [application/octet-stream] v2-0002-Report-StderrWrite-wait-event-around-WriteConsole.patch (1.3K, 3-v2-0002-Report-StderrWrite-wait-event-around-WriteConsole.patch)
  download | inline diff:
From 821352a11872392547c0ac6790878ef02d0284b4 Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:52:11 +0900
Subject: [PATCH v2 2/2] Report StderrWrite wait event around WriteConsoleW()
 on Windows

On Windows, write_console() emits log messages to the console with
WriteConsoleW() rather than write(2).  Like the write(2) path, this
call can block on a slow console, leaving wait_event IS NULL in
pg_stat_activity.

Wrap WriteConsoleW() with the StderrWrite wait event introduced in
the previous patch, so the Windows console path is instrumented
consistently with the stderr write(2) path.
---
 src/backend/utils/error/elog.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index d4c41fa1263..9ab75b99f5c 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -2635,11 +2635,14 @@ write_console(const char *line, int len)
 			DWORD		written;
 
 			stdHandle = GetStdHandle(STD_ERROR_HANDLE);
+			pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
 			if (WriteConsoleW(stdHandle, utf16, utf16len, &written, NULL))
 			{
+				pgstat_report_wait_end();
 				pfree(utf16);
 				return;
 			}
+			pgstat_report_wait_end();
 
 			/*
 			 * In case WriteConsoleW() failed, fall back to writing the
-- 
2.50.1 (Apple Git-155)



view thread (9+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Add wait events for server logging destination writes
  In-Reply-To: <CACdN0M4ENFV0Hg4zbBkP+ScRHdeN9NdYx9QV79h5BM_eN9YxHA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox