public inbox for [email protected]
help / color / mirror / Atom feedFrom: 신성준 <[email protected]>
To: [email protected]
Cc: [email protected]
Cc: Kyotaro Horiguchi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Subject: Re: Add wait events for server logging destination writes
Date: Sun, 7 Jun 2026 01:25:25 +0900
Message-ID: <CACdN0M6WCA32MKK30-TS3CgGRTRKZpXRB9o3oHt2LGrB_CHDmA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CACdN0M78U+GvpqA7oey-GA7fFSYM636aDp6H9FVvCztv9zXxSA@mail.gmail.com>
<[email protected]>
Hi,
Thanks Kirk, glad it's useful.
Kyotaro Horiguchi wrote:
> Should we also consider instrumenting ReportEventW()/ReportEventA()?
> They seem to be another Windows-specific logging output path.
>
> Also, if the intention is to cover all places where logging output
> can block, I wonder whether the syslog() calls should be covered as
> well.
Good points -- both are blocking output paths and there's no real
reason to leave them out, so v3 instruments them rather than excluding
them with a comment. The intent is exactly to cover the places where
logging output can block, so this makes the series consistent.
v3 adds two more WaitEventIO events:
IO / SyslogWrite - syslog(3) in write_syslog()
IO / EventlogWrite - ReportEventW()/ReportEventA() in write_eventlog()
Same approach as before: the wait is reported only around the leaf
call, using the existing pgstat_report_wait_start()/end() helpers, so
it stays allocation-free and safe on the error-reporting path, and the
series still touches just elog.c and wait_event_names.txt.
This also matches Michael's point on v2 -- each event covers a routine
rather than a single call site, so SyslogWrite wraps the syslog(3)
calls in write_syslog() and EventlogWrite wraps both ReportEvent
variants in write_eventlog(), the same way SysloggerWrite already
covers the two writes in write_pipe_chunks().
As before, 0001 is the portable part and 0002 is the Windows part
(WriteConsoleW plus the new EventlogWrite).
One caveat: EventlogWrite is Windows-only, so I couldn't get a runtime
before/after for it here -- I've only confirmed it builds (cfbot's
Windows task should cover that). The other events still show up in the
sampling I posted earlier. If someone on Windows can exercise the
event-log path I'd appreciate a confirmation.
Applies cleanly on current master; full build passes locally on both
Autoconf and Meson, with no new warnings.
Thanks,
Seongjun Shin
2026년 6월 1일 (월) 오후 2:49, Kyotaro Horiguchi <[email protected]>님이 작성:
>
> Hello.
>
> At Sun, 31 May 2026 17:50:08 +0900, 신성준 <[email protected]> wrote in
> > Attached is a short series that adds two WaitEventIO events and reports
> > them around those writes:
> >
> > IO / SysloggerWrite - write(2) to the syslogger pipe
> > IO / StderrWrite - write(2) to stderr, and WriteConsoleW()
> >
> > 0001 adds the events and covers the write(2) paths. 0002 does the
> > Windows WriteConsoleW() path, split out since it's platform-specific.
>
> Should we also consider instrumenting ReportEventW()/ReportEventA()?
> They seem to be another Windows-specific logging output path.
>
> Also, if the intention is to cover all places where logging output can
> block, I wonder whether the syslog() calls should be covered as
> well. If they are intentionally excluded, perhaps a short comment
> explaining the rationale would be useful.
>
> Regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
Attachments:
[application/octet-stream] v3-0001-Add-wait-events-for-server-logging-destination-wr.patch (4.6K, 2-v3-0001-Add-wait-events-for-server-logging-destination-wr.patch)
download | inline diff:
From f04f12afc75e692516d3d5042ec93b4833dbfe61 Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:45:23 +0900
Subject: [PATCH v3 1/2] Add wait events for server logging destination writes
When a backend writes server log output, the underlying call can
block: write(2) to the syslogger pipe or to stderr once the pipe
buffer fills up or the output device is slow, and syslog(3) when the
system logger is slow. These blocking calls were not instrumented, so
pg_stat_activity reported wait_event IS NULL during that time. Many
monitoring tools interpret NULL as on-CPU work, which made
heavy-logging stalls hard to attribute.
Add three new WaitEventIO events and report them around the relevant
calls:
IO / SysloggerWrite - write(2) to the syslogger pipe inside
write_pipe_chunks().
IO / StderrWrite - write(2) to stderr inside write_console().
IO / SyslogWrite - syslog(3) inside write_syslog().
The instrumentation is limited to the leaf write/syslog call. It uses
only the existing pgstat_report_wait_start()/end() inline helpers,
which are allocation-free and safe to call before MyProc is set up, so
this remains safe to invoke from within error reporting paths.
---
src/backend/utils/activity/wait_event_names.txt | 3 +++
src/backend/utils/error/elog.c | 11 +++++++++++
2 files changed, 14 insertions(+)
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 5537a2d2530..fa528df03f1 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -253,6 +253,9 @@ SLRU_WRITE "Waiting for a write of an SLRU page."
SNAPBUILD_READ "Waiting for a read of a serialized historical catalog snapshot."
SNAPBUILD_SYNC "Waiting for a serialized historical catalog snapshot to reach durable storage."
SNAPBUILD_WRITE "Waiting for a write of a serialized historical catalog snapshot."
+STDERR_WRITE "Waiting for a write to the server's standard error stream."
+SYSLOGGER_WRITE "Waiting for a write to the syslogger pipe."
+SYSLOG_WRITE "Waiting for a write to the system logger (syslog)."
TIMELINE_HISTORY_FILE_SYNC "Waiting for a timeline history file received via streaming replication to reach durable storage."
TIMELINE_HISTORY_FILE_WRITE "Waiting for a write of a timeline history file received via streaming replication."
TIMELINE_HISTORY_READ "Waiting for a read of a timeline history file."
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index aa530d3685e..22d2cf8e79f 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -84,6 +84,7 @@
#include "utils/memutils.h"
#include "utils/ps_status.h"
#include "utils/varlena.h"
+#include "utils/wait_event.h"
/* In this module, access gettext() via err_gettext() */
@@ -2466,10 +2467,12 @@ write_syslog(int level, const char *line)
chunk_nr++;
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOG_WRITE);
if (syslog_sequence_numbers)
syslog(level, "[%lu-%d] %s", seq, chunk_nr, buf);
else
syslog(level, "[%d] %s", chunk_nr, buf);
+ pgstat_report_wait_end();
line += buflen;
len -= buflen;
@@ -2478,10 +2481,12 @@ write_syslog(int level, const char *line)
else
{
/* message short enough */
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOG_WRITE);
if (syslog_sequence_numbers)
syslog(level, "[%lu] %s", seq, line);
else
syslog(level, "%s", line);
+ pgstat_report_wait_end();
}
}
#endif /* HAVE_SYSLOG */
@@ -2662,7 +2667,9 @@ write_console(const char *line, int len)
* We ignore any error from write() here. We have no useful way to report
* it ... certainly whining on stderr isn't likely to be productive.
*/
+ pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
rc = write(fileno(stderr), line, len);
+ pgstat_report_wait_end();
(void) rc;
}
@@ -3503,7 +3510,9 @@ write_pipe_chunks(char *data, int len, int dest)
/* no need to set PIPE_PROTO_IS_LAST yet */
p.proto.len = PIPE_MAX_PAYLOAD;
memcpy(p.proto.data, data, PIPE_MAX_PAYLOAD);
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
rc = write(fd, &p, PIPE_HEADER_SIZE + PIPE_MAX_PAYLOAD);
+ pgstat_report_wait_end();
(void) rc;
data += PIPE_MAX_PAYLOAD;
len -= PIPE_MAX_PAYLOAD;
@@ -3513,7 +3522,9 @@ write_pipe_chunks(char *data, int len, int dest)
p.proto.flags |= PIPE_PROTO_IS_LAST;
p.proto.len = len;
memcpy(p.proto.data, data, len);
+ pgstat_report_wait_start(WAIT_EVENT_SYSLOGGER_WRITE);
rc = write(fd, &p, PIPE_HEADER_SIZE + len);
+ pgstat_report_wait_end();
(void) rc;
}
--
2.50.1 (Apple Git-155)
[application/octet-stream] v3-0002-Add-wait-events-for-Windows-specific-logging-outp.patch (3.5K, 3-v3-0002-Add-wait-events-for-Windows-specific-logging-outp.patch)
download | inline diff:
From 414d91c9ad35754a6dbdeb7d459157aa53ca4f6a Mon Sep 17 00:00:00 2001
From: Seongjun Shin <[email protected]>
Date: Fri, 29 May 2026 14:52:11 +0900
Subject: [PATCH v3 2/2] Add wait events for Windows-specific logging output
paths
On Windows, log output goes through paths other than write(2) that can
also block: write_console() emits to the console with WriteConsoleW(),
and write_eventlog() writes to the Windows event log with
ReportEventW()/ReportEventA(). Like the write(2) path, these calls can
block and previously left wait_event IS NULL in pg_stat_activity.
Wrap WriteConsoleW() with the StderrWrite event introduced in the
previous patch, and add a new EventlogWrite event for the event log
path:
IO / EventlogWrite - ReportEventW()/ReportEventA() inside
write_eventlog().
This instruments the Windows logging paths consistently with the
write(2) and syslog(3) paths. It is split out from the previous patch
since it is platform-specific.
---
src/backend/utils/activity/wait_event_names.txt | 1 +
src/backend/utils/error/elog.c | 7 +++++++
2 files changed, 8 insertions(+)
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index fa528df03f1..59ec023dbf1 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -223,6 +223,7 @@ DATA_FILE_TRUNCATE "Waiting for a relation data file to be truncated."
DATA_FILE_WRITE "Waiting for a write to a relation data file."
DSM_ALLOCATE "Waiting for a dynamic shared memory segment to be allocated."
DSM_FILL_ZERO_WRITE "Waiting to fill a dynamic shared memory backing file with zeroes."
+EVENTLOG_WRITE "Waiting for a write to the Windows event log."
LOCK_FILE_ADDTODATADIR_READ "Waiting for a read while adding a line to the data directory lock file."
LOCK_FILE_ADDTODATADIR_SYNC "Waiting for data to reach durable storage while adding a line to the data directory lock file."
LOCK_FILE_ADDTODATADIR_WRITE "Waiting for a write while adding a line to the data directory lock file."
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 22d2cf8e79f..0679cabd245 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -2574,6 +2574,7 @@ write_eventlog(int level, const char *line, int len)
utf16 = pgwin32_message_to_UTF16(line, len, NULL);
if (utf16)
{
+ pgstat_report_wait_start(WAIT_EVENT_EVENTLOG_WRITE);
ReportEventW(evtHandle,
eventlevel,
0,
@@ -2583,12 +2584,14 @@ write_eventlog(int level, const char *line, int len)
0,
(LPCWSTR *) &utf16,
NULL);
+ pgstat_report_wait_end();
/* XXX Try ReportEventA() when ReportEventW() fails? */
pfree(utf16);
return;
}
}
+ pgstat_report_wait_start(WAIT_EVENT_EVENTLOG_WRITE);
ReportEventA(evtHandle,
eventlevel,
0,
@@ -2598,6 +2601,7 @@ write_eventlog(int level, const char *line, int len)
0,
&line,
NULL);
+ pgstat_report_wait_end();
}
#endif /* WIN32 */
@@ -2639,11 +2643,14 @@ write_console(const char *line, int len)
DWORD written;
stdHandle = GetStdHandle(STD_ERROR_HANDLE);
+ pgstat_report_wait_start(WAIT_EVENT_STDERR_WRITE);
if (WriteConsoleW(stdHandle, utf16, utf16len, &written, NULL))
{
+ pgstat_report_wait_end();
pfree(utf16);
return;
}
+ pgstat_report_wait_end();
/*
* In case WriteConsoleW() failed, fall back to writing the
--
2.50.1 (Apple Git-155)
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Add wait events for server logging destination writes
In-Reply-To: <CACdN0M6WCA32MKK30-TS3CgGRTRKZpXRB9o3oHt2LGrB_CHDmA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox