From: Shinya Kato <[email protected]>
To: PostgreSQL Hackers <[email protected]>
Subject: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: Tue, 24 Feb 2026 15:53:54 +0900
Message-ID: <CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com>

Hi hackers,

I have noticed that the pg_stat_replication.*_lag columns sometimes
show NULL while I insert one record per second as a health check. This
happens when the startup process replays the WAL before the
walreceiver sends its flush notification to the walsender.

Here is the sequence that triggers the issue (see normal.svg and
error.svg for diagrams of the normal and problematic cases):

1. The walreceiver receives, writes, and flushes WAL, then wakes the
startup process via WakeupRecovery().

2. The startup process replays all available WAL quickly, then calls
WalRcvForceReply() to set force_reply = true and wakes the
walreceiver.

3. The walreceiver sends a flush notification to the walsender
(XLogWalRcvSendReply() in XLogWalRcvFlush()). Since the startup process
has already replayed the WAL by this point, this message reports the
advanced applyPtr, which equals sentPtr. The walsender processes
this message, consuming the LagTracker samples and setting
fullyAppliedLastTime = true.

4. In the next loop iteration, the walreceiver sees force_reply = true
and sends another reply with the same positions. The walsender sees
applyPtr == sentPtr for the second consecutive time and sets
clearLagTimes = true. Since the LagTracker samples were already
consumed by step 3, all lag values are -1. With clearLagTimes = true,
these -1 values are written to walsnd->*Lag, causing
pg_stat_replication to show NULL.
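
To make steps 3 and 4 concrete, here is a minimal toy model of the
walsender-side decision. It is only a sketch: every name in it
(ToySender, toy_lag_read, toy_process_reply, toy_burst_shown_lag) is
hypothetical, a single lag value stands in for the write/flush/apply
trio, and the numbers are arbitrary. It is not the walsender.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name below is hypothetical, not PostgreSQL code. */
typedef struct ToySender
{
    long pending_lag;               /* unread lag sample, -1 if none */
    bool fully_applied_last_time;   /* previous reply had apply == sent */
    long shown_lag;                 /* -1 is what the view reports as NULL */
} ToySender;

/* Reading a sample consumes it, so a second read returns -1. */
static long
toy_lag_read(ToySender *s)
{
    long lag = s->pending_lag;

    s->pending_lag = -1;
    return lag;
}

/* Models the unpatched decision described in steps 3 and 4. */
static void
toy_process_reply(ToySender *s, long apply_ptr, long sent_ptr)
{
    long lag;
    bool clear_lag_times = false;

    assert(apply_ptr <= sent_ptr);
    lag = toy_lag_read(s);

    if (apply_ptr == sent_ptr)
    {
        if (s->fully_applied_last_time)
            clear_lag_times = true;
        s->fully_applied_last_time = true;
    }
    else
        s->fully_applied_last_time = false;

    /* A -1 is stored only when clear_lag_times is set. */
    if (lag != -1 || clear_lag_times)
        s->shown_lag = lag;
}

/*
 * Flush notification followed by a forced reply with the same positions.
 * Returns the lag that would be displayed afterwards.
 */
static long
toy_burst_shown_lag(void)
{
    ToySender s = { .pending_lag = 5,
                    .fully_applied_last_time = false,
                    .shown_lag = -1 };

    toy_process_reply(&s, 100, 100);    /* step 3: consumes the sample */
    toy_process_reply(&s, 100, 100);    /* step 4: forced reply */
    return s.shown_lag;
}
```

In this model the first reply stores the sample and the second one
overwrites it with -1, which corresponds to the NULL seen in the view.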

The comment in ProcessStandbyReplyMessage() says:

     * If the standby reports that it has fully replayed the WAL in two
     * consecutive reply messages, then the second such message must result
     * from wal_receiver_status_interval expiring on the standby.

But as shown above, the second message can also come from
WalRcvForceReply(), violating this assumption.

The attached patch fixes this by adding a check that all lag values
are -1 to the clearLagTimes condition. This ensures that clearLagTimes
only triggers when there are truly no new lag samples in two
consecutive messages (i.e., the system is genuinely idle), and not
when the samples were simply consumed by a preceding message in a
burst of replies.
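
As a rough illustration of the effect of the extra check, the toy
model below applies the patched condition to three lag values. The
names (ToyPatchedSender, toy_patched_reply,
toy_patched_apply_lag_after) and the sample values are hypothetical,
and the sketch assumes that the else branch resetting
fullyAppliedLastTime is left unchanged by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the actual walsender.c code. */
typedef struct ToyPatchedSender
{
    long pending[3];                /* unread write/flush/apply samples */
    bool fully_applied_last_time;
    long shown[3];                  /* -1 is reported as NULL */
} ToyPatchedSender;

static void
toy_patched_reply(ToyPatchedSender *s, long apply_ptr, long sent_ptr)
{
    long lag[3];
    bool clear_lag_times = false;
    int  i;

    assert(apply_ptr <= sent_ptr);

    /* Reading the samples consumes them. */
    for (i = 0; i < 3; i++)
    {
        lag[i] = s->pending[i];
        s->pending[i] = -1;
    }

    /* Patched condition: fully applied AND no new samples. */
    if (apply_ptr == sent_ptr &&
        lag[0] == -1 && lag[1] == -1 && lag[2] == -1)
    {
        if (s->fully_applied_last_time)
            clear_lag_times = true;
        s->fully_applied_last_time = true;
    }
    else
        s->fully_applied_last_time = false;   /* assumed unchanged */

    for (i = 0; i < 3; i++)
        if (lag[i] != -1 || clear_lag_times)
            s->shown[i] = lag[i];
}

/* Apply lag displayed after `replies` identical caught-up replies. */
static long
toy_patched_apply_lag_after(int replies)
{
    ToyPatchedSender s = { {2, 3, 5}, false, {-1, -1, -1} };
    int i;

    for (i = 0; i < replies; i++)
        toy_patched_reply(&s, 100, 100);
    return s.shown[2];
}
```

In this model the forced reply that follows the flush notification no
longer clears the displayed lag. The values are cleared only once two
consecutive caught-up replies carry no samples, which is the idle case
the original comment had in mind.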

-- 
Best regards,
Shinya Kato
NTT OSS Center


Attachments:

  [application/octet-stream] v1-0001-Fix-pg_stat_replication.-_lag-showing-NULL-during.patch (3.3K, 2-v1-0001-Fix-pg_stat_replication.-_lag-showing-NULL-during.patch)
From 67eb950123b1bab1f1c3db5ba0f88ce1737b6574 Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Tue, 24 Feb 2026 15:45:04 +0900
Subject: [PATCH v1] Fix pg_stat_replication.*_lag showing NULL during active
 replication

When the startup process replays WAL quickly, the walreceiver's flush
notification and the subsequent force_reply message can both report
applyPtr == sentPtr in quick succession.  The clearLagTimes logic
assumed that two consecutive fully-applied messages meant the
wal_receiver_status_interval had expired, but this assumption is
violated when the second message comes from WalRcvForceReply().  In
that case, the LagTracker samples were already consumed by the first
message, so all lag values are -1; with clearLagTimes = true, these
-1 values were written to walsnd->*Lag, causing pg_stat_replication
to show NULL.

Fix this by also requiring that all lag values are -1 (no new
samples) in the clearLagTimes condition.  This ensures clearLagTimes
only triggers when the system is genuinely idle across two
consecutive messages, not when samples were consumed by a preceding
message in a burst of replies.

Author: Shinya Kato <[email protected]>
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/replication/walsender.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..5c7bd0a13ad 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2493,15 +2493,25 @@ ProcessStandbyReplyMessage(void)
 	applyLag = LagTrackerRead(SYNC_REP_WAIT_APPLY, applyPtr, now);
 
 	/*
-	 * If the standby reports that it has fully replayed the WAL in two
-	 * consecutive reply messages, then the second such message must result
-	 * from wal_receiver_status_interval expiring on the standby.  This is a
-	 * convenient time to forget the lag times measured when it last
-	 * wrote/flushed/applied a WAL record, to avoid displaying stale lag data
-	 * until more WAL traffic arrives.
+	 * If the standby reports that it has fully replayed the WAL and there are
+	 * no new lag samples in two consecutive reply messages, then those
+	 * messages must result from wal_receiver_status_interval expiring on the
+	 * standby.  This is a convenient time to forget the lag times measured
+	 * when it last wrote/flushed/applied a WAL record, to avoid displaying
+	 * stale lag data until more WAL traffic arrives.
+	 *
+	 * We also require that no new lag samples are available (all lag values
+	 * are -1) in both messages to avoid a race condition: when the walreceiver
+	 * sends a flush notification followed immediately by a force_reply (to
+	 * report apply progress), both messages can have applyPtr == sentPtr if
+	 * the startup process replayed the WAL quickly.  In that case, the lag
+	 * tracker samples are consumed by the first message, causing the second
+	 * to see all lags as -1.  Without the lag check, clearLagTimes would
+	 * incorrectly trigger and overwrite valid lag values with -1 (NULL).
 	 */
 	clearLagTimes = false;
-	if (applyPtr == sentPtr)
+	if (applyPtr == sentPtr &&
+		writeLag == -1 && flushLag == -1 && applyLag == -1)
 	{
 		if (fullyAppliedLastTime)
 			clearLagTimes = true;
-- 
2.47.3



  [image/svg+xml] normal.svg (112.8K, 3-normal.svg)

  [image/svg+xml] error.svg (112.8K, 4-error.svg)
