From: Shinya Kato <[email protected]>
To: Fujii Masao <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: Fri, 6 Mar 2026 16:12:48 +0900
Message-ID: <CAOzEurRGiGE2Dfe+ySpb=+93ku=7ZC6RgAbHtLC6Xsq3g2XexA@mail.gmail.com> (raw)
In-Reply-To: <CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com>
References: <CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com>
	<CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com>

On Mon, Mar 2, 2026 at 11:44 PM Fujii Masao <[email protected]> wrote:
> With the patch applied, I set up a logical replication and inserted a row every
> second. Even with continuous inserts, NULL was shown in the lag columns of
> pg_stat_replication. That makes me wonder whether the patch's approach is
> sufficient to address the issue.

Thank you for the review and testing! I had only considered the issue
in the context of physical replication, but as you pointed out, my
approach is insufficient for logical replication.

> Relying solely on replies from the standby or subscriber seems a bit fragile to
> me. If the goal is to keep showing the last measured lag for some time,
> perhaps we should introduce a rate limit on when NULL is displayed in the lag
> columns?

My primary goal was to ensure that the source code comments match the
actual behavior, as the comment stating "the second such message must
result from wal_receiver_status_interval expiring on the standby" is
inaccurate. However, as you noted, the patch alone is not sufficient
to fully address the issue.

> For example, if there has been no activity (i.e., sentPtr == applyPtr and
> applyPtr has not changed since the previous cycle) for, say, 10 seconds,
> then we could allow NULL to be shown. Thought?

I considered a time-based rate limit, but it is difficult to choose an
appropriate threshold. Furthermore, the walsender has no way of
knowing the standby's or subscriber's wal_receiver_status_interval
setting.
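
For reference, the time-based alternative quoted above could be sketched
roughly as follows. This is only an illustration, not code from any patch:
IdleTimer, idle_long_enough(), and IDLE_THRESHOLD_USEC are hypothetical
names, LSNs are modeled as plain uint64_t, and the 10-second constant is
taken from the "say, 10 seconds" example, which is exactly the value that
is hard to pick.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: 10 seconds, expressed in microseconds. */
#define IDLE_THRESHOLD_USEC (10 * INT64_C(1000000))

/* Hypothetical per-connection state for the time-based approach. */
typedef struct IdleTimer
{
	uint64_t	last_apply;			/* applyPtr seen in the previous reply */
	int64_t		idle_since_usec;	/* when the current idle period began */
	bool		idle;				/* are we inside an idle period? */
} IdleTimer;

/*
 * Returns true once the connection has looked idle (apply == sent and
 * apply unchanged since the previous reply) for at least the threshold.
 */
static bool
idle_long_enough(IdleTimer *t, uint64_t sent, uint64_t apply, int64_t now_usec)
{
	bool		idle_now = (apply == sent && apply == t->last_apply);

	t->last_apply = apply;

	if (!idle_now)
	{
		/* Any activity ends the idle period. */
		t->idle = false;
		return false;
	}

	if (!t->idle)
	{
		/* First idle reply: start the clock. */
		t->idle = true;
		t->idle_since_usec = now_usec;
	}

	return (now_usec - t->idle_since_usec) >= IDLE_THRESHOLD_USEC;
}
```

With a fixed constant like this, a receiver whose status interval is
longer than the threshold would still see NULL between replies, and the
sender cannot adapt because it does not know that interval.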

The attached v2 patch takes a different approach: it additionally
requires that all reported positions (write/flush/apply) remain
unchanged from the previous reply. This directly detects a truly idle
system without relying on timeouts: if any position has advanced, new
WAL activity must have occurred, so we should not clear the lag values
even if the lag tracker is empty.
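
To make the condition concrete, here is a minimal standalone sketch of
the decision, separate from the patch itself. ReplyState and
should_clear_lag() are hypothetical names used only for illustration;
LSNs are modeled as uint64_t and "no sample" as -1, mirroring the
variables in ProcessStandbyReplyMessage().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the function's static per-connection state. */
typedef struct ReplyState
{
	bool		fully_applied_last_time;
	uint64_t	prev_write;
	uint64_t	prev_flush;
	uint64_t	prev_apply;
} ReplyState;

/*
 * Decide whether the lag columns should be cleared for this reply.
 * A lag value of -1 means the lag tracker had no new sample.
 */
static bool
should_clear_lag(ReplyState *st, uint64_t sent,
				 uint64_t write, uint64_t flush, uint64_t apply,
				 int64_t write_lag, int64_t flush_lag, int64_t apply_lag)
{
	bool		no_lag_samples = (write_lag == -1 &&
								  flush_lag == -1 &&
								  apply_lag == -1);
	bool		positions_unchanged = (write == st->prev_write &&
									   flush == st->prev_flush &&
									   apply == st->prev_apply);
	bool		clear = false;

	if (apply == sent && no_lag_samples && positions_unchanged)
	{
		/* Clear only on the second consecutive idle-looking reply. */
		if (st->fully_applied_last_time)
			clear = true;
		st->fully_applied_last_time = true;
	}
	else
		st->fully_applied_last_time = false;

	/* Remember positions for comparison with the next reply. */
	st->prev_write = write;
	st->prev_flush = flush;
	st->prev_apply = apply;

	return clear;
}
```

In this sketch, a reply whose positions have moved since the previous one
never clears the lag, even when all three lag reads return -1; clearing
requires two consecutive replies with identical, fully applied positions
and no samples.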
--
Best regards,
Shinya Kato
NTT OSS Center


Attachments:

  [application/octet-stream] v2-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch (3.9K, 2-v2-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch)
From 8de9d904d70c362ca2af00bd4e73c2ad3bda9b6b Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Fri, 6 Mar 2026 16:10:59 +0900
Subject: [PATCH v2] Fix spurious NULL lag in pg_stat_replication

Previously, ProcessStandbyReplyMessage() cleared replication lag times
whenever the standby reported fully-applied WAL in two consecutive
reply messages.  This heuristic was too aggressive: in bursty reply
patterns one message could consume all lag tracker samples, and the
next message -- arriving before new samples accumulated -- would see
no samples and trigger clearing, even though the standby was still
actively replaying WAL.

Add two additional conditions before clearing lag times: (1) all three
LagTrackerRead() calls must return -1, indicating no new lag samples,
and (2) write/flush/apply positions must be unchanged from the
previous reply.  Together with the existing fully-applied check, this
ensures lag is only cleared when the standby is truly idle.

Author: Shinya Kato <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
---
 src/backend/replication/walsender.c | 34 ++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 2cde8ebc729..59dcfa340a5 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2456,11 +2456,16 @@ ProcessStandbyReplyMessage(void)
 	TimeOffset	writeLag,
 				flushLag,
 				applyLag;
-	bool		clearLagTimes;
+	bool		clearLagTimes,
+				noLagSamples,
+				positionsUnchanged;
 	TimestampTz now;
 	TimestampTz replyTime;
 
 	static bool fullyAppliedLastTime = false;
+	static XLogRecPtr prevWritePtr = InvalidXLogRecPtr;
+	static XLogRecPtr prevFlushPtr = InvalidXLogRecPtr;
+	static XLogRecPtr prevApplyPtr = InvalidXLogRecPtr;
 
 	/* the caller already consumed the msgtype byte */
 	writePtr = pq_getmsgint64(&reply_message);
@@ -2492,16 +2497,25 @@ ProcessStandbyReplyMessage(void)
 	flushLag = LagTrackerRead(SYNC_REP_WAIT_FLUSH, flushPtr, now);
 	applyLag = LagTrackerRead(SYNC_REP_WAIT_APPLY, applyPtr, now);
 
+	/* Precompute inputs for clearLagTimes decision below. */
+	noLagSamples = (writeLag == -1 && flushLag == -1 && applyLag == -1);
+	positionsUnchanged = (writePtr == prevWritePtr &&
+						  flushPtr == prevFlushPtr &&
+						  applyPtr == prevApplyPtr);
+
 	/*
-	 * If the standby reports that it has fully replayed the WAL in two
-	 * consecutive reply messages, then the second such message must result
-	 * from wal_receiver_status_interval expiring on the standby.  This is a
-	 * convenient time to forget the lag times measured when it last
-	 * wrote/flushed/applied a WAL record, to avoid displaying stale lag data
-	 * until more WAL traffic arrives.
+	 * If the standby reports that it has fully replayed the WAL, there are
+	 * no new lag samples, and positions remain unchanged across two
+	 * consecutive reply messages, forget the lag times measured when it last
+	 * wrote/flushed/applied a WAL record.  This avoids displaying stale lag
+	 * data until more WAL traffic arrives.
+	 *
+	 * The position-unchanged check prevents spuriously clearing lag in
+	 * bursty reply patterns, where one reply consumes all lag tracker
+	 * samples and the next arrives before new samples accumulate.
 	 */
 	clearLagTimes = false;
-	if (applyPtr == sentPtr)
+	if (applyPtr == sentPtr && noLagSamples && positionsUnchanged)
 	{
 		if (fullyAppliedLastTime)
 			clearLagTimes = true;
@@ -2510,6 +2524,10 @@ ProcessStandbyReplyMessage(void)
 	else
 		fullyAppliedLastTime = false;
 
+	prevWritePtr = writePtr;
+	prevFlushPtr = flushPtr;
+	prevApplyPtr = applyPtr;
+
 	/* Send a reply if the standby requested one. */
 	if (replyRequested)
 		WalSndKeepalive(false, InvalidXLogRecPtr);
-- 
2.47.3


