From: Shinya Kato <[email protected]>
To: Fujii Masao <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: Wed, 11 Mar 2026 11:38:23 +0900
Message-ID: <CAOzEurRF+OWcMZpfE=NV_Wcm6CFFGOnuxC6L9WWCjOMN0_eZMQ@mail.gmail.com>
In-Reply-To: <CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com>
References: <CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com>
	<CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com>
	<CAOzEurRGiGE2Dfe+ySpb=+93ku=7ZC6RgAbHtLC6Xsq3g2XexA@mail.gmail.com>
	<CAHGQGwH2h_R7FWPvEs3+NWLwHZoj9r96tUyRKi5haqxMc6FXiQ@mail.gmail.com>
	<CAOzEurQiP3uebd1GMiC1Dzf5VJwF4ZBEpJ6QYQFE6Y+rVjxqNA@mail.gmail.com>
	<CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com>

On Tue, Mar 10, 2026 at 10:54 AM Fujii Masao <[email protected]> wrote:
> Even with your latest patch, if we remove fullyAppliedLastTime, and set
> clearLagTimes to true when applyPtr == sentPtr && noLagSamples &&
> positionsUnchanged,
> wouldn't the time for the lag to become NULL be almost the same as
> wal_receiver_status_interval?
>
> The documentation doesn't clearly specify how long it should take for
> the lag to become NULL, so doubling that time might be acceptable.
> However, if we can keep it roughly the same without much complexity,
> I think that would be preferable.
>
> Thoughts?

Thank you for the suggestion. I tested this by removing
fullyAppliedLastTime, but a spurious NULL still appears, even with
synchronous replication. Here is why:

- Reply 1 (flush notification): positions = X. Lag samples are
consumed and yield real values, so noLagSamples = false. clearLagTimes
is not set, and prevWritePtr/prevFlushPtr/prevApplyPtr are saved as X.

- Reply 2 (force_reply): positions = X again. This time noLagSamples =
true and positionsUnchanged = true. Since applyPtr == sentPtr also
holds, clearLagTimes is set to true and the lag becomes NULL.

Therefore, I believe fullyAppliedLastTime is still necessary to ensure
that the previous reply also contained no lag samples.
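
The two-reply sequence above can be reproduced with a small standalone
sketch in C. This is not the walsender code: Reply, State,
process_reply() and first_clear() are hypothetical names made up for
the illustration, has_samples stands in for LagTrackerRead() returning
something other than -1, and the sketch assumes that the lines elided
from the hunk set fullyAppliedLastTime = true whenever the outer
condition holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one standby reply (not the wire format). */
typedef struct Reply
{
    uint64_t write_ptr;     /* reported write position */
    uint64_t flush_ptr;     /* reported flush position */
    uint64_t apply_ptr;     /* reported apply position */
    uint64_t sent_ptr;      /* primary's sent position at that moment */
    bool     has_samples;   /* did any lag tracker read return a real lag? */
} Reply;

/* State kept across replies (static variables in the patch). */
typedef struct State
{
    bool     fully_applied_last_time;
    uint64_t prev_write;
    uint64_t prev_flush;
    uint64_t prev_apply;
} State;

/*
 * Returns true if this reply would clear the lag times.
 * use_last_time = true models the v3 condition; false models the
 * variant with fullyAppliedLastTime removed.
 */
bool
process_reply(State *st, const Reply *r, bool use_last_time)
{
    bool no_lag_samples = !r->has_samples;
    bool positions_unchanged = (r->write_ptr == st->prev_write &&
                                r->flush_ptr == st->prev_flush &&
                                r->apply_ptr == st->prev_apply);
    bool clear = false;

    if (r->apply_ptr == r->sent_ptr && no_lag_samples && positions_unchanged)
    {
        if (!use_last_time || st->fully_applied_last_time)
            clear = true;
        st->fully_applied_last_time = true;  /* assumed elided line */
    }
    else
        st->fully_applied_last_time = false;

    /* Remember positions for the next reply. */
    st->prev_write = r->write_ptr;
    st->prev_flush = r->flush_ptr;
    st->prev_apply = r->apply_ptr;

    return clear;
}

/* Index (0-based) of the first reply that clears lag, or -1 if none. */
int
first_clear(const Reply *replies, size_t n, bool use_last_time)
{
    State st = {false, 0, 0, 0};

    assert(replies != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (process_reply(&st, &replies[i], use_last_time))
            return (int) i;
    }
    return -1;
}
```

Feeding it three replies at the same position, where only the first
one has lag samples, the variant without the flag clears on the second
reply (the force_reply case described above), while the variant with
the flag waits until the third.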

BTW, I noticed an incorrect comment in walreceiver.c and have included
a fix for it as patch 0002. Patch 0001 remains unchanged.


-- 
Best regards,
Shinya Kato
NTT OSS Center


Attachments:

  [application/octet-stream] v3-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch (3.9K, 2-v3-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch)
From a06abff86337483ddcd4cd2a49ffbc03c30df966 Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Fri, 6 Mar 2026 16:10:59 +0900
Subject: [PATCH v3 1/2] Fix spurious NULL lag in pg_stat_replication

Previously, ProcessStandbyReplyMessage() cleared replication lag times
whenever the standby reported fully-applied WAL in two consecutive
reply messages.  This heuristic was too aggressive: in bursty reply
patterns one message could consume all lag tracker samples, and the
next message -- arriving before new samples accumulated -- would see
no samples and trigger clearing, even though the standby was still
actively replaying WAL.

Add two additional conditions before clearing lag times: (1) all three
LagTrackerRead() calls must return -1, indicating no new lag samples,
and (2) write/flush/apply positions must be unchanged from the
previous reply.  Together with the existing fully-applied check, this
ensures lag is only cleared when the standby is truly idle.

Author: Shinya Kato <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
---
 src/backend/replication/walsender.c | 34 ++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 79fc192b171..e0b2ac29d74 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2457,11 +2457,16 @@ ProcessStandbyReplyMessage(void)
 	TimeOffset	writeLag,
 				flushLag,
 				applyLag;
-	bool		clearLagTimes;
+	bool		clearLagTimes,
+				noLagSamples,
+				positionsUnchanged;
 	TimestampTz now;
 	TimestampTz replyTime;
 
 	static bool fullyAppliedLastTime = false;
+	static XLogRecPtr prevWritePtr = InvalidXLogRecPtr;
+	static XLogRecPtr prevFlushPtr = InvalidXLogRecPtr;
+	static XLogRecPtr prevApplyPtr = InvalidXLogRecPtr;
 
 	/* the caller already consumed the msgtype byte */
 	writePtr = pq_getmsgint64(&reply_message);
@@ -2493,16 +2498,25 @@ ProcessStandbyReplyMessage(void)
 	flushLag = LagTrackerRead(SYNC_REP_WAIT_FLUSH, flushPtr, now);
 	applyLag = LagTrackerRead(SYNC_REP_WAIT_APPLY, applyPtr, now);
 
+	/* Precompute inputs for clearLagTimes decision below. */
+	noLagSamples = (writeLag == -1 && flushLag == -1 && applyLag == -1);
+	positionsUnchanged = (writePtr == prevWritePtr &&
+						  flushPtr == prevFlushPtr &&
+						  applyPtr == prevApplyPtr);
+
 	/*
-	 * If the standby reports that it has fully replayed the WAL in two
-	 * consecutive reply messages, then the second such message must result
-	 * from wal_receiver_status_interval expiring on the standby.  This is a
-	 * convenient time to forget the lag times measured when it last
-	 * wrote/flushed/applied a WAL record, to avoid displaying stale lag data
-	 * until more WAL traffic arrives.
+	 * If the standby reports that it has fully replayed the WAL, there are
+	 * no new lag samples, and positions remain unchanged across two
+	 * consecutive reply messages, forget the lag times measured when it last
+	 * wrote/flushed/applied a WAL record.  This avoids displaying stale lag
+	 * data until more WAL traffic arrives.
+	 *
+	 * The position-unchanged check prevents spuriously clearing lag in
+	 * bursty reply patterns, where one reply consumes all lag tracker
+	 * samples and the next arrives before new samples accumulate.
 	 */
 	clearLagTimes = false;
-	if (applyPtr == sentPtr)
+	if (applyPtr == sentPtr && noLagSamples && positionsUnchanged)
 	{
 		if (fullyAppliedLastTime)
 			clearLagTimes = true;
@@ -2511,6 +2525,10 @@ ProcessStandbyReplyMessage(void)
 	else
 		fullyAppliedLastTime = false;
 
+	prevWritePtr = writePtr;
+	prevFlushPtr = flushPtr;
+	prevApplyPtr = applyPtr;
+
 	/* Send a reply if the standby requested one. */
 	if (replyRequested)
 		WalSndKeepalive(false, InvalidXLogRecPtr);
-- 
2.47.3



  [application/octet-stream] v3-0002-Fix-a-comment-in-walreceiver.c.patch (1.2K, 3-v3-0002-Fix-a-comment-in-walreceiver.c.patch)
From 50fddedb1c94e720a5858dc61cf3af42c1580fd5 Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Wed, 11 Mar 2026 11:28:00 +0900
Subject: [PATCH v3 2/2] Fix a comment in walreceiver.c

Remove outdated reference to "oldest xmin" in XLogWalRcvSendReply()
comment, since the function no longer reports xmin.

Author: Shinya Kato <[email protected]>
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/replication/walreceiver.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index fabe3c73034..bd9a1377e1c 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1107,8 +1107,8 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli)
 }
 
 /*
- * Send reply message to primary, indicating our current WAL locations, oldest
- * xmin and the current time.
+ * Send reply message to primary, indicating our current WAL locations and the
+ * current time.
  *
  * If 'force' is not set, the message is only sent if enough time has
  * passed since last status update to reach wal_receiver_status_interval.
-- 
2.47.3


