From: Shinya Kato <[email protected]>
To: Fujii Masao <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: Wed, 11 Mar 2026 11:38:23 +0900
Message-ID: <CAOzEurRF+OWcMZpfE=NV_Wcm6CFFGOnuxC6L9WWCjOMN0_eZMQ@mail.gmail.com>
In-Reply-To: <CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com>
References: <CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com>
<CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com>
<CAOzEurRGiGE2Dfe+ySpb=+93ku=7ZC6RgAbHtLC6Xsq3g2XexA@mail.gmail.com>
<CAHGQGwH2h_R7FWPvEs3+NWLwHZoj9r96tUyRKi5haqxMc6FXiQ@mail.gmail.com>
<CAOzEurQiP3uebd1GMiC1Dzf5VJwF4ZBEpJ6QYQFE6Y+rVjxqNA@mail.gmail.com>
<CAHGQGwEmMBBAE0RG-R3_LacfT4fbB55qGE6n9O5mNwrqvbNBtw@mail.gmail.com>
On Tue, Mar 10, 2026 at 10:54 AM Fujii Masao <[email protected]> wrote:
> Even with your latest patch, if we remove fullyAppliedLastTime, and set
> clearLagTimes to true when applyPtr == sentPtr && noLagSamples &&
> positionsUnchanged,
> wouldn't the time for the lag to become NULL be almost the same as
> wal_receiver_status_interval?
>
> The documentation doesn't clearly specify how long it should take for
> the lag to become NULL, so doubling that time might be acceptable.
> However, if we can keep it roughly the same without much complexity,
> I think that would be preferable.
>
> Thought?
Thank you for the suggestion. I tested this by removing
fullyAppliedLastTime, but even with synchronous replication, NULL
still appears. Here is why:
- Reply 1 (flush notification): positions = X. Lag samples are
consumed with real values, so noLagSamples = false. clearLagTimes is
not set, and prevPtrs = X is saved.
- Reply 2 (force_reply): positions = X again. Here, noLagSamples =
true and positionsUnchanged = true. Since applyPtr == sentPtr,
clearLagTimes is set to true, resulting in a NULL value.
Therefore, I believe fullyAppliedLastTime is still necessary to ensure
that the previous reply also contained no lag samples.
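The two-reply sequence above can be reproduced with a small standalone model. This is only a sketch of the decision logic, not the walsender.c code: the Reply and State structs, decide_clear(), first_clear_index(), and the useFullyAppliedLastTime switch are hypothetical names introduced for illustration, and noLagSamples is supplied as an input instead of being derived from LagTrackerRead().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one standby reply as seen by the walsender. */
typedef struct
{
	uint64_t	sentPtr;
	uint64_t	writePtr;
	uint64_t	flushPtr;
	uint64_t	applyPtr;
	bool		noLagSamples;	/* assumed: all three lag reads returned -1 */
} Reply;

/* Hypothetical stand-in for the static variables kept across replies. */
typedef struct
{
	bool		fullyAppliedLastTime;
	uint64_t	prevWritePtr;
	uint64_t	prevFlushPtr;
	uint64_t	prevApplyPtr;
} State;

/*
 * Return true if this reply would clear the lag times.  With
 * useFullyAppliedLastTime = false, the idle condition alone is enough,
 * which models the variant without fullyAppliedLastTime.
 */
static bool
decide_clear(State *st, const Reply *r, bool useFullyAppliedLastTime)
{
	bool		positionsUnchanged = (r->writePtr == st->prevWritePtr &&
									  r->flushPtr == st->prevFlushPtr &&
									  r->applyPtr == st->prevApplyPtr);
	bool		idle = (r->applyPtr == r->sentPtr &&
						r->noLagSamples && positionsUnchanged);
	bool		clear = false;

	if (idle)
	{
		if (!useFullyAppliedLastTime || st->fullyAppliedLastTime)
			clear = true;
		st->fullyAppliedLastTime = true;
	}
	else
		st->fullyAppliedLastTime = false;

	/* Remember the positions for the next reply. */
	st->prevWritePtr = r->writePtr;
	st->prevFlushPtr = r->flushPtr;
	st->prevApplyPtr = r->applyPtr;

	return clear;
}

/* Index of the first reply that clears the lag times, or -1 if none does. */
static int
first_clear_index(const Reply *replies, int n, bool useFullyAppliedLastTime)
{
	State		st = {false, 0, 0, 0};

	assert(n >= 0);
	for (int i = 0; i < n; i++)
	{
		if (decide_clear(&st, &replies[i], useFullyAppliedLastTime))
			return i;
	}
	return -1;
}
```

Feeding it Reply 1 (positions = X, lag samples present), Reply 2 (positions = X, no samples), and a third identical idle reply, the variant without fullyAppliedLastTime clears at Reply 2, while the variant with it waits until the third reply.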
BTW, I noticed an incorrect comment in walreceiver.c and have included
a fix for it as patch 0002. Patch 0001 remains unchanged.
--
Best regards,
Shinya Kato
NTT OSS Center
Attachments:
[application/octet-stream] v3-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch (3.9K, 2-v3-0001-Fix-spurious-NULL-lag-in-pg_stat_replication.patch)
From a06abff86337483ddcd4cd2a49ffbc03c30df966 Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Fri, 6 Mar 2026 16:10:59 +0900
Subject: [PATCH v3 1/2] Fix spurious NULL lag in pg_stat_replication
Previously, ProcessStandbyReplyMessage() cleared replication lag times
whenever the standby reported fully-applied WAL in two consecutive
reply messages. This heuristic was too aggressive: in bursty reply
patterns one message could consume all lag tracker samples, and the
next message -- arriving before new samples accumulated -- would see
no samples and trigger clearing, even though the standby was still
actively replaying WAL.
Add two additional conditions before clearing lag times: (1) all three
LagTrackerRead() calls must return -1, indicating no new lag samples,
and (2) write/flush/apply positions must be unchanged from the
previous reply. Together with the existing fully-applied check, this
ensures lag is only cleared when the standby is truly idle.
Author: Shinya Kato <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
---
src/backend/replication/walsender.c | 34 ++++++++++++++++++++++-------
1 file changed, 26 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 79fc192b171..e0b2ac29d74 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2457,11 +2457,16 @@ ProcessStandbyReplyMessage(void)
TimeOffset writeLag,
flushLag,
applyLag;
- bool clearLagTimes;
+ bool clearLagTimes,
+ noLagSamples,
+ positionsUnchanged;
TimestampTz now;
TimestampTz replyTime;
static bool fullyAppliedLastTime = false;
+ static XLogRecPtr prevWritePtr = InvalidXLogRecPtr;
+ static XLogRecPtr prevFlushPtr = InvalidXLogRecPtr;
+ static XLogRecPtr prevApplyPtr = InvalidXLogRecPtr;
/* the caller already consumed the msgtype byte */
writePtr = pq_getmsgint64(&reply_message);
@@ -2493,16 +2498,25 @@ ProcessStandbyReplyMessage(void)
flushLag = LagTrackerRead(SYNC_REP_WAIT_FLUSH, flushPtr, now);
applyLag = LagTrackerRead(SYNC_REP_WAIT_APPLY, applyPtr, now);
+ /* Precompute inputs for clearLagTimes decision below. */
+ noLagSamples = (writeLag == -1 && flushLag == -1 && applyLag == -1);
+ positionsUnchanged = (writePtr == prevWritePtr &&
+ flushPtr == prevFlushPtr &&
+ applyPtr == prevApplyPtr);
+
/*
- * If the standby reports that it has fully replayed the WAL in two
- * consecutive reply messages, then the second such message must result
- * from wal_receiver_status_interval expiring on the standby. This is a
- * convenient time to forget the lag times measured when it last
- * wrote/flushed/applied a WAL record, to avoid displaying stale lag data
- * until more WAL traffic arrives.
+ * If the standby reports that it has fully replayed the WAL, there are
+ * no new lag samples, and positions remain unchanged across two
+ * consecutive reply messages, forget the lag times measured when it last
+ * wrote/flushed/applied a WAL record. This avoids displaying stale lag
+ * data until more WAL traffic arrives.
+ *
+ * The position-unchanged check prevents spuriously clearing lag in
+ * bursty reply patterns, where one reply consumes all lag tracker
+ * samples and the next arrives before new samples accumulate.
*/
clearLagTimes = false;
- if (applyPtr == sentPtr)
+ if (applyPtr == sentPtr && noLagSamples && positionsUnchanged)
{
if (fullyAppliedLastTime)
clearLagTimes = true;
@@ -2511,6 +2525,10 @@ ProcessStandbyReplyMessage(void)
else
fullyAppliedLastTime = false;
+ prevWritePtr = writePtr;
+ prevFlushPtr = flushPtr;
+ prevApplyPtr = applyPtr;
+
/* Send a reply if the standby requested one. */
if (replyRequested)
WalSndKeepalive(false, InvalidXLogRecPtr);
--
2.47.3
[application/octet-stream] v3-0002-Fix-a-comment-in-walreceiver.c.patch (1.2K, 3-v3-0002-Fix-a-comment-in-walreceiver.c.patch)
From 50fddedb1c94e720a5858dc61cf3af42c1580fd5 Mon Sep 17 00:00:00 2001
From: Shinya Kato <[email protected]>
Date: Wed, 11 Mar 2026 11:28:00 +0900
Subject: [PATCH v3 2/2] Fix a comment in walreceiver.c
Remove outdated reference to "oldest xmin" in XLogWalRcvSendReply()
comment, since the function no longer reports xmin.
Author: Shinya Kato <[email protected]>
Reviewed-by:
Discussion: https://postgr.es/m/
---
src/backend/replication/walreceiver.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index fabe3c73034..bd9a1377e1c 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1107,8 +1107,8 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli)
}
/*
- * Send reply message to primary, indicating our current WAL locations, oldest
- * xmin and the current time.
+ * Send reply message to primary, indicating our current WAL locations and the
+ * current time.
*
* If 'force' is not set, the message is only sent if enough time has
* passed since last status update to reach wal_receiver_status_interval.
--
2.47.3