From: Fujii Masao <[email protected]>
To: Shinya Kato <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Date: Mon, 9 Mar 2026 20:21:03 +0900
Message-ID: <CAHGQGwH2h_R7FWPvEs3+NWLwHZoj9r96tUyRKi5haqxMc6FXiQ@mail.gmail.com>
In-Reply-To: <CAOzEurRGiGE2Dfe+ySpb=+93ku=7ZC6RgAbHtLC6Xsq3g2XexA@mail.gmail.com>
References: <CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com>
	<CAHGQGwE=kyQ+YnGPn8zpZ959+3ywg8OR_Nu__uXxxuE0E+Y_Zg@mail.gmail.com>
	<CAOzEurRGiGE2Dfe+ySpb=+93ku=7ZC6RgAbHtLC6Xsq3g2XexA@mail.gmail.com>

On Fri, Mar 6, 2026 at 4:13 PM Shinya Kato <[email protected]> wrote:
>
> On Mon, Mar 2, 2026 at 11:44 PM Fujii Masao <[email protected]> wrote:
> > With the patch applied, I set up a logical replication and inserted a row every
> > second. Even with continuous inserts, NULL was shown in the lag columns of
> > pg_stat_replication. That makes me wonder whether the patch's approach is
> > sufficient to address the issue.
>
> Thank you for the review and testing! I had only considered the issue
> in the context of physical replication, but as you pointed out, my
> approach is insufficient for logical replication.
>
> > Relying solely on replies from the standby or subscriber seems a bit fragile to
> > me. If the goal is to keep showing the last measured lag for some time,
> > perhaps we should introduce a rate limit on when NULL is displayed in the lag
> > columns?
>
> My primary goal was to ensure that the source code comments match the
> actual behavior, as the comment stating "the second such message must
> result from wal_receiver_status_interval expiring on the standby" is
> inaccurate. However, as you noted, the patch alone is not sufficient
> to fully address the issue.
>
> > For example, if there has been no activity (i.e., sentPtr == applyPtr and
> > applyPtr has not changed since the previous cycle) for, say, 10 seconds,
> > then we could allow NULL to be shown. Thoughts?
>
> I considered a time-based rate limit, but it is difficult to choose an
> appropriate threshold. Furthermore, the walsender has no way of
> knowing the standby's or subscriber's wal_receiver_status_interval
> setting.
>
> The attached v2 patch takes a different approach: it additionally
> requires that all reported positions (write/flush/apply) remain
> unchanged from the previous reply. This directly detects a truly idle
> system without relying on timeouts—if any position has advanced, new
> WAL activity must have occurred, so we should not clear the lag values
> even if the lag tracker is empty.

This approach looks good to me.

One comment: currently, the lag becomes NULL after roughly one
wal_receiver_status_interval of inactivity. With this approach, it seems
it would instead take about twice wal_receiver_status_interval, because the
first periodic reply after the last WAL can only establish that the positions
are unchanged. Is my understanding correct?

Regards,

-- 
Fujii Masao
