public inbox for [email protected]  
help / color / mirror / Atom feed
From: Lakshmi N <[email protected]>
To: Jakub Wartak <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Bharath Rupireddy <[email protected]>
Cc: SATYANARAYANA NARLAPURAM <[email protected]>
Cc: [email protected]
Subject: Re: log XLogPrefetch stats at end of recovery
Date: Tue, 24 Mar 2026 05:07:40 -0700
Message-ID: <CA+3i_M8C+rK9vhwBm8U+ys2hbDifoBb4Xnws5Wmn2f4u7iqOpA@mail.gmail.com> (raw)
In-Reply-To: <CAKZiRmz3n24HZVzQX4=25y4tMZ9PjYQjmbGhJkrziAQjdHru5g@mail.gmail.com>
References: <CA+3i_M-+WVyrRcfnxa4gt+KQUzH-LTQ8D2JCGE_O0WzJJtZfUg@mail.gmail.com>
	<CAHg+QDfLh-yb+OacN6qPYF6H3SPDmVJUiors7vkSMq4yxfhNrA@mail.gmail.com>
	<CALj2ACWovfe_A43LP5fi0mJenZ-i99_ux68YCzX6D2CO5mB-gg@mail.gmail.com>
	<CAKZiRmwuKVHaQa_3PirbBqzoCKciuqcn2Tn4Df4d8icq4e3DCA@mail.gmail.com>
	<CA+3i_M9s6T-jOsvSZk5y1HVvyz_RVbiEQZGdwPRQrw+zZcZ5Eg@mail.gmail.com>
	<CAKZiRmz3n24HZVzQX4=25y4tMZ9PjYQjmbGhJkrziAQjdHru5g@mail.gmail.com>

Hi,

On Tue, Mar 24, 2026 at 1:53 AM Jakub Wartak <[email protected]>
wrote:

> On Tue, Mar 24, 2026 at 9:28 AM Lakshmi N <[email protected]> wrote:
> >
> > Hi all,
> >
> > Thank you for your feedback. Please see the attached patch.
> >
> > On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak <
> [email protected]> wrote:
> >>
> >> On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
> >> <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> > On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
> >> > <[email protected]> wrote:
> >> > >
> >> > > > While investigating a long recovery, I noticed that XLogPrefetch
> stats were not logged at the end of recovery. This log message will be
> useful to understand how effective XLogPrefetch was during recovery. Adding
> a patch to address this.
> >> > >
> >> > > Applied this patch and validated the log message. This log message
> appears to be useful to me, particularly while doing fleet wide analysis.
> >> > >
> >> > > 2026-03-20 23:33:13.756 PDT [2265441] LOG:  XLogPrefetcher stats:
> prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996
> >> >
> >> > This looks useful to understand how the prefetch helped during long
> recoveries.
> >> >
> >> > > I am wondering if we can periodically log this in standby mode as
> well, not just before promoting?
> >> >
> >> > Timer-based startup progress messaging allows logging such things
> >> > (ereport_startup_progress API). There was an attempt to enable "redo
> >> > in progress" for standbys, but that seemed to flood the standby logs
> >> > even at the default progress interval of 10 sec.
> >> >
> >> > Having said that, the prefetcher stats could be added to the existing
> >> > ereport_startup_progress("redo in progress xxx") message that works
> >> > for crash recoveries—however, I don't prefer doing a bunch of atomic
> >> > reads every progress interval of 10 sec.
> >>
> >> > Therefore, logging at the end of recovery looks good to me.
> >>
> >> +1 from me too to only of logging at the end of recovery (so -1 to
> logging
> >> every now and then). If someone is interested in current state (or
> progress
> >> over time) I think he can query pg_stat_recovery_prefetch view already,
> even
> >> today, right?
> >>
> >> > I reviewed the patch. I have the following comment:
> >> >
> >> > + elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
> >> > skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",
> >> >
> >> > XLogPrefetcher is an internal data structure name, how about "redo
> >> > prefetch stats: xxxx" to be consistent with other redo log messages?
> >>
> >> +1
> >
> >
> > please find the attached patch addressing this.
>
> Hi Lakshmi,
>
> The pg_stat_get_recovery_prefetch view has 3 additional columns, but they
> are
> showing current situation, maybe this $patch could be enhanced so that it
> also
> calculates averages for some of them? I mean perhaps not just hit ratio
> would be
> nice, but also the e.g. __average__ wal distance or average look ahead
> (block distance).
> I'm not sure maybe please also wait for input from others if they find
> it a good idea
> or not.
>
> Also if we are adding such pretty cryptic fields as "skip_fpw" to log,
> we would need some
> place in documentaiton to explain their meaining probably.
> monitoring.sgml already
> explains them, so maybe just link to that. I mean if you look what we
> have today (below),
> they are little less crypting and have different format, not
> "skip_fpw=123" but more
> like "17 removed" so maybe "17 skipped due to FPW". Maybe
>
> redo done at 0/75C7B290 system usage: CPU: user: 1.02 s, system: 0.51
> s, elapsed: 1.53 s
> checkpoint complete: end-of-recovery fast wait: wrote 16289 buffers
> (99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 17 removed, 0
> recycled; write=0.049 s, sync=0.191 s, total=0.264 s; sync files=10,
> longest=0.187 s, average=0.020 s; distance=287186 kB, estimate=287186
> kB; lsn=0/75C7B2C8, redo lsn=0/75C7B2C8
>
> so instead of like:
> redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
> skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"
>
> something like below ones:
> redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
> redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
> zero-initated, .. or something like that
>

Please find the attached patch with the suggested changes. I referenced [1]
to log the message as suggested.

2026-03-24 04:53:15.251 PDT [18898] LOG:  redo prefetch stats: prefetched
27 blocks, skipped 22 blocks because they were already in the buffer pool,
skipped 17 blocks because they would be zero-initialized, skipped 0 blocks
because they didn't exist yet, skipped 28 blocks because a full page image
was included in the WAL, skipped 155 blocks because they were already
recently prefetched.

[1]
https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH

Regards,
Lakshmi


Attachments:

  [application/octet-stream] 0001-log-prefetch-stats-at-end-of-recovery.patch (3.3K, 3-0001-log-prefetch-stats-at-end-of-recovery.patch)
  download | inline diff:
From 6c29ac6ac297014fceb2e02bd15c53ee463ac281 Mon Sep 17 00:00:00 2001
From: Lakshmi N <[email protected]>
Date: Tue, 24 Mar 2026 01:23:24 -0700
Subject: [PATCH] xlogprefetcher: log prefetch statistics at end of recovery

Add XLogPrefetchLogStats(), which emits a LOG message summarising the
prefetch counters (prefetch, hit, skip_init, skip_new, skip_fpw,
skip_rep) accumulated during recovery. The function is called from
PerformWalRecovery() immediately after the "redo done" message, giving
operators visibility into how effective WAL prefetching was over the
course of the recovery session.

No-op when recovery_prefetch = off.
---
 src/backend/access/transam/xlogprefetcher.c | 25 +++++++++++++++++++++
 src/backend/access/transam/xlogrecovery.c   |  2 ++
 src/include/access/xlogprefetcher.h         |  2 ++
 3 files changed, 29 insertions(+)

diff --git a/src/backend/access/transam/xlogprefetcher.c b/src/backend/access/transam/xlogprefetcher.c
index c235eca7c51..76be0cc1bcf 100644
--- a/src/backend/access/transam/xlogprefetcher.c
+++ b/src/backend/access/transam/xlogprefetcher.c
@@ -335,6 +335,31 @@ XLogPrefetchShmemInit(void)
 	}
 }
 
+/*
+ * Log a summary of the XLogPrefetcher stats. Intended to be called
+ * at the end of recovery or when a standby is promoted.
+ */
+void
+XLogPrefetchLogStats(void)
+{
+	if (recovery_prefetch == RECOVERY_PREFETCH_OFF)
+		return;
+
+	elog(LOG,
+		"redo prefetch stats: prefetched %lu blocks, "
+		"skipped %lu blocks because they were already in the buffer pool, "
+		"skipped %lu blocks because they would be zero-initialized, "
+		"skipped %lu blocks because they didn't exist yet, "
+		"skipped %lu blocks because a full page image was included in the WAL, "
+		"skipped %lu blocks because they were already recently prefetched.",
+		pg_atomic_read_u64(&SharedStats->prefetch),
+		pg_atomic_read_u64(&SharedStats->hit),
+		pg_atomic_read_u64(&SharedStats->skip_init),
+		pg_atomic_read_u64(&SharedStats->skip_new),
+		pg_atomic_read_u64(&SharedStats->skip_fpw),
+		pg_atomic_read_u64(&SharedStats->skip_rep));
+}
+
 /*
  * Called when any GUC is changed that affects prefetching.
  */
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6d2c4a86b96..742cb90da9b 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1845,6 +1845,8 @@ PerformWalRecovery(void)
 				errmsg("redo done at %X/%08X system usage: %s",
 					   LSN_FORMAT_ARGS(xlogreader->ReadRecPtr),
 					   pg_rusage_show(&ru0)));
+		
+		XLogPrefetchLogStats();
 		xtime = GetLatestXTime();
 		if (xtime)
 			ereport(LOG,
diff --git a/src/include/access/xlogprefetcher.h b/src/include/access/xlogprefetcher.h
index 7ec40c4b78b..a862924c895 100644
--- a/src/include/access/xlogprefetcher.h
+++ b/src/include/access/xlogprefetcher.h
@@ -37,6 +37,7 @@ extern void XLogPrefetchReconfigure(void);
 extern size_t XLogPrefetchShmemSize(void);
 extern void XLogPrefetchShmemInit(void);

+extern void XLogPrefetchLogStats(void);
 extern void XLogPrefetchResetStats(void);
 
 extern XLogPrefetcher *XLogPrefetcherAllocate(XLogReaderState *reader);
@@ -52,4 +53,5 @@ extern XLogRecord *XLogPrefetcherReadRecord(XLogPrefetcher *prefetcher,
 
 extern void XLogPrefetcherComputeStats(XLogPrefetcher *prefetcher);
 
+
 #endif
-- 
2.43.0



view thread (11+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: log XLogPrefetch stats at end of recovery
  In-Reply-To: <CA+3i_M8C+rK9vhwBm8U+ys2hbDifoBb4Xnws5Wmn2f4u7iqOpA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox