public inbox for [email protected]  
Use streaming read I/O when enabling data checksums online

* Use streaming read I/O when enabling data checksums online
@ 2026-06-03 11:10  cca5507 <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: cca5507 @ 2026-06-03 11:10 UTC (permalink / raw)
  To: pgsql-hackers <[email protected]>

Hi hackers,

Attached is a simple patch for $subject. I think we prefer streaming I/O over
ReadBufferExtended(). Am I missing something?

--
Regards,
ChangAo Chen


Attachments:

  [application/octet-stream] v1-0001-Use-streaming-read-I-O-when-enabling-data-checksu.patch (2.7K, 2-v1-0001-Use-streaming-read-I-O-when-enabling-data-checksu.patch)
  download | inline diff:
From ee48966af6dcf6e2d18279faab17bbf8944f903e Mon Sep 17 00:00:00 2001
From: ChangAo Chen <[email protected]>
Date: Wed, 3 Jun 2026 19:01:17 +0800
Subject: [PATCH v1] Use streaming read I/O when enabling data checksums online

---
 src/backend/postmaster/datachecksum_state.c | 35 +++++++++++++++++++--
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/src/backend/postmaster/datachecksum_state.c b/src/backend/postmaster/datachecksum_state.c
index a49a31d1281..2b793b082d8 100644
--- a/src/backend/postmaster/datachecksum_state.c
+++ b/src/backend/postmaster/datachecksum_state.c
@@ -210,6 +210,7 @@
 #include "storage/lmgr.h"
 #include "storage/lwlock.h"
 #include "storage/procarray.h"
+#include "storage/read_stream.h"
 #include "storage/smgr.h"
 #include "storage/subsystems.h"
 #include "tcop/tcopprot.h"
@@ -654,6 +655,9 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 	BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
 	char		activity[NAMEDATALEN * 2 + 128];
 	char	   *relns;
+	bool		success = true;
+	BlockRangeReadStreamPrivate	p;
+	ReadStream *stream;
 
 	relns = get_namespace_name(RelationGetNamespace(reln));
 
@@ -670,9 +674,26 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 	 * start, which is safe since new blocks are created with checksums set
 	 * already due to the state being "inprogress-on".
 	 */
+	p.current_blocknum = 0;
+	p.last_exclusive = numblocks;
+
+	/*
+	 * It is safe to use batchmode as block_range_read_stream_cb takes no
+	 * locks.
+	 */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+										READ_STREAM_FULL |
+										READ_STREAM_USE_BATCHING,
+										strategy,
+										reln,
+										forkNum,
+										block_range_read_stream_cb,
+										&p,
+										0);
+
 	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
 	{
-		Buffer		buf = ReadBufferExtended(reln, forkNum, blknum, RBM_NORMAL, strategy);
+		Buffer		buf = read_stream_next_buffer(stream, NULL);
 
 		/* Need to get an exclusive lock to mark the buffer as dirty */
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
@@ -715,7 +736,10 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		LWLockRelease(DataChecksumsWorkerLock);
 
 		if (abort_requested)
-			return false;
+		{
+			success = false;
+			break;
+		}
 
 		/* update the block counter */
 		pgstat_progress_update_param(PROGRESS_DATACHECKSUMS_BLOCKS_DONE,
@@ -728,7 +752,12 @@ ProcessSingleRelationFork(Relation reln, ForkNumber forkNum, BufferAccessStrateg
 		vacuum_delay_point(false);
 	}
 
-	return true;
+	if (success)
+		Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
+
+	read_stream_end(stream);
+
+	return success;
 }
 
 /*
-- 
2.34.1
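
The control flow the patch introduces (a block-range callback feeding a
stream, a loop that consumes one buffer per block, and a single cleanup point
reached on both the normal and the abort path) can be sketched outside the
PostgreSQL tree. This is only an illustrative mock, not the actual read stream
implementation: BlockRange, MockStream, mock_stream_begin/next/end and
process_blocks are hypothetical names standing in for
BlockRangeReadStreamPrivate, ReadStream, the read_stream_* calls and
ProcessSingleRelationFork, and no real I/O or buffer pinning happens here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL types; hypothetical, for illustration only. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Mirrors the two fields the patch initializes before starting the stream. */
typedef struct BlockRange
{
	BlockNumber current_blocknum;	/* next block to hand out */
	BlockNumber last_exclusive;		/* one past the final block */
} BlockRange;

typedef BlockNumber (*next_block_cb) (void *private_data);

/* Callback: return the next block in the range, or InvalidBlockNumber at end. */
static BlockNumber
block_range_next(void *private_data)
{
	BlockRange *p = private_data;

	if (p->current_blocknum < p->last_exclusive)
		return p->current_blocknum++;
	return InvalidBlockNumber;
}

/* Mock stream: just forwards to the callback and records that it was ended. */
typedef struct MockStream
{
	next_block_cb cb;
	void	   *private_data;
	bool		ended;
} MockStream;

static MockStream
mock_stream_begin(next_block_cb cb, void *private_data)
{
	MockStream	s = {cb, private_data, false};

	return s;
}

static BlockNumber
mock_stream_next(MockStream *s)
{
	return s->cb(s->private_data);
}

static void
mock_stream_end(MockStream *s)
{
	s->ended = true;
}

/*
 * Consume numblocks blocks from the stream. If abort_at names a block in
 * range, stop after handling it (pass InvalidBlockNumber to never abort).
 * Cleanup runs exactly once on either path, as in the patch.
 */
static bool
process_blocks(BlockNumber numblocks, BlockNumber abort_at,
			   BlockNumber *processed, bool *ended)
{
	BlockRange	p = {0, numblocks};
	MockStream	stream = mock_stream_begin(block_range_next, &p);
	bool		success = true;

	*processed = 0;
	for (BlockNumber blknum = 0; blknum < numblocks; blknum++)
	{
		BlockNumber got = mock_stream_next(&stream);

		assert(got == blknum);	/* blocks arrive in range order */
		(void) got;
		(*processed)++;

		if (blknum == abort_at)
		{
			success = false;	/* break instead of return, so cleanup runs */
			break;
		}
	}

	/* On a complete pass the stream must be exhausted. */
	if (success)
		assert(mock_stream_next(&stream) == InvalidBlockNumber);

	mock_stream_end(&stream);
	*ended = stream.ended;
	return success;
}
```

Calling process_blocks(5, InvalidBlockNumber, &n, &ended) walks all five
blocks and returns true, while process_blocks(5, 2, &n, &ended) stops after
the third block and returns false; in both cases the stream is ended before
returning, which is the reason the patch replaces the early "return false"
with "success = false; break;".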

* Re: Use streaming read I/O when enabling data checksums online
@ 2026-06-03 13:25  Daniel Gustafsson <[email protected]>
  parent: cca5507 <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Daniel Gustafsson @ 2026-06-03 13:25 UTC (permalink / raw)
  To: cca5507 <[email protected]>; +Cc: pgsql-hackers <[email protected]>

> On 3 Jun 2026, at 13:10, cca5507 <[email protected]> wrote:

> Attached is a simple patch for $subject. I think we prefer streaming I/O over
> ReadBufferExtended(). Am I missing something?

Thanks!  This code was written well before read streams existed, and recent
work focused on correctness of operation; that's the main reason it still uses
ReadBufferExtended.  We probably want to do something like this once the
tree opens up after the v19 feature freeze.

--
Daniel Gustafsson

* Re: Use streaming read I/O when enabling data checksums online
@ 2026-06-04 02:43  cca5507 <[email protected]>
  parent: Daniel Gustafsson <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: cca5507 @ 2026-06-04 02:43 UTC (permalink / raw)
  To: Daniel Gustafsson <[email protected]>; +Cc: pgsql-hackers <[email protected]>

> > Attached is a simple patch for $subject. I think we prefer streaming I/O over
> > ReadBufferExtended(). Am I missing something?
> 
> Thanks!  This code was written well before read streams existed, and recent
> work focused on correctness of operation; that's the main reason it still uses
> ReadBufferExtended.  We probably want to do something like this once the
> tree opens up after the v19 feature freeze.

Got it! I created a CF entry for this:

https://commitfest.postgresql.org/patch/6841/

--
Regards,
ChangAo Chen