public inbox for [email protected]
help / color / mirror / Atom feedRe: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
6+ messages / 4 participants
[nested] [flat]
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2025-10-11 12:32 Xuneng Zhou <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Xuneng Zhou @ 2025-10-11 12:32 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: [email protected]; [email protected]; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; [email protected]
Hi,
shared_buffers = '128MB'
track_io_timing = on
jit = off
effective_io_concurrency = 500
log_min_messages = warning
max_parallel_workers_per_gather = 0
io_combine_limit = ${IO_COMBINE_LIMIT:-36}
NROWS="${NROWS:-20000000}"
shared_buffers = '64GB'
track_io_timing = on
jit = off
effective_io_concurrency = 500
log_min_messages = warning
max_parallel_workers_per_gather = 0
io_combine_limit = ${IO_COMBINE_LIMIT:-36}
NROWS="${NROWS:-200000000}"
With these settings, running the benchmark script on a larger machine
(AX162-R from hetzner) installed with NVMe SSDs did not reveal
consistent or obvious performance gains or regressions in the patched
version, although it did show slightly better results for bitmap scans
in some runs.
Best,
Xuneng
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2025-12-11 04:16 Xuneng Zhou <[email protected]>
parent: Xuneng Zhou <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Xuneng Zhou @ 2025-12-11 04:16 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: [email protected]; [email protected]; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; [email protected]
Hi,
I've reviewed patch v2 carefully and found no correctness problems.
Although the performance benefit it brings might be intangible, the
new interface is more robust and simpler than the current one. It does
save a lot of bookkeeping:
1) Initialize slots to InvalidBuffer: Before calling StartReadBuffers,
it had to zero out buffer slots:
while (stream->initialized_buffers < buffer_index + nblocks)
stream->buffers[stream->initialized_buffers++] = InvalidBuffer;
2) Scan for forwarded buffers: After the call, it had to scan the
array to count how many buffers were forwarded:
forwarded = 0;
while (nblocks + forwarded < requested_nblocks &&
stream->buffers[buffer_index + nblocks + forwarded] != InvalidBuffer)
forwarded++;
stream->forwarded_buffers = forwarded;
3) Clear consumed slots: When returning a buffer to the consumer, it
had to clear the slot:
stream->buffers[oldest_buffer_index] = InvalidBuffer;
This could be fragile because:
- read_stream.c must maintain the invariant that unused slots are InvalidBuffer.
- The scanning loop is O(n) and "layer-violating" (it's re-discovering
information that StartReadBuffers already knew).
- Missing a = InvalidBuffer assignment anywhere would cause silent corruption.
With the new interface -- Make npinned an explicit in/out parameter.
1) No initialization needed: StartReadBuffers doesn't look at
uninitialized slots. It only reads buffers[0..npinned-1].
2) No scanning needed: The count is returned directly: *npinned =
actual_npinned - *nblocks; // Buffer manager tells you the count
3) No clearing needed: The "is it forwarded?" check is now i <
actual_npinned, not buffers[i] != InvalidBuffer.
As for the xxx questions, here're some thoughts:
XXX is the single-buffer specialization unaffected due to dead code
elimination, as expected?
- The single-buffer specialization stays unchanged: StartReadBuffer()
still passes npinned=0 with allow_forwarding=false, so the extra
parameter is dead code there and the compiler will inline/strip it
just as before.
XXX unused queue entries could potentially be electrified with
wipe_mem/VALGRIND_MAKE_MEM_NOACCESS, though existing bufmgr.c
assertions are very likely to reveal any fencepost bugs already
- The queue-entry sanitization step removed should not affect
behavior, since StartReadBuffers() now relies on npinned rather than
InvalidBuffer sentinels. Leaving stale values in the array is likely
harmless. If desired, using wipe_mem or VALGRIND_MAKE_MEM_NOACCESS
would simply provide additional debugging hardening.
XXX perhaps we don't need quite so many _npinned <= _nblocks assertions
- I think this makes sense. In my experience, reading through the
read_stream module already requires a fair amount of effort and focus.
There are many edge cases to watch, and the complexity is amplified by
managing two arrays, pointer arithmetic across function calls, and
numerous assertions to validate those operations. Removing assertions
that are not strictly necessary can make the code more readable and
easier to follow, without weakening its defensive properties.
// In read_stream_start_pending_read():
// REMOVE this pre-call assertion:
- Assert(stream->pending_read_npinned <= stream->pending_read_nblocks);
// Entry in StartReadBuffersImpl (bufmgr.c):
- Assert(*npinned <= *nblocks);
// KEEP this post-call assertion:
Assert(stream->pending_read_nblocks >= stream->pending_read_npinned);
The pre-call assertion in read_stream_start_pending_read looks less
needed since the entry assertion in StartReadBuffersImpl catches
exactly a subset of bugs that the entry assertion catches, with no
intervening code that could obscure the failure or change program
state.
--
Best,
Xuneng
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2026-04-11 01:55 Thomas Munro <[email protected]>
parent: Xuneng Zhou <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Thomas Munro @ 2026-04-11 01:55 UTC (permalink / raw)
To: Xuneng Zhou <[email protected]>; +Cc: [email protected]; [email protected]; Michael Paquier <[email protected]>; Tom Lane <[email protected]>; [email protected]; Melanie Plageman <[email protected]>; Heikki Linnakangas <[email protected]>
On Thu, Dec 11, 2025 at 5:16 PM Xuneng Zhou <[email protected]> wrote:
> I've reviewed patch v2 carefully and found no correctness problems.
> Although the performance benefit it brings might be intangible, the
> new interface is more robust and simpler than the current one. It does
> save a lot of bookkeeping:
Thanks Xuneng for all your work reviewing and testing this back in December.
Unfortunately this fell through the cracks (sorry) and I didn't push
it before the freeze. Any objections to pushing it now? I can live
with deferring it until master reopens if that's the call (CC RMT),
but it would be nice to tidy up this design wart if we can.
A short restatement of the situation (but see earlier if you want a
... really long one):
* it's already the case that the StartReadBuffers() sometimes stores
and "forwards" already-pinned buffers in its in/out buffers[] argument
* that currently means the caller has to fill it with InvalidBuffer so
they can be distinguished
* it forwards buffers when it decides to boot them out of the current
IO operation and into the next one (that it assumes the caller will
usually make)
* an example reason to split is when an overlapping already
in-progress IO is encountered after it has the pins
* in the current coding, read_stream.c has to initialise its internal
queue with InvalidBuffer and clear them after use, and count them
after each calls
* in the 18 beta phase a bug in an edge case of that accounting was
fixed by b4212231
This patch shifts some of that responsibility to a more natural place:
* it now seems obvious that StartReadBuffers() should just allow an
in/out npinned counter to travel along with the in/out buffers array
* read_stream.c still needs to know how many there are for pin limit purposes
* it also needs to know in the unusual case that the stream ends
earlier and it has to unpin them
* other than that, it's StartReadBuffers()'s private business to manage them
* StartReadBuffers() can do that with trivial arithmetic, no need to
distinguish and count the buffers
* the end result is much simpler and more robust
(Heikki may recall that we wrestled with versions of this
how-to-avoid-unpinning-across-split-IOs question back in the v17 cycle
when he was reviewing the first vectored I/O + read_stream.c patches,
and there were "two phase" and the "three phase" API ideas, which
eventually came back as "forwarded buffers" in v18, and this would be
a simplification for v19, "forwarded buffers + a counter".)
Attachments:
[text/x-patch] v3-0001-Give-StartReadBuffers-a-more-robust-interface.patch (18.0K, 2-v3-0001-Give-StartReadBuffers-a-more-robust-interface.patch)
download | inline diff:
From fd18aa6530277487fcf2c602c570f36c274bbc81 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Tue, 5 Aug 2025 14:35:09 +1200
Subject: [PATCH v3] Give StartReadBuffers() a more robust interface.
Remove the need for read_stream.c to initialize and clear
stream->buffers[] elements that will be passed to StartReadBuffers(),
and to count them on return. Instead, add an explicit npinned argument
to tell StartReadBuffers() how many buffers were already pinned by the
previous call (if it was a short read), and to learn how many are pinned
beyond *nblocks after this call returns (if it is a short read). It is
an in/out argument. The output of one call can be passed directly as
input to the next call, just like the buffers themselves.
The new function argument count still fits in registers in Unixoid
calling conventions. It doesn't on Windows/x86, but it didn't before
either. The single-buffer StartReadBuffer() specialization is not
affected.
read_stream.c now uses stream->pending_read_npinned to carry that number
from call to call, replacing stream->forwarded_buffers. It no longer
maintains it itself, which now seems a little fragile, layer-violating
and expensive. It still needs to consult the value when respecting pin
limits, but other than that it's entirely StartReadBuffers()' job to
worry about it. The new member variable name better reflects its
relationship with pending_read_{blocknum,nblocks}. It is the number of
buffers in that range that were already pinned by an earlier short read.
After a short read, _blocknum advances, _nblocks shrinks, and _npinned
has already been filled in by StartReadBuffers(), so we're ready for the
next call at next_buffer_index, where *npinned buffers have been placed
by StartReadBuffers().
Assertions about the identity of forwarded buffers are removed, because
StartReadBuffers() already has such assertions, and assertions about
queue entries after that are no longer relevant.
Reviewed-by: Xuneng Zhou <[email protected]>
Discussion: https://postgr.es/m/19006-80fcaaf69000377e%40postgresql.org
---
src/backend/storage/aio/read_stream.c | 107 +++++---------------------
src/backend/storage/buffer/bufmgr.c | 44 +++++++----
src/include/storage/bufmgr.h | 1 +
src/test/modules/test_aio/test_aio.c | 11 +++
4 files changed, 60 insertions(+), 103 deletions(-)
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 2374b4cd507..a6169188a53 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -99,7 +99,6 @@ struct ReadStream
int16 ios_in_progress;
int16 queue_size;
int16 max_pinned_buffers;
- int16 forwarded_buffers;
int16 pinned_buffers;
/*
@@ -115,7 +114,6 @@ struct ReadStream
int16 combine_distance;
int16 readahead_distance;
uint16 distance_decay_holdoff;
- int16 initialized_buffers;
int16 resume_readahead_distance;
int16 resume_combine_distance;
int read_buffers_flags;
@@ -147,6 +145,7 @@ struct ReadStream
/* The read operation we are currently preparing. */
BlockNumber pending_read_blocknum;
int16 pending_read_nblocks;
+ int pending_read_npinned;
/* Space for buffers and optional per-buffer private data. */
size_t per_buffer_data_size;
@@ -318,10 +317,8 @@ static bool
read_stream_start_pending_read(ReadStream *stream)
{
bool need_wait;
- int requested_nblocks;
int nblocks;
int flags;
- int forwarded;
int16 io_index;
int16 overflow;
int16 buffer_index;
@@ -335,33 +332,12 @@ read_stream_start_pending_read(ReadStream *stream)
Assert(stream->pinned_buffers + stream->pending_read_nblocks <=
stream->max_pinned_buffers);
-#ifdef USE_ASSERT_CHECKING
/* We had better not be overwriting an existing pinned buffer. */
if (stream->pinned_buffers > 0)
Assert(stream->next_buffer_index != stream->oldest_buffer_index);
else
Assert(stream->next_buffer_index == stream->oldest_buffer_index);
- /*
- * Pinned buffers forwarded by a preceding StartReadBuffers() call that
- * had to split the operation should match the leading blocks of this
- * following StartReadBuffers() call.
- */
- Assert(stream->forwarded_buffers <= stream->pending_read_nblocks);
- for (int i = 0; i < stream->forwarded_buffers; ++i)
- Assert(BufferGetBlockNumber(stream->buffers[stream->next_buffer_index + i]) ==
- stream->pending_read_blocknum + i);
-
- /*
- * Check that we've cleared the queue/overflow entries corresponding to
- * the rest of the blocks covered by this read, unless it's the first go
- * around and we haven't even initialized them yet.
- */
- for (int i = stream->forwarded_buffers; i < stream->pending_read_nblocks; ++i)
- Assert(stream->next_buffer_index + i >= stream->initialized_buffers ||
- stream->buffers[stream->next_buffer_index + i] == InvalidBuffer);
-#endif
-
/* Do we need to issue read-ahead advice? */
flags = stream->read_buffers_flags;
if (stream->advice_enabled)
@@ -402,9 +378,9 @@ read_stream_start_pending_read(ReadStream *stream)
buffer_limit = Min(GetAdditionalLocalPinLimit(), PG_INT16_MAX);
else
buffer_limit = Min(GetAdditionalPinLimit(), PG_INT16_MAX);
- Assert(stream->forwarded_buffers <= stream->pending_read_nblocks);
+ Assert(stream->pending_read_npinned <= stream->pending_read_nblocks);
- buffer_limit += stream->forwarded_buffers;
+ buffer_limit += stream->pending_read_npinned;
buffer_limit = Min(buffer_limit, PG_INT16_MAX);
if (buffer_limit == 0 && stream->pinned_buffers == 0)
@@ -431,20 +407,15 @@ read_stream_start_pending_read(ReadStream *stream)
/*
* We say how many blocks we want to read, but it may be smaller on return
- * if the buffer manager decides to shorten the read. Initialize buffers
- * to InvalidBuffer (= not a forwarded buffer) as input on first use only,
- * and keep the original nblocks number so we can check for forwarded
- * buffers as output, below.
+ * if the buffer manager decides to shorten the read.
*/
buffer_index = stream->next_buffer_index;
io_index = stream->next_io_index;
- while (stream->initialized_buffers < buffer_index + nblocks)
- stream->buffers[stream->initialized_buffers++] = InvalidBuffer;
- requested_nblocks = nblocks;
need_wait = StartReadBuffers(&stream->ios[io_index].op,
&stream->buffers[buffer_index],
stream->pending_read_blocknum,
&nblocks,
+ &stream->pending_read_npinned,
flags);
stream->pinned_buffers += nblocks;
@@ -502,28 +473,15 @@ read_stream_start_pending_read(ReadStream *stream)
read_stream_count_io(stream, nblocks, stream->ios_in_progress);
}
- /*
- * How many pins were acquired but forwarded to the next call? These need
- * to be passed to the next StartReadBuffers() call by leaving them
- * exactly where they are in the queue, or released if the stream ends
- * early. We need the number for accounting purposes, since they are not
- * counted in stream->pinned_buffers but we already hold them.
- */
- forwarded = 0;
- while (nblocks + forwarded < requested_nblocks &&
- stream->buffers[buffer_index + nblocks + forwarded] != InvalidBuffer)
- forwarded++;
- stream->forwarded_buffers = forwarded;
-
/*
* We gave a contiguous range of buffer space to StartReadBuffers(), but
* we want it to wrap around at queue_size. Copy overflowing buffers to
* the front of the array where they'll be consumed, but also leave a copy
* in the overflow zone which the I/O operation has a pointer to (it needs
- * a contiguous array). Both copies will be cleared when the buffers are
- * handed to the consumer.
+ * a contiguous array).
*/
- overflow = (buffer_index + nblocks + forwarded) - stream->queue_size;
+ overflow = (buffer_index + nblocks + stream->pending_read_npinned) -
+ stream->queue_size;
if (overflow > 0)
{
Assert(overflow < stream->queue_size); /* can't overlap */
@@ -542,6 +500,7 @@ read_stream_start_pending_read(ReadStream *stream)
/* Adjust the pending read to cover the remaining portion, if any. */
stream->pending_read_blocknum += nblocks;
stream->pending_read_nblocks -= nblocks;
+ Assert(stream->pending_read_nblocks >= stream->pending_read_npinned);
return true;
}
@@ -1036,13 +995,12 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
/* Fast path assumptions. */
Assert(stream->ios_in_progress == 0);
- Assert(stream->forwarded_buffers == 0);
Assert(stream->pinned_buffers == 1);
Assert(stream->readahead_distance == 1);
Assert(stream->combine_distance == 1);
Assert(stream->pending_read_nblocks == 0);
+ Assert(stream->pending_read_npinned == 0);
Assert(stream->per_buffer_data_size == 0);
- Assert(stream->initialized_buffers > stream->oldest_buffer_index);
/* We're going to return the buffer we pinned last time. */
oldest_buffer_index = stream->oldest_buffer_index;
@@ -1124,7 +1082,6 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
stream->combine_distance = 0;
stream->oldest_buffer_index = stream->next_buffer_index;
stream->pinned_buffers = 0;
- stream->buffers[oldest_buffer_index] = InvalidBuffer;
}
stream->fast_path = false;
@@ -1274,16 +1231,6 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
stream->seq_until_processed = InvalidBlockNumber;
}
- /*
- * We must zap this queue entry, or else it would appear as a forwarded
- * buffer. If it's potentially in the overflow zone (ie from a
- * multi-block I/O that wrapped around the queue), also zap the copy.
- */
- stream->buffers[oldest_buffer_index] = InvalidBuffer;
- if (oldest_buffer_index < stream->io_combine_limit - 1)
- stream->buffers[stream->queue_size + oldest_buffer_index] =
- InvalidBuffer;
-
#if defined(CLOBBER_FREED_MEMORY) || defined(USE_VALGRIND)
/*
@@ -1329,26 +1276,13 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
#ifndef READ_STREAM_DISABLE_FAST_PATH
/* See if we can take the fast path for all-cached scans next time. */
if (stream->ios_in_progress == 0 &&
- stream->forwarded_buffers == 0 &&
stream->pinned_buffers == 1 &&
stream->readahead_distance == 1 &&
stream->combine_distance == 1 &&
stream->pending_read_nblocks == 0 &&
+ stream->pending_read_npinned == 0 &&
stream->per_buffer_data_size == 0)
{
- /*
- * The fast path spins on one buffer entry repeatedly instead of
- * rotating through the whole queue and clearing the entries behind
- * it. If the buffer it starts with happened to be forwarded between
- * StartReadBuffers() calls and also wrapped around the circular queue
- * partway through, then a copy also exists in the overflow zone, and
- * it won't clear it out as the regular path would. Do that now, so
- * it doesn't need code for that.
- */
- if (stream->oldest_buffer_index < stream->io_combine_limit - 1)
- stream->buffers[stream->queue_size + stream->oldest_buffer_index] =
- InvalidBuffer;
-
stream->fast_path = true;
}
#endif
@@ -1421,24 +1355,19 @@ read_stream_reset(ReadStream *stream)
while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
ReleaseBuffer(buffer);
- /* Unpin any unused forwarded buffers. */
+ /*
+ * Unpin any unused forwarded buffers. See comments above
+ * StartReadBuffers() for why we do this.
+ */
index = stream->next_buffer_index;
- while (index < stream->initialized_buffers &&
- (buffer = stream->buffers[index]) != InvalidBuffer)
+ while (stream->pending_read_npinned > 0)
{
- Assert(stream->forwarded_buffers > 0);
- stream->forwarded_buffers--;
- ReleaseBuffer(buffer);
-
- stream->buffers[index] = InvalidBuffer;
- if (index < stream->io_combine_limit - 1)
- stream->buffers[stream->queue_size + index] = InvalidBuffer;
-
+ ReleaseBuffer(stream->buffers[index]);
+ stream->pending_read_npinned--;
if (++index == stream->queue_size)
index = 0;
}
- Assert(stream->forwarded_buffers == 0);
Assert(stream->pinned_buffers == 0);
Assert(stream->ios_in_progress == 0);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 3cc0b0bdd92..b966a107b9a 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1369,16 +1369,20 @@ StartReadBuffersImpl(ReadBuffersOperation *operation,
Buffer *buffers,
BlockNumber blockNum,
int *nblocks,
+ int *npinned,
int flags,
bool allow_forwarding)
{
int actual_nblocks = *nblocks;
+ int actual_npinned = *npinned;
int maxcombine = 0;
bool did_start_io;
IOContext io_context;
IOObject io_object;
+ Assert(*npinned == 0 || allow_forwarding);
Assert(*nblocks == 1 || allow_forwarding);
+ Assert(*npinned <= *nblocks);
Assert(*nblocks > 0);
Assert(*nblocks <= MAX_IO_COMBINE_LIMIT);
@@ -1397,7 +1401,7 @@ StartReadBuffersImpl(ReadBuffersOperation *operation,
{
bool found;
- if (allow_forwarding && buffers[i] != InvalidBuffer)
+ if (allow_forwarding && i < actual_npinned)
{
BufferDesc *bufHdr;
@@ -1443,6 +1447,7 @@ StartReadBuffersImpl(ReadBuffersOperation *operation,
operation->strategy,
io_object, io_context,
&found);
+ actual_npinned++;
}
if (found)
@@ -1456,6 +1461,7 @@ StartReadBuffersImpl(ReadBuffersOperation *operation,
if (i == 0)
{
*nblocks = 1;
+ *npinned = actual_npinned - 1;
#ifdef USE_ASSERT_CHECKING
@@ -1579,22 +1585,30 @@ StartReadBuffersImpl(ReadBuffersOperation *operation,
CheckReadBuffersOperation(operation, !did_start_io);
+ /* Pinned buffers not included in final nblocks, forwarded to next call. */
+ *npinned = actual_npinned - *nblocks;
+
return did_start_io;
}
/*
* Begin reading a range of blocks beginning at blockNum and extending for
- * *nblocks. *nblocks and the buffers array are in/out parameters. On entry,
- * the buffers elements covered by *nblocks must hold either InvalidBuffer or
- * buffers forwarded by an earlier call to StartReadBuffers() that was split
- * and is now being continued. On return, *nblocks holds the number of blocks
- * accepted by this operation. If it is less than the original number then
- * this operation has been split, but buffer elements up to the original
- * requested size may hold forwarded buffers to be used for a continuing
- * operation. The caller must either start a new I/O beginning at the block
- * immediately following the blocks accepted by this call and pass those
- * buffers back in, or release them if it chooses not to. It shouldn't make
- * any other use of or assumptions about forwarded buffers.
+ * *nblocks. *nblocks, *npinned and the buffers array are in/out parameters.
+ *
+ * On entry, *nblocks is the desire number of block to read. On exit, it may
+ * be a smaller number if the operation was split.
+ *
+ * On entry, *npinned is the number of pinned buffers forwarded by an earlier
+ * operation that was split. On exit, it is the number forwarded by this call,
+ * which should be passed to the following call when continuing to read the
+ * same sequence of blocks, along with the corresponding buffers.
+ *
+ * When buffers are forwarded, as reported by *npinned, the caller must either
+ * start a new I/O beginning at the block immediately following the blocks
+ * accepted by this call (*nblocks on exit) and pass those buffers back in head
+ * position, or release them if it chooses not to. They are located in the
+ * buffers array beginning at index *nblocks (on exit). It shouldn't make any
+ * other use of or assumptions about forwarded buffers.
*
* If false is returned, no I/O is necessary and the buffers covered by
* *nblocks on exit are valid and ready to be accessed. If true is returned,
@@ -1610,9 +1624,10 @@ StartReadBuffers(ReadBuffersOperation *operation,
Buffer *buffers,
BlockNumber blockNum,
int *nblocks,
+ int *npinned,
int flags)
{
- return StartReadBuffersImpl(operation, buffers, blockNum, nblocks, flags,
+ return StartReadBuffersImpl(operation, buffers, blockNum, nblocks, npinned, flags,
true /* expect forwarded buffers */ );
}
@@ -1631,9 +1646,10 @@ StartReadBuffer(ReadBuffersOperation *operation,
int flags)
{
int nblocks = 1;
+ int npinned = 0;
bool result;
- result = StartReadBuffersImpl(operation, buffer, blocknum, &nblocks, flags,
+ result = StartReadBuffersImpl(operation, buffer, blocknum, &nblocks, &npinned, flags,
false /* single block, no forwarding */ );
Assert(nblocks == 1); /* single block can't be short */
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 6837b35fc6d..262dc4262d0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -250,6 +250,7 @@ extern bool StartReadBuffers(ReadBuffersOperation *operation,
Buffer *buffers,
BlockNumber blockNum,
int *nblocks,
+ int *npinned,
int flags);
extern bool WaitReadBuffers(ReadBuffersOperation *operation);
diff --git a/src/test/modules/test_aio/test_aio.c b/src/test/modules/test_aio/test_aio.c
index 35efba1a5e3..e7c4864136c 100644
--- a/src/test/modules/test_aio/test_aio.c
+++ b/src/test/modules/test_aio/test_aio.c
@@ -709,6 +709,7 @@ read_buffers(PG_FUNCTION_ARGS)
Datum *buffers_datum;
bool *io_reqds;
int *nblocks_per_io;
+ int npinned;
Assert(nblocks > 0);
@@ -724,6 +725,15 @@ read_buffers(PG_FUNCTION_ARGS)
rel = relation_open(relid, AccessShareLock);
smgr = RelationGetSmgr(rel);
+ /*
+ * Pins might be forwarded between calls, if IOs are split after pins are
+ * obtained. Such buffers will be provided to the next call in the
+ * buffers array, and their count will be carrier between calls in this
+ * variable. (Only if we decided to give up the loop below would we need
+ * to consult this, to unpin them.)
+ */
+ npinned = 0;
+
/*
* Do StartReadBuffers() until IO for all the required blocks has been
* started (if required).
@@ -744,6 +754,7 @@ read_buffers(PG_FUNCTION_ARGS)
&buffers[nblocks_done],
startblock + nblocks_done,
&nblocks_this_io,
+ &npinned,
0);
nblocks_per_io[nios] = nblocks_this_io;
nios++;
--
2.53.0
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2026-04-11 02:22 Tom Lane <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Tom Lane @ 2026-04-11 02:22 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Xuneng Zhou <[email protected]>; [email protected]; [email protected]; Michael Paquier <[email protected]>; [email protected]; Melanie Plageman <[email protected]>; Heikki Linnakangas <[email protected]>
Thomas Munro <[email protected]> writes:
> Unfortunately this fell through the cracks (sorry) and I didn't push
> it before the freeze. Any objections to pushing it now? I can live
> with deferring it until master reopens if that's the call (CC RMT),
> but it would be nice to tidy up this design wart if we can.
This doesn't seem to me to be a "new feature", so I'm not sure that
feature freeze applies.
> This patch shifts some of that responsibility to a more natural place:
> * it now seems obvious that StartReadBuffers() should just allow an
> in/out npinned counter to travel along with the in/out buffers array
> * read_stream.c still needs to know how many there are for pin limit purposes
> * it also needs to know in the unusual case that the stream ends
> earlier and it has to unpin them
> * other than that, it's StartReadBuffers()'s private business to manage them
> * StartReadBuffers() can do that with trivial arithmetic, no need to
> distinguish and count the buffers
> * the end result is much simpler and more robust
IIUC, this is basically fixing StartReadBuffers' API, and if we don't
do it now then the v19 code will differ from both earlier and later
branches. That doesn't seem like a great place to be when you think
about having to back-patch bug fixes in this area.
So yeah, squeezing this in now seems like a good bet to me.
regards, tom lane
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2026-04-13 17:24 Michael Paquier <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Michael Paquier @ 2026-04-13 17:24 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Thomas Munro <[email protected]>; Xuneng Zhou <[email protected]>; [email protected]; [email protected]; [email protected]; Melanie Plageman <[email protected]>; Heikki Linnakangas <[email protected]>
On Fri, Apr 10, 2026 at 10:22:56PM -0400, Tom Lane wrote:
> Thomas Munro <[email protected]> writes:
>> Unfortunately this fell through the cracks (sorry) and I didn't push
>> it before the freeze. Any objections to pushing it now? I can live
>> with deferring it until master reopens if that's the call (CC RMT),
>> but it would be nice to tidy up this design wart if we can.
>
> This doesn't seem to me to be a "new feature", so I'm not sure that
> feature freeze applies.
I have read the patch and I would agree this stance.
>> * it now seems obvious that StartReadBuffers() should just allow an
>> in/out npinned counter to travel along with the in/out buffers array
>> * read_stream.c still needs to know how many there are for pin limit purposes
>> * it also needs to know in the unusual case that the stream ends
>> earlier and it has to unpin them
>> * other than that, it's StartReadBuffers()'s private business to manage them
>> * StartReadBuffers() can do that with trivial arithmetic, no need to
>> distinguish and count the buffers
>> * the end result is much simpler and more robust
>
> IIUC, this is basically fixing StartReadBuffers' API, and if we don't
> do it now then the v19 code will differ from both earlier and later
> branches. That doesn't seem like a great place to be when you think
> about having to back-patch bug fixes in this area.
>
> So yeah, squeezing this in now seems like a good bet to me.
+1 for doing it now. It's also worth noting that this shaves code
line-wise.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
@ 2026-04-15 01:30 Xuneng Zhou <[email protected]>
parent: Michael Paquier <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Xuneng Zhou @ 2026-04-15 01:30 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Tom Lane <[email protected]>; Thomas Munro <[email protected]>; [email protected]; [email protected]; [email protected]; Melanie Plageman <[email protected]>; Heikki Linnakangas <[email protected]>
On Tue, Apr 14, 2026 at 1:24 AM Michael Paquier <[email protected]> wrote:
>
> On Fri, Apr 10, 2026 at 10:22:56PM -0400, Tom Lane wrote:
> > Thomas Munro <[email protected]> writes:
> >> Unfortunately this fell through the cracks (sorry) and I didn't push
> >> it before the freeze. Any objections to pushing it now? I can live
> >> with deferring it until master reopens if that's the call (CC RMT),
> >> but it would be nice to tidy up this design wart if we can.
> >
> > This doesn't seem to me to be a "new feature", so I'm not sure that
> > feature freeze applies.
>
> I have read the patch and I would agree this stance.
>
> >> * it now seems obvious that StartReadBuffers() should just allow an
> >> in/out npinned counter to travel along with the in/out buffers array
> >> * read_stream.c still needs to know how many there are for pin limit purposes
> >> * it also needs to know in the unusual case that the stream ends
> >> earlier and it has to unpin them
> >> * other than that, it's StartReadBuffers()'s private business to manage them
> >> * StartReadBuffers() can do that with trivial arithmetic, no need to
> >> distinguish and count the buffers
> >> * the end result is much simpler and more robust
> >
> > IIUC, this is basically fixing StartReadBuffers' API, and if we don't
> > do it now then the v19 code will differ from both earlier and later
> > branches. That doesn't seem like a great place to be when you think
> > about having to back-patch bug fixes in this area.
> >
> > So yeah, squeezing this in now seems like a good bet to me.
>
> +1 for doing it now. It's also worth noting that this shaves code
> line-wise.
+1.
--
Best,
Xuneng
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-04-15 01:30 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-10-11 12:32 Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer Xuneng Zhou <[email protected]>
2025-12-11 04:16 ` Xuneng Zhou <[email protected]>
2026-04-11 01:55 ` Thomas Munro <[email protected]>
2026-04-11 02:22 ` Tom Lane <[email protected]>
2026-04-13 17:24 ` Michael Paquier <[email protected]>
2026-04-15 01:30 ` Xuneng Zhou <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox