Re: Don't synchronously wait for already-in-progress IO in read stream

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Andres Freund <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: [email protected]
Cc: Peter Geoghegan <[email protected]>
Cc: Tomas Vondra <[email protected]>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: Tue, 17 Mar 2026 13:26:02 -0400
Message-ID: <u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2@hph6fvdiiria> (raw)
In-Reply-To: <CAAKRu_YV0D_9wWgGkHdWgQsooqmcc08Uqb3tqxzrwcDXHajDtA@mail.gmail.com>
References: <zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv>
	<CA+hUKGKwzPJNcxCFUswe0K=0e7rG79dvqt9o17-E3soxLPxF+Q@mail.gmail.com>
	<CAAKRu_atdnPCCy=kfxqWT62Ckaiz3G5t=S97tW24CuL3i3fFfQ@mail.gmail.com>
	<CAN55FZ17ScU--nGM8cjjPtwPMOMonKZ9vMehPcc2sOQNLVskvA@mail.gmail.com>
	<CAAKRu_ZM1epxTdt2=4-g4ff6UC09zne+0xg_gNv3d7LEcxEvNA@mail.gmail.com>
	<CAN55FZ0ymY=XfaHQXHKG+Mbit6BZzZ9d-UWb6dmRXYm_xCvfQg@mail.gmail.com>
	<CAAKRu_YV0D_9wWgGkHdWgQsooqmcc08Uqb3tqxzrwcDXHajDtA@mail.gmail.com>

Hi,

> Attached v5 adds some comments to the tests, fixes a few nits in the
> actual code, and adds a commit to fix what I think is an existing
> off-by-one error in TRACE_POSTGRESQL_BUFFER_READ_DONE.


> Subject: [PATCH v5 3/5] Fix off-by-one error in read IO tracing
>
> ---
>  src/backend/storage/buffer/bufmgr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 00bc609529a..0723d4f3dd8 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -1990,7 +1990,7 @@ AsyncReadBuffers(ReadBuffersOperation *operation, int *nblocks_progress)
>  		 * must have started out as a miss in PinBufferForBlock(). The other
>  		 * backend will track this as a 'read'.
>  		 */
> -		TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + operation->nblocks_done,
> +		TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + operation->nblocks_done - 1,
>  										  operation->smgr->smgr_rlocator.locator.spcOid,
>  										  operation->smgr->smgr_rlocator.locator.dbOid,
>  										  operation->smgr->smgr_rlocator.locator.relNumber,
> --

Ah, the issue is that we already incremented nblocks_done, right?  Maybe it'd
be easier to understand if we stashed blocknum + nblocks_done into a local
var, and use it in in both branches of if (!ReadBuffersCanStartIO())?

This probably needs to be backpatched...



> Subject: [PATCH v5 4/5] Make buffer hit helper
>
> Already two places count buffer hits, requiring quite a few lines of
> code since we do accounting in so many places. Future commits will add
> more locations, so refactor into a helper.
>
> Reviewed-by: Nazir Bilal Yavuz <[email protected]>
> Discussion: https://postgr.es/m/flat/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw%403p3zu522yykv
> ---
>  src/backend/storage/buffer/bufmgr.c | 111 ++++++++++++++--------------
>  1 file changed, 56 insertions(+), 55 deletions(-)
>
> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 0723d4f3dd8..399004c2e44 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -648,6 +648,10 @@ static inline BufferDesc *BufferAlloc(SMgrRelation smgr,
>  									  bool *foundPtr, IOContext io_context);
>  static bool AsyncReadBuffers(ReadBuffersOperation *operation, int *nblocks_progress);
>  static void CheckReadBuffersOperation(ReadBuffersOperation *operation, bool is_complete);
> +
> +static pg_attribute_always_inline void CountBufferHit(BufferAccessStrategy strategy,
> +													  Relation rel, char persistence, SMgrRelation smgr,
> +													  ForkNumber forknum, BlockNumber blocknum);
>  static Buffer GetVictimBuffer(BufferAccessStrategy strategy, IOContext io_context);
>  static void FlushUnlockedBuffer(BufferDesc *buf, SMgrRelation reln,
>  								IOObject io_object, IOContext io_context);
> @@ -1226,8 +1230,6 @@ PinBufferForBlock(Relation rel,
>  				  bool *foundPtr)
>  {
>  	BufferDesc *bufHdr;
> -	IOContext	io_context;
> -	IOObject	io_object;
>
>  	Assert(blockNum != P_NEW);
>
> @@ -1236,17 +1238,6 @@ PinBufferForBlock(Relation rel,
>  			persistence == RELPERSISTENCE_PERMANENT ||
>  			persistence == RELPERSISTENCE_UNLOGGED));
>
> -	if (persistence == RELPERSISTENCE_TEMP)
> -	{
> -		io_context = IOCONTEXT_NORMAL;
> -		io_object = IOOBJECT_TEMP_RELATION;
> -	}
> -	else
> -	{
> -		io_context = IOContextForStrategy(strategy);
> -		io_object = IOOBJECT_RELATION;
> -	}
> -

I'm mildly worried that this will lead to a bit worse code generation, the
compiler might have a harder time figuring out that io_context/io_object
doesn't change across multiple PinBufferForBlock calls. Although it already
might be unable to do so (we don't mark IOContextForStrategy as
pure [1]).

I kinda wonder if, for StartReadBuffersImpl(), we should go the opposite
direction, and explicitly look up IOContextForStrategy(strategy) *before* the
actual_nblocks loop to make sure the compiler doesn't inject external function
calls (which will in all likelihood require register spilling etc).

I don't think that necessarily has to conflict with the goal of this patch -
most of the the deduplicated stuff isn't io_context, so the helper will be
beneficial even if have to pull out the io_context/io_object determination to
the callsites.


>  	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
>  									   smgr->smgr_rlocator.locator.spcOid,
>  									   smgr->smgr_rlocator.locator.dbOid,
> @@ -1254,18 +1245,11 @@ PinBufferForBlock(Relation rel,
>  									   smgr->smgr_rlocator.backend);
>
>  	if (persistence == RELPERSISTENCE_TEMP)
> -	{
>  		bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, foundPtr);
> -		if (*foundPtr)
> -			pgBufferUsage.local_blks_hit++;
> -	}
>  	else
> -	{
>  		bufHdr = BufferAlloc(smgr, persistence, forkNum, blockNum,
> -							 strategy, foundPtr, io_context);
> -		if (*foundPtr)
> -			pgBufferUsage.shared_blks_hit++;
> -	}
> +							 strategy, foundPtr, IOContextForStrategy(strategy));
> +
>  	if (rel)
>  	{
>  		/*

And here it might end up adding a separate persistence == RELPERSISTENCE_TEMP
branch in CountBufferHit(), I suspect the compiler may not be able to optimize
it away.

At the very least I'd invert the call to CountBufferHit() and the
pgstat_count_buffer_read(), as the latter will probably prevent most
optimizations (due to the compiler not being able to prove that
(rel)->pgstat_info->counts.blocks_fetched is a different memory location as
*foundPtr).



> +/*
> + * We track various stats related to buffer hits. Because this is done in a
> + * few separate places, this helper exists for convenience.
> + */
> +static pg_attribute_always_inline void
> +CountBufferHit(BufferAccessStrategy strategy,
> +			   Relation rel, char persistence, SMgrRelation smgr,
> +			   ForkNumber forknum, BlockNumber blocknum)
> +{
> +	IOContext	io_context;
> +	IOObject	io_object;
> +
> +	if (persistence == RELPERSISTENCE_TEMP)
> +	{
> +		io_context = IOCONTEXT_NORMAL;
> +		io_object = IOOBJECT_TEMP_RELATION;
> +	}
> +	else
> +	{
> +		io_context = IOContextForStrategy(strategy);
> +		io_object = IOOBJECT_RELATION;
> +	}
> +
> +	TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum,
> +									  blocknum,
> +									  smgr->smgr_rlocator.locator.spcOid,
> +									  smgr->smgr_rlocator.locator.dbOid,
> +									  smgr->smgr_rlocator.locator.relNumber,
> +									  smgr->smgr_rlocator.backend,
> +									  true);
> +
> +	if (persistence == RELPERSISTENCE_TEMP)
> +		pgBufferUsage.local_blks_hit += 1;
> +	else
> +		pgBufferUsage.shared_blks_hit += 1;
> +
> +	if (rel)
> +		pgstat_count_buffer_hit(rel);
> +
> +	pgstat_count_io_op(io_object, io_context, IOOP_HIT, 1, 0);
> +
> +	if (VacuumCostActive)
> +		VacuumCostBalance += VacuumCostPageHit;
> +}

I don't think "Count*" is a great name for something that does tracepoints and
vacuum cost balance accounting, the latter actually changes behavior of the
program due to the sleeps it injects.

The first alternative I have is AccountForBufferHit(), not great, but still
seems a bit better.



> From 4d737fa14f333abc4ee6ade8cb0340530695e887 Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <[email protected]>
> Date: Fri, 23 Jan 2026 14:00:31 -0500
> Subject: [PATCH v5 5/5] Don't wait for already in-progress IO
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> When a backend attempts to start a read on a buffer and finds that I/O
> is already in progress, it previously waited for that I/O to complete
> before initiating reads for any other buffers. Although the backend must
> still wait for the I/O to finish when later acquiring the buffer, it
> should not need to wait at read start time. Other buffers may be
> available for I/O, and in some workloads this waiting significantly
> reduces concurrency.
>
> For example, index scans may repeatedly request the same heap block. If
> the backend waits each time it encounters an in-progress read, the
> access pattern effectively degenerates into synchronous I/O. By
> introducing the concept of foreign I/O operations, a backend can record
> the buffer’s wait reference and defer waiting until WaitReadBuffers()
> when it actually acquires the buffer.
>
> In rare cases, a backend may still need to wait when starting a read if
> it encounters a buffer after another backend has set BM_IO_IN_PROGRESS
> but before the buffer descriptor’s wait reference has been set. Such
> windows should be brief and uncommon.
>
> Author: Melanie Plageman <[email protected]>
> Reviewed-by: Andres Freund <[email protected]>
> Reviewed-by: Nazir Bilal Yavuz <[email protected]>
> Discussion: https://postgr.es/m/flat/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw%403p3zu522yykv

> +/*
> + * In AsyncReadBuffers(), when preparing a buffer for reading and setting
> + * BM_IO_IN_PROGRESS, the buffer may already have I/O in progress or may
> + * already contain the desired block. AsyncReadBuffers() must distinguish
> + * between these cases (and the case where it should initiate I/O) so it can
> + * mark an in-progress buffer as foreign I/O rather than waiting on it.
> + */
> +typedef enum PrepareReadBuffer_Status
> +{
> +	READ_BUFFER_ALREADY_DONE,
> +	READ_BUFFER_IN_PROGRESS,
> +	READ_BUFFER_READY_FOR_IO,
> +} PrepareReadBuffer_Status;

I don't personally like mixing underscore and camel case naming within one
name.

I wonder if might be worth splitting this up in a refactoring and a
"behavioural change" commit. Might be too complicated.

Candidates for a split seem to be:
- Moving pgaio_io_acquire_nb() to earlier
- Introduce PrepareNewReadBufferIO/PrepareAdditionalReadBuffer without support
for READ_BUFFER_IN_PROGRESS
- introduce READ_BUFFER_IN_PROGRESS


>  /*
>   * We track various stats related to buffer hits. Because this is done in a
>   * few separate places, this helper exists for convenience.
> @@ -1815,8 +1791,11 @@ WaitReadBuffers(ReadBuffersOperation *operation)
>  			 * b) reports some time as waiting, even if we never waited
>  			 *
>  			 * we first check if we already know the IO is complete.
> +			 *
> +			 * Note that operation->io_return is uninitialized for foreign IO,
> +			 * so we cannot count that wait time.
>  			 */

I'm confused - your comment says we can't count wait time with a foreign IO,
but then oes on to count foreign IO time?  The lack of io_return just means we
can't do  the cheaper pre-check for PGAIO_RS_UNKNOWN, no?


> -			if (aio_ret->result.status == PGAIO_RS_UNKNOWN &&
> +			if ((operation->foreign_io || aio_ret->result.status == PGAIO_RS_UNKNOWN) &&
>  				!pgaio_wref_check_done(&operation->io_wref))
>  			{
>  				instr_time	io_start = pgstat_prepare_io_time(track_io_timing);
> @@ -1835,11 +1814,33 @@ WaitReadBuffers(ReadBuffersOperation *operation)
>  				Assert(pgaio_wref_check_done(&operation->io_wref));
>  			}
>
> -			/*
> -			 * We now are sure the IO completed. Check the results. This
> -			 * includes reporting on errors if there were any.
> -			 */
> -			ProcessReadBuffersResult(operation);
> +			if (unlikely(operation->foreign_io))
> +			{
> +				Buffer		buffer = operation->buffers[operation->nblocks_done];
> +				BufferDesc *desc = BufferIsLocal(buffer) ?
> +					GetLocalBufferDescriptor(-buffer - 1) :
> +					GetBufferDescriptor(buffer - 1);
> +				uint32		buf_state = pg_atomic_read_u64(&desc->state);
> +
> +				if (buf_state & BM_VALID)
> +				{
> +					operation->nblocks_done += 1;
> +					Assert(operation->nblocks_done <= operation->nblocks);
> +
> +					CountBufferHit(operation->strategy,
> +								   operation->rel, operation->persistence,
> +								   operation->smgr, operation->forknum,
> +								   operation->blocknum + operation->nblocks_done - 1);

Probably worth including a comment explaining why we count this as a hit. IIRC
earlier versions had such a comment.


> +/*
> + * Local version of PrepareNewReadBufferIO(). Here instead of localbuf.c to
> + * avoid an external function call.
> + */
> +static PrepareReadBuffer_Status
> +PrepareNewLocalReadBufferIO(ReadBuffersOperation *operation,
> +							Buffer buffer)

Hm, seems the test in 0002 should be extended to cover the the temp table case.



> +{
> +	BufferDesc *desc = GetLocalBufferDescriptor(-buffer - 1);
> +	uint64		buf_state = pg_atomic_read_u64(&desc->state);
> +
> +	/* Already valid, no work to do */
> +	if (buf_state & BM_VALID)
> +	{
> +		pgaio_wref_clear(&operation->io_wref);
> +		return READ_BUFFER_ALREADY_DONE;
> +	}

Is this reachable for local buffers?


> +	pgaio_submit_staged();
> +
> +	if (pgaio_wref_valid(&desc->io_wref))
> +	{
> +		operation->io_wref = desc->io_wref;
> +		operation->foreign_io = true;
> +		return READ_BUFFER_IN_PROGRESS;
> +	}
> +
> +	/*
> +	 * While it is possible for a buffer to have been prepared for IO but not
> +	 * yet had its wait reference set, there's no way for us to know that for
> +	 * temporary buffers. Thus, we'll prepare for own IO on this buffer.
> +	 */
> +	return READ_BUFFER_READY_FOR_IO;

Is that actually possible? And would it be ok to just do start IO in that
case?


> +/*
> + * Try to start IO on the first buffer in a new run of blocks. If AIO is in
> + * progress, be it in this backend or another backend, we just associate the
> + * wait reference with the operation and wait in WaitReadBuffers(). This turns
> + * out to be important for performance in two workloads:
> + *
> + * 1) A read stream that has to read the same block multiple times within the
> + *    readahead distance. This can happen e.g. for the table accesses of an
> + *    index scan.
> + *
> + * 2) Concurrent scans by multiple backends on the same relation.
> + *
> + * If we were to synchronously wait for the in-progress IO, we'd not be able
> + * to keep enough I/O in flight.
> + *
> + * If we do find there is ongoing I/O for the buffer, we set up a 1-block
> + * ReadBuffersOperation that WaitReadBuffers then can wait on.
> + *
> + * It's possible that another backend has started IO on the buffer but not yet
> + * set its wait reference. In this case, we have no choice but to wait for
> + * either the wait reference to be valid or the IO to be done.
> + */
> +static PrepareReadBuffer_Status
> +PrepareNewReadBufferIO(ReadBuffersOperation *operation,
> +					   Buffer buffer)
> +{

I'm not sure I love "New" here, compared to "Additional". Perhaps "Begin" &
"Continue"? Or "First" & "Additional"?  Or ...


> +	uint64		buf_state;
> +	BufferDesc *desc;
> +
> +	if (BufferIsLocal(buffer))
> +		return PrepareNewLocalReadBufferIO(operation, buffer);
> +
> +	ResourceOwnerEnlarge(CurrentResourceOwner);
> +	desc = GetBufferDescriptor(buffer - 1);
> +
> +	for (;;)
> +	{
> +		buf_state = LockBufHdr(desc);

Perhaps worth adding an
   Assert(buf_state & BM_TAG_VALID)?


> +		/* Already valid, no work to do */
> +		if (buf_state & BM_VALID)
> +		{
> +			UnlockBufHdr(desc);
> +			pgaio_wref_clear(&operation->io_wref);

Perhaps we should clear &operation->io_wref once at the start? Because right
now it'll be cleared if BM_VALID and it'll be set if BM_IO_IN_PROGRESS &&
&desc->io_wref, but it won't be touched when in BM_IO_IN_PROGRESS without a
wref set.  It seems either we should just touch &operation->io_wref if
  BM_IO_IN_PROGRESS && pgaio_wref_valid(&desc->io_wref)
or we should reliably do it.



> +			return READ_BUFFER_ALREADY_DONE;
> +		}
> +
> +		if (buf_state & BM_IO_IN_PROGRESS)
> +		{
> +			/* Join existing read */
> +			if (pgaio_wref_valid(&desc->io_wref))
> +			{
> +				operation->io_wref = desc->io_wref;
> +				operation->foreign_io = true;
> +				UnlockBufHdr(desc);
> +				return READ_BUFFER_IN_PROGRESS;
> +			}
> +
> +			/*
> +			 * If the wait ref is not valid but the IO is in progress, someone
> +			 * else started IO but hasn't set the wait ref yet. We have no
> +			 * choice but to wait until the IO completes.
> +			 */
> +			UnlockBufHdr(desc);
> +			pgaio_submit_staged();
> +			WaitIO(desc);
> +			continue;

Before this commit there was an explanation for this submit:

-    /*
-     * If this backend currently has staged IO, we need to submit the pending
-     * IO before waiting for the right to issue IO, to avoid the potential for
-     * deadlocks (and, more commonly, unnecessary delays for other backends).
-     */

Seems that vanished?



> +/*
> + * When building a new IO from multiple buffers, we won't include buffers
> + * that are already valid or already in progress. This function should only be
> + * used for additional adjacent buffers following the head buffer in a new IO.
> + *
> + * Returns true if the buffer was successfully prepared for IO and false if it
> + * is rejected and the read IO should not include this buffer.
> + */
> +static bool
> +PrepareAdditionalReadBuffer(Buffer buffer)

I think it'd be good to mention that this may never wait for IO or such to
avoid deadlocks.



> +	/* Check if we can start IO on the first to-be-read buffer */
> +	if ((status = PrepareNewReadBufferIO(operation, buffers[nblocks_done])) <
> +		READ_BUFFER_READY_FOR_IO)
> +	{

I don't love this < bit. For one there's no mention in
PrepareReadBuffer_Status mentioning that the numerical order is important. Any
reason to not just test != READ_BUFFER_READY_FOR_IO?

The assignment inside the if also looks somewhat awkward. For while() loops
there's often not really a better way to write it, but here you could just as
well do the status assignment in a line before.


> +		pgaio_io_release(ioh);
> +		*nblocks_progress = 1;
> +		if (status == READ_BUFFER_ALREADY_DONE)
> +		{
> +			/*
> +			 * Someone else has already completed this block, we're done.
> +			 *
> +			 * When IO is necessary, ->nblocks_done is updated in
> +			 * ProcessReadBuffersResult(), but that is not called if no IO is
> +			 * necessary. Thus update here.
> +			 */
> +			operation->nblocks_done += 1;
> +			Assert(operation->nblocks_done <= operation->nblocks);
> +
> +			/*
> +			 * Report and track this as a 'hit' for this backend, even though
> +			 * it must have started out as a miss in PinBufferForBlock(). The
> +			 * other backend will track this as a 'read'.
> +			 */
> +			CountBufferHit(operation->strategy,
> +						   operation->rel, operation->persistence,
> +						   operation->smgr, operation->forknum,
> +						   operation->blocknum + operation->nblocks_done - 1);
> +			return false;
> +		}
> +
> +		/* The IO is already in-progress */
> +		Assert(status == READ_BUFFER_IN_PROGRESS);
> +		CheckReadBuffersOperation(operation, false);

I was about to suggest that there should be a CheckReadBuffersOperation() for
both returns here, but there already are CheckReadBuffersOperation after calls
to AsyncReadBuffers(), so I think this CheckReadBuffersOperation could just be
removed.



>  	/*
> -	 * Check if we can start IO on the first to-be-read buffer.
> -	 *
> -	 * If an I/O is already in progress in another backend, we want to wait
> -	 * for the outcome: either done, or something went wrong and we will
> -	 * retry.
> +	 * How many neighboring-on-disk blocks can we scatter-read into other
> +	 * buffers at the same time?  In this case we don't wait if we see an I/O
> +	 * already in progress.  We already set BM_IO_IN_PROGRESS for the head
> +	 * block, so we should get on with that I/O as soon as possible.
>  	 */
> -	if (!ReadBuffersCanStartIO(buffers[nblocks_done], false))
> +	for (int i = nblocks_done + 1; i < operation->nblocks; i++)
>  	{
> -		/*
> -		 * Someone else has already completed this block, we're done.
> -		 *
> -		 * When IO is necessary, ->nblocks_done is updated in
> -		 * ProcessReadBuffersResult(), but that is not called if no IO is
> -		 * necessary. Thus update here.
> -		 */
> -		operation->nblocks_done += 1;
> -		*nblocks_progress = 1;
> -
> -		pgaio_io_release(ioh);
> -		pgaio_wref_clear(&operation->io_wref);
> -		did_start_io = false;
> +		if (!PrepareAdditionalReadBuffer(buffers[i]))
> +			break;
> +		/* Must be consecutive block numbers. */
> +		Assert(BufferGetBlockNumber(buffers[i - 1]) ==
> +			   BufferGetBlockNumber(buffers[i]) - 1);

Seems this assert could stand to be before the PrepareAdditionalReadBuffer(),
as it better hold true before we try to BM_IO_IN_PROGRESS?

I realize this is old, but since you're whacking this around...



> diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
> index 4017896f951..f85a9acc6ac 100644
> --- a/src/include/storage/bufmgr.h
> +++ b/src/include/storage/bufmgr.h
> @@ -147,6 +147,8 @@ struct ReadBuffersOperation
>  	int			flags;
>  	int16		nblocks;
>  	int16		nblocks_done;
> +	/* true if waiting on another backend's IO */
> +	bool		foreign_io;
>  	PgAioWaitRef io_wref;
>  	PgAioReturn io_return;
>  };

This adds an alignment-padding hole between nblocks_done and io_wref.  Read
stream can allocate quite a few of these, so it's probably worth avoiding?

There's a padding hole between persistence and forknum, but that seems a bit
ugly to utilize. A bit less ugly if we swapped forknum and persistence.

Or we could make 'flags' a uint8/16 (flags should imo always be unsigned, and
there are just four flag bits right now).

But perhaps it's also not worth worrying about right now.


[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Attributes.html#index-pure

Greetings,

Andres Freund

view thread (31+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
  In-Reply-To: <u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2@hph6fvdiiria>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox