public inbox for [email protected]  
help / color / mirror / Atom feed
From: Andres Freund <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: [email protected]
Cc: Peter Geoghegan <[email protected]>
Cc: Tomas Vondra <[email protected]>
Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
Date: Wed, 25 Mar 2026 11:15:18 -0400
Message-ID: <rk56uxuqxi3udzu5oqc7qft5u7nncau5d2avlbui6obd7ybols@wlu2euvawj7t> (raw)
In-Reply-To: <CAAKRu_YiaJgbmHGTFHRJEP9+AtVCt+gHTd1rmXm05agT_1Mx2A@mail.gmail.com>
References: <CAAKRu_atdnPCCy=kfxqWT62Ckaiz3G5t=S97tW24CuL3i3fFfQ@mail.gmail.com>
	<CAN55FZ17ScU--nGM8cjjPtwPMOMonKZ9vMehPcc2sOQNLVskvA@mail.gmail.com>
	<CAAKRu_ZM1epxTdt2=4-g4ff6UC09zne+0xg_gNv3d7LEcxEvNA@mail.gmail.com>
	<CAN55FZ0ymY=XfaHQXHKG+Mbit6BZzZ9d-UWb6dmRXYm_xCvfQg@mail.gmail.com>
	<CAAKRu_YV0D_9wWgGkHdWgQsooqmcc08Uqb3tqxzrwcDXHajDtA@mail.gmail.com>
	<u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2@hph6fvdiiria>
	<CAAKRu_bCZfWpmfmfUiQjDgFK12T8FUox-hV9YFDy6XHTh=i0ew@mail.gmail.com>
	<prkj4fv7dvyew4lbad2g5l346nn6u66inndwx7nzzap5c3bdmh@fds54c2hx3yd>
	<42rdu4q44kvsq53fz5qgzuawqpaytvnemsnquynlfch5mqhc2m@6ytnlgivtzro>
	<CAAKRu_YiaJgbmHGTFHRJEP9+AtVCt+gHTd1rmXm05agT_1Mx2A@mail.gmail.com>

Hi,

On 2026-03-20 15:50:59 -0400, Melanie Plageman wrote:
> On Thu, Mar 19, 2026 at 6:22 PM Andres Freund <[email protected]> wrote:
> >
> > Thinking about it more, I also got worried about the duplicating of
> > logic. It's perhaps acceptable with the patches as-is, but we'll soon need
> > something very similar for AIO writes. Then we'd end up with like 5 variants,
> > because we'd still need the existing StartBufferIO() for some cases where we
> > do want to wait (e.g. the edge case in ExtendBufferedRelShared()).
> >
> > In the attached prototype I replaced your patch introducing
> > PrepareHeadBufferReadIO()/ PrepareAdditionalBufferReadIO() with one that
> > instead revises StartBufferIO() to have the enum return value you introduced
> > and a PgAioWaitRef * argument that callers that would like to asynchronously
> > wait for the IO to complete (others pass NULL). There are some other cleanups
> > in it too, see the commit message for more details.
> 
> I've come around to this. The aspect I like least is that io_wref is
> used both as an output parameter _and_ a decision input parameter.

Do you have an alternative suggestion? We could add a dedicated parameter for
that, but then it just opens up different ways of calling the function with
the wrong arguments.


> Some callers never want to wait on in-progress IO (and don't have to),
> while others must eventually wait but can defer that waiting as long
> as they have a wait reference. If they can't get a wait reference,
> they have no way to wait later, so they must wait now. The presence of
> io_wref indicates this difference.
> 
> I think it's important to express that less mechanically than in your
> current header comment and comment in the else block of
> StartSharedBufferIO() where we do the waiting. Explaining first—before
> detailing argument combinations—why a caller would want to pass
> io_wref might help.

I'm not entirely sure what you'd like here.  Would the following comment
do the trick?

 * In several scenarios the buffer may already be undergoing I/O in this or
 * another backend. How to best handle that depends on the caller's
 * situation. It might be appropriate to wait synchronously (e.g., because the
 * buffer is about to be invalidated); wait asynchronously, using the buffer's
 * IO wait reference (e.g., because the caller is doing readahead and doesn't
 * need the buffer to be ready immediately); or to not wait at all (e.g.,
 * because the caller is trying to combine IO for this buffer with another
 * buffer).
 *
 * How and whether to wait is controlled by the wait in io_wref parameters. In
 * detail:
 *
 * <existing comment>



> However, I do think you need to enumerate the different combinations
> of wait and io_wref (as you've done) to make it clear what they are.
> 
> I, for example, find it very confusing what wait == false and io_wref
> not NULL would mean. If IO is in progress on the buffer and the
> io_wref is not valid yet, the caller would get the expected
> BUFFER_IO_IN_PROGRESS return value but io_wref won't be set. I could
> see callers easily misinterpreting the API and passing this
> combination when what they want is wait == true and io_wref not NULL
> -- because they don't want to synchronously wait.

Hm. I started out proposing that we should just add an assert documenting this
is a nonsensical combination. But when writing the comment for that I realized
that it theoretically could make sense to pass in wait == false and io_wref !=
NULL, if you wanted to get a wait reference, but would not want to do a
WaitIO() if there's no wait reference set.

I don't think that's something we need right now, but ...


> I don't have any good suggestions despite thinking about it, though.
> 
> Two other things about 0007:
> 
>     for (int i = nblocks_done + 1; i < operation->nblocks; i++)
>     {
>         /* Must be consecutive block numbers. */
>         Assert(BufferGetBlockNumber(buffers[i - 1]) ==
>                BufferGetBlockNumber(buffers[i]) - 1);
> 
>         status = StartBufferIO(buffers[nblocks_done], true, false, NULL);
> 
> Copy-paste bug above, should be StartBufferIO(buffers[i],...
> 
> I would mention that currently BUFFER_IO_IN_PROGRESS is not used in
> the first StartBufferIO() case, so that is dead code as of this commit

Whaaat.  Why did this even pass tests???  I guess it just always failed to
start IO because there already was IO on the buffer and that was good enough
to get through.

Clearly a testing gap.

Not entirely trivial to test though :(.

Greetings,

Andres Freund





view thread (31+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Don't synchronously wait for already-in-progress IO in read stream
  In-Reply-To: <rk56uxuqxi3udzu5oqc7qft5u7nncau5d2avlbui6obd7ybols@wlu2euvawj7t>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox