Re: AIO / read stream heuristics adjustments for index prefetching

From: Andres Freund <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: [email protected]
Cc: Thomas Munro <[email protected]>
Cc: Peter Geoghegan <[email protected]>
Cc: Tomas Vondra <[email protected]>
Subject: Re: AIO / read stream heuristics adjustments for index prefetching
Date: Fri, 3 Apr 2026 15:01:13 -0400
Message-ID: <dyz5hwolszkdbztdag2arphj3esmx2y6ocdfdirryehkgintcj@i7hqar5btt4w> (raw)
In-Reply-To: <CAAKRu_bWPO-w3arE3pM5i5LArp7mA+Rg0chm-1WL-JxvDxv7Cg@mail.gmail.com>
References: <f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu>
	<CAAKRu_ZcJnnxgDQaXjuhd37bnc-jKARBU4EDi+LUqgs+ZjmrgQ@mail.gmail.com>
	<b7cxivohlk7hfl6qcqxbutrpoukunawxvarr2g2o6jicwkyx5o@qtzdj36ihyz5>
	<CAN55FZ3GT+o545U9JLR0AC5JtBznqJP4Mf9Mi4k3axqqAXTxPg@mail.gmail.com>
	<aafskrrvefnb6p7zg3xnzau3m2iywfwrxfcmqx5vr7673j4qio@hha6e55rolqd>
	<CAAKRu_bWPO-w3arE3pM5i5LArp7mA+Rg0chm-1WL-JxvDxv7Cg@mail.gmail.com>

Hi,

On 2026-04-03 12:45:50 -0400, Melanie Plageman wrote:
> On Thu, Apr 2, 2026 at 9:33 AM Andres Freund <[email protected]> wrote:
> >
> > > +                /*
> > > +                 * XXX: Should we actually reduce this at any time other than
> > > +                 * a reset? For now we have to, as this is also a condition
> > > +                 * for re-enabling fast_path.
> > > +                 */
> > > +                if (stream->combine_distance > 1)
> > > +                    stream->combine_distance--;
> > >
> > > I don't think we need to reduce this other than reset.
> >
> > Hm. I go back and forth on that one :)
> 
> Separate from the fast-path enablement, we also probably want to
> decrease combine distance when we decrease readahead_distance because
> there is a point where we still want to parallelize the IOs even when
> the distance is lower and to do that, we need to make smaller IOs.

I'm not sure that's something we really need to worry about at this point. If
readahead_distance is so small that it does not allow enough IO concurrency,
we will have to wait for IO completion, which in turn will lead to the
readahead distance being increased again.

I can see some corner cases where this would not suffice, e.g. if you have a
rather low pin limit, but I doubt those are relevant in practice?
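
To make the feedback loop above concrete, here is a toy model in C. It is only
a sketch under stated assumptions, not the logic in read_stream.c: the ToyStream
struct, the toy_stream_* functions, the "double on wait, decrement otherwise"
rule, and the use of max_distance as a stand-in for a pin limit are all
hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; NOT the real ReadStream struct. */
typedef struct ToyStream
{
	int			readahead_distance; /* current look-ahead, in blocks */
	int			max_distance;	/* cap, standing in for a pin limit */
} ToyStream;

/*
 * Toy wait condition (an assumption): we have to wait for IO completion
 * whenever the distance is below what the workload needs.
 */
static bool
toy_stream_must_wait(const ToyStream *stream, int needed_distance)
{
	return stream->readahead_distance < needed_distance;
}

/*
 * Assumed feedback rule: double the distance after a wait (clamped to the
 * cap), otherwise let it decay by one block, never going below 1.
 */
static void
toy_stream_feedback(ToyStream *stream, bool had_to_wait)
{
	assert(stream->max_distance >= 1);

	if (had_to_wait)
	{
		int			doubled = stream->readahead_distance * 2;

		stream->readahead_distance =
			doubled < stream->max_distance ? doubled : stream->max_distance;
	}
	else if (stream->readahead_distance > 1)
		stream->readahead_distance--;
}

/* Apply the feedback rule for 'steps' iterations, return the final distance. */
static int
toy_stream_run(ToyStream *stream, int needed_distance, int steps)
{
	for (int i = 0; i < steps; i++)
		toy_stream_feedback(stream,
							toy_stream_must_wait(stream, needed_distance));

	return stream->readahead_distance;
}
```

Starting from a distance of 1 with max_distance = 64 and a workload that needs
16 blocks of look-ahead, the toy distance climbs 1, 2, 4, 8, 16 and then hovers
between 15 and 30, i.e. a too-small distance corrects itself. With
max_distance = 4 it stays pinned at 4 and every step waits, which is the low
pin limit corner case.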


> I'm not sure where this point is, but I wonder if a few 256kB IOs is faster
> than 1 1MB IO (could test that with fio actually).

Yes, that point definitely exists. But I think the mechanism for that is to
configure io_combine_limit at or below the threshold at which even bigger IOs
hurt.


> I imagine that there is some size where that is true because of
> peculiarities in how drives (and cloud storage) issue/break up IOs after
> they are a certain size, etc.

It's even true for synchronous copies from the kernel page cache, due to some
hardware issue I have yet to fully understand. On both Intel and AMD CPUs,
unless SMAP is disabled, larger copies from kernel to userspace start to be
substantially slower, somewhere around 1-4MB per IO.
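
One rough way to look for such an effect on a given machine is to read an
already-cached file with pread() at different chunk sizes and compare the
elapsed times. The helper below is a minimal sketch, not a rigorous benchmark:
read_file_in_chunks() is a hypothetical name, it assumes a POSIX system, and it
does nothing to control for SMAP, cache warmth, or CPU frequency scaling.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Read the whole file at 'path' with pread() in chunks of 'chunk_size' bytes.
 * Returns the total number of bytes read, or -1 on error. If 'elapsed_sec' is
 * not NULL, the wall-clock time spent in the read loop is stored there.
 */
static long long
read_file_in_chunks(const char *path, size_t chunk_size, double *elapsed_sec)
{
	struct timespec start;
	struct timespec end;
	long long	total = 0;
	char	   *buf;
	int			fd;

	if (chunk_size == 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	buf = malloc(chunk_size);
	if (buf == NULL)
	{
		close(fd);
		return -1;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (;;)
	{
		ssize_t		n = pread(fd, buf, chunk_size, (off_t) total);

		if (n < 0)
		{
			total = -1;			/* read error */
			break;
		}
		if (n == 0)
			break;				/* end of file */
		total += n;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	if (elapsed_sec != NULL)
		*elapsed_sec = (double) (end.tv_sec - start.tv_sec) +
			(double) (end.tv_nsec - start.tv_nsec) / 1e9;

	free(buf);
	close(fd);
	return total;
}
```

Calling it once to warm the page cache and then repeatedly with chunk sizes
such as 256kB, 1MB and 4MB gives a first impression of where larger copies
stop paying off; any real conclusion would need repeated runs and a comparison
with SMAP enabled and disabled.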

Greetings,

Andres Freund
