From: Peter Geoghegan <[email protected]>
To: Andres Freund <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Wed, 3 Sep 2025 15:33:30 -0400
Message-ID: <CAH2-WznFdjY_OB2S7_BY4iAyeffK+XrE2qsX6aghgP63VocRfQ@mail.gmail.com> (raw)
In-Reply-To: <bpdeohyqvltb77viyft4bza4xc4peed3jcoep74d2ih6ynqlke@wbnhcwmq3ril>
References: <[email protected]>
	<onbn3rx35x6k7mfnsmejnebt4nahnii3qnjrac2jzdh3puwo6t@dzjzsx5ppaj7>
	<[email protected]>
	<5pltwb73d7cynsxo2yb54ygjk7haviatkrx43mnzihc6kkield@ahnstpgof46i>
	<CA+hUKGKL3MRvEftAE+kwBuL2PLg2CwUoHEMr=-KSvsWN8pHq9w@mail.gmail.com>
	<[email protected]>
	<e33gafg4p7iwvo24ytrxuw43nafm5xm3jefpdspnarcbkfurs7@3jbgdiinxem5>
	<[email protected]>
	<CAH2-Wz=DfvzasnzLv43cu36Q1Ca8Qi70_JjZ7SRbNhDwwgvirg@mail.gmail.com>
	<qdl4fojnbfcnm2k7b4zpvgd6gwzwdgtbl5c7shpimrb76dbyy6@scdnspus3ejh>
	<bpdeohyqvltb77viyft4bza4xc4peed3jcoep74d2ih6ynqlke@wbnhcwmq3ril>

On Wed, Sep 3, 2025 at 2:47 PM Andres Freund <[email protected]> wrote:
> I still don't think I fully understand why the impact of this is so large. The
> branch misses appear to be the only thing differentiating the two cases, but
> with resowners neutralized, the remaining difference in branch misses seems
> too large - it's not like the sequence of block numbers is more predictable
> without prefetching...
>
> The main increase in branch misses is in index_scan_stream_read_next...

I've been working on fixing the same regressed query, but using a
completely different (though likely complementary) approach: by adding
a test to index_scan_stream_read_next that detects when prefetching
isn't favorable. If it isn't favorable, then we stop prefetching
entirely (we fall back on regular sync I/O).

Although this experimental approach is still very rough, it seems
promising. It ~100% fixes the problem at hand, without really creating
any new problems (at least as far as our testing has been able to
determine, so far).

The key idea is to wait until a few batches have already been read,
and then test whether the index-tuple-wise "distance" between readPos
(the read position) and streamPos (the stream position used by
index_scan_stream_read_next) remained excessively low within
index_scan_stream_read_next. If, after processing 20 batches/leaf
pages, readPos and streamPos still read from the same batch *and* have
a low index-tuple-wise position within that batch (they're within 10
or 20 items of each other), we expect "thrashing", which makes
prefetching unfavorable -- and so we just stop using our read stream.
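
To make the test described above concrete, here is a minimal sketch in C. It is not code from the patch: the ScanPos struct, the prefetch_looks_unfavorable function, and both constants are hypothetical names chosen for illustration. The thresholds are taken loosely from the "20 batches" and "10 or 20 items" figures above, and the sketch assumes each position can be reduced to a (batch, item) pair.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical representation of a scan position: which batch (leaf page)
 * it is in, and the index-tuple offset within that batch.
 */
typedef struct ScanPos
{
    int batch;   /* batch/leaf page number, counted from the start of the scan */
    int item;    /* index-tuple offset within that batch */
} ScanPos;

/* Assumed thresholds, loosely based on the numbers given in the email. */
#define MIN_BATCHES_BEFORE_TEST 20
#define MAX_THRASH_DISTANCE     20

/*
 * Returns true when prefetching looks unfavorable: enough batches have been
 * read, yet readPos and streamPos are still in the same batch and only a few
 * index tuples apart.  A caller would then stop using the read stream and
 * fall back on regular synchronous I/O.
 */
static bool
prefetch_looks_unfavorable(int batches_read, ScanPos readPos, ScanPos streamPos)
{
    if (batches_read < MIN_BATCHES_BEFORE_TEST)
        return false;           /* too early to judge */
    if (readPos.batch != streamPos.batch)
        return false;           /* stream is at least one batch ahead */
    return abs(streamPos.item - readPos.item) <= MAX_THRASH_DISTANCE;
}
```

The absolute difference is used so that the same check would apply to forward and backward scans; whether the real test needs that is an assumption.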

It's worth noting that (given the current structure of the patch) it
is inherently impossible to do something like this from within the
read stream. We're suppressing duplicate heap block requests iff the
blocks are contiguous within the index. So the read stream just doesn't
see anything like what I'm calling the "index-tuple-wise distance"
between readPos and streamPos.
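
The following sketch illustrates why the consumer of the block numbers cannot recover that distance. It is a simplified stand-in written for this explanation, not the patch's code: BlockNo and suppress_adjacent_duplicates are hypothetical names, and the sketch assumes that "contiguous within the index" means adjacent index entries pointing at the same heap block.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNo;   /* hypothetical stand-in for a heap block number */

/*
 * Copies blocks[] into out[], skipping any entry equal to the entry
 * immediately before it.  Returns the number of entries written.
 * out[] must have room for n entries.
 */
static size_t
suppress_adjacent_duplicates(const BlockNo *blocks, size_t n, BlockNo *out)
{
    size_t nout = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i > 0 && blocks[i] == blocks[i - 1])
            continue;           /* same heap block as the previous index tuple */
        out[nout++] = blocks[i];
    }
    return nout;
}
```

Seven index tuples pointing at heap blocks 7, 7, 7, 8, 8, 7, 9 would produce only four requests (7, 8, 7, 9), so counting requests says little about how many index tuples separate the two positions.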

Note that the baseline behavior for the test case (the behavior with
master, or with prefetching disabled) appears to be very I/O bound,
due to readahead. I've confirmed this using iostat. So "synchronous"
I/O isn't very synchronous here. (Prefetching actually does make sense
when this query is run with direct I/O, but that's far slower with or
without the use of explicit prefetching, so that likely doesn't tell
us much.)

--
Peter Geoghegan




