From: Andres Freund <[email protected]>
To: Peter Geoghegan <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Wed, 3 Sep 2025 16:06:32 -0400
Message-ID: <4zeu5yb73byiquvf3eefsunnrydyqfxy3eup66jrliutrtd4xl@5iifjey4n5m5> (raw)
In-Reply-To: <CAH2-WznFdjY_OB2S7_BY4iAyeffK+XrE2qsX6aghgP63VocRfQ@mail.gmail.com>
References: <[email protected]>
<5pltwb73d7cynsxo2yb54ygjk7haviatkrx43mnzihc6kkield@ahnstpgof46i>
<CA+hUKGKL3MRvEftAE+kwBuL2PLg2CwUoHEMr=-KSvsWN8pHq9w@mail.gmail.com>
<[email protected]>
<e33gafg4p7iwvo24ytrxuw43nafm5xm3jefpdspnarcbkfurs7@3jbgdiinxem5>
<[email protected]>
<CAH2-Wz=DfvzasnzLv43cu36Q1Ca8Qi70_JjZ7SRbNhDwwgvirg@mail.gmail.com>
<qdl4fojnbfcnm2k7b4zpvgd6gwzwdgtbl5c7shpimrb76dbyy6@scdnspus3ejh>
<bpdeohyqvltb77viyft4bza4xc4peed3jcoep74d2ih6ynqlke@wbnhcwmq3ril>
<CAH2-WznFdjY_OB2S7_BY4iAyeffK+XrE2qsX6aghgP63VocRfQ@mail.gmail.com>
Hi,
On 2025-09-03 15:33:30 -0400, Peter Geoghegan wrote:
> On Wed, Sep 3, 2025 at 2:47 PM Andres Freund <[email protected]> wrote:
> > I still don't think I fully understand why the impact of this is so large. The
> > branch misses appear to be the only thing differentiating the two cases, but
> > with resowners neutralized, the remaining difference in branch misses seems
> > too large - it's not like the sequence of block numbers is more predictable
> > without prefetching...
> >
> > The main increase in branch misses is in index_scan_stream_read_next...
>
> I've been working on fixing the same regressed query, but using a
> completely different (though likely complementary) approach: by adding
> a test to index_scan_stream_read_next that detects when prefetching
> isn't favorable. If it isn't favorable, then we stop prefetching
> entirely (we fall back on regular sync I/O).

The issue to me is that this kind of query actually *can* substantially
benefit from prefetching, no? Afaict the performance without prefetching is
rather atrocious as soon as a) storage has a tad higher latency or b) DIO is
used.

Indeed: with DIO, readahead provides a ~2.6x improvement for the query at hand.

I continue to be worried that we're optimizing for queries that have no
real-world relevance. The regression afaict is contingent on:

1) An access pattern that is unpredictable to the CPU (due to the use of
random() as part of ORDER BY during the data generation)
2) Index and heap are somewhat correlated, but fuzzily, i.e. there are
backward jumps in the heap block numbers being fetched
3) There are between 1 and small_number tuples on each heap page
4) The query scans a huge number of tuples, without actually doing any
meaningful analysis on the tuples. As soon as one does meaningful work for
returned tuples, the small difference in per-tuple CPU costs vanishes
5) The query visits all heap pages within a range, just not quite in
order. Without that the kernel readahead would not work and the query's
performance without readahead would be terrible even on low-latency storage
This just doesn't strike me as a particularly realistic combination of
factors?
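
To make 2) and 5) a bit more concrete, here is a rough Python sketch of what
such a "fuzzily correlated" block sequence looks like. It is not taken from the
benchmark in this thread: the function names, the jitter model (block number
plus uniform noise as the sort key) and the parameter values are all made up
for illustration. It only shows that a sequence can visit every block in a
range, stay close to heap order, and still contain plenty of backward jumps.

```python
import random


def fuzzy_block_sequence(n_blocks, jitter, seed=0):
    """Return heap block numbers in the order a hypothetical index would visit them.

    Assumption for illustration only: each block's sort key is its own block
    number plus uniform noise in [0, jitter), so index order follows heap
    order only approximately.
    """
    rng = random.Random(seed)
    keyed = [(block + rng.uniform(0, jitter), block) for block in range(n_blocks)]
    keyed.sort()
    return [block for _, block in keyed]


def backward_jumps(blocks):
    """Count transitions where the next block number is lower than the current one."""
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur < prev)


def max_displacement(blocks):
    """Largest distance between a block's position in the sequence and its number."""
    return max(abs(pos - block) for pos, block in enumerate(blocks))


seq = fuzzy_block_sequence(n_blocks=10_000, jitter=8.0)
print("distinct blocks visited:", len(set(seq)))
print("backward jumps:", backward_jumps(seq))
print("max displacement:", max_displacement(seq))
```

With a jitter of 0 the sequence is strictly sequential and has no backward
jumps. With a small jitter every block in the range is still visited and none
ends up far from its sequential position, which is roughly the situation where
kernel readahead can still help.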
I suspect we could more than eat back the loss in performance by doing batched
heap_hot_search_buffer()...
Greetings,
Andres Freund