public inbox for [email protected]
help / color / mirror / Atom feedFrom: Tomas Vondra <[email protected]>
To: Alexandre Felipe <[email protected]>
To: Peter Geoghegan <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Sun, 15 Feb 2026 08:57:19 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAE8JnxN_EwnTLLMWGhvgwaomYZ0ysm7NeogA-BqBd=Rs3S7Oqw@mail.gmail.com>
References: <[email protected]>
<CAH2-WzkHK0++yn3_nmWL2rWzXiu8Qc=2=NPQDG4nfVjvYkRDrQ@mail.gmail.com>
<CAH2-Wzk9=x=a2TbcqYcX+XXmDHQr5=1v9m4Z_v8a-KwF1Zoz0A@mail.gmail.com>
<CAH2-WzmYqhacBH161peAWb5eF=Ja7CFAQ+0jSEMq=qnfLVTOOg@mail.gmail.com>
<CAH2-Wzmm+mXuv_r+eYgcwOKZzNSmbYoEmxi5mdKMUtkmYa_qXw@mail.gmail.com>
<CAH2-WzkC_EdGB_tdEQr63BiOuJbcvZiV=xO+OJYux4wOaxPD8w@mail.gmail.com>
<CAH2-Wzn8whv-RE4E5L2WQB_Ha_jy0UReJq1yypu83gDYHeXq5g@mail.gmail.com>
<CAH2-Wzm7-QuDOs6TcqfhhDsGEZCuHtn=D-SriOTnTZ_fiXNBvA@mail.gmail.com>
<CAH2-WzmH7pVQ0-mYAxb82aWbz29_BiBPq2wV5p7+1o2sRFqDRQ@mail.gmail.com>
<CAH2-Wz=6a7fGz2rALDX+xiFDuEaGQWpZ49xEaBUDKiPH8gcL+Q@mail.gmail.com>
<CAH2-WzkehuhxyuA8quc7rRN3EtNXpiKsjPfO8mhb+0Dr2K0Dtg@mail.gmail.com>
<CAH2-WzmymSyOt5Y2RGbm6cJXg18J_ttfqjdcpodHe6Gp23ConQ@mail.gmail.com>
<CAH2-Wznv9_KGqHQ1vCW2pkiA6QskBGcx5NC_-UXnD6GEQasvAQ@mail.gmail.com>
<CAE8JnxN_EwnTLLMWGhvgwaomYZ0ysm7NeogA-BqBd=Rs3S7Oqw@mail.gmail.com>
Hi,
On 2/15/26 01:13, Alexandre Felipe wrote:
> Hi,
> I decided to test this PR.
>
> I didn't take much time to go through the thread or the code in detail
> yet. But I have my first benchmark results and I would like to share.
>
I'm quite confused by the scripts you shared, it seems incomplete. The
run_regression.py is meant to call purge_cache.sh (which is missing),
and the run_benchmark tries to call all sorts of missing .sql scripts.
So how do we use that?
> EXPERIMENT
>
> I tested [CF 4351] v10 - Index Prefetching
>
> I created a table with 100k rows and
> Sequential is, as guessed, 1,2,3,4 (indexed)
> Periodic is a quasi random (i * jump) % num_rows, where gcd(jump,
> num_rows) = 1, guarantee that there are no repeated entries (indexed)
> Random is a `row_number() over (order by random())` (indexed)
> The payload is a fixed 200 character long string, just to make it more
> realistic.
>
> For the tests, I disable sorting, sequential scans, index only scans and
> bitmap scans.
> Since buffer cache always has a significant impact on the query
> performance, I shuffled the tests, and tried to adjust for the number of
> buffer hit/read, but later I found that the best way to control that was
> to use a table small enough to be entirely held in cache, and evict the
> buffers.
>
That seems a bit bizarre. The whole point of index prefetching is better
I/O scheduling (ahead of time), but if you "control" the impact of cache
by making sure everything is cached, that kinda defeats the whole thing.
A table that is just 24MB and fits into buffers is a bit useless. It
means that even with random pattern (which is generally about the best
for prefetching), only about ~1/30 of pages will require I/O. Each page
has ~32 items, but only the first item from each page will incur an I/O.
> * off: buffers are kept in cache
> * pg: buffers evicted from postgres pg_buffercache_evict from
> pg_buffercache extension.
> * os: supported only in python, I separated the buffer eviction in
> purge_cache as it requires sudo (tested only in MacOS).
>
> I varied
> * max_parallel_workers_per_gather (although I guess it wasn't exploited),
> * enable_index_prefetch
> * the column used as sorting key and, as a result, the index used.
> * and buffer eviction mode.
>
> Running from python with psycopg
>
On what kind of hardware? How much variance is in the results?
regards
--
Tomas Vondra
view thread (87+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: index prefetching
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox