public inbox for [email protected]
help / color / mirror / Atom feedFrom: Peter Geoghegan <[email protected]>
To: Tomas Vondra <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Thu, 24 Jul 2025 10:40:12 -0400
Message-ID: <CAH2-WzmER9kc4OtmkDh+h51QV=v6Yc5BGsJikwJHtucf1C1HWw@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAH2-Wzn7vqmt=qE_hDrOx4NETkUoCbdn74G1gswMXi1APUuYrA@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAH2-Wzm-u6b4gDbLNP=1pkfqJbEyPyey9M-8wG0C+QOTit963Q@mail.gmail.com>
<[email protected]>
<[email protected]>
<CA+hUKG+P0RnG_c4vL6=d8ACwpmux5ZF91FO4UpX8PDu1WqEg9w@mail.gmail.com>
<[email protected]>
<CA+hUKG+WWr4-8TYemyU=ucQsNe6bUBN_Sq3mCnBoBtxaJ9w3ug@mail.gmail.com>
<CA+hUKGLvL408O2ss6YZQsdupryvBqjhuej_hyOYU9fobhdHkTQ@mail.gmail.com>
<CAN55FZ0y57uEK+Ts8s3NV9Gyg=YiAV-y610XJpfS+jdCh_7f5g@mail.gmail.com>
<[email protected]>
<CAH2-Wzk+ba5oHf1ghC34Z8eoCvum5yHGMxHeFCbC2ibcQ+J7uw@mail.gmail.com>
<[email protected]>
<CAH2-Wzm8AOhY83jPBrPDOO6dauoE9kDcef=b6-4TFSv6AkiNog@mail.gmail.com>
<[email protected]>
<[email protected]>
On Thu, Jul 24, 2025 at 7:19 AM Tomas Vondra <[email protected]> wrote:
> I got a bit bored yesterday, so I gave this a try and whipped up a patch
> that adds two pgstattuple functins that I think could be useful for
> analyzing index metrics that matter for prefetching.
This seems quite useful.
I notice that you're not accounting for posting lists. That'll lead to
miscounts of the number of heap blocks in many cases. I think that
that's worth fixing, even given that this patch is experimental.
> It's trivial to summarize this into a per-index statistic (of course,
> there may be some inaccuracies when the run spans multiple ranges), but
> it also seems useful to be able to look at parts of the index.
FWIW in my experience, the per-leaf-page "nhtids:nhblks" tends to be
fairly consistent across all leaf pages from a given index. There are
no doubt some exceptions, but they're probably pretty rare.
> Second, the index is walked sequentially in physical order, from block 0
> to the last block. But that's not really what the index prefetch sees.
> To make it "more accurate" it'd be better to just scan the leaf pages as
> if during a "full index scan".
Why not just do it that way to begin with? It wouldn't be complicated
to make the function follow a chain of right sibling links.
I suggest an interface that takes a block number, and an nblocks int8
argument that must be >= 1. The function would start from the block
number arg leaf page. If it's not a non-ignorable leaf page, throw an
error. Otherwise, count the number of distinct heap blocks on the leaf
page, and count the number of heap blocks on each additional leaf page
to the right -- until we've counted the heap blocks from nblocks-many
leaf pages (or until we reach the rightmost leaf page).
I suggest that a P_IGNORE() page shouldn't have its heap blocks
counted, and shouldn't count towards our nblocks tally of leaf pages
whose heap blocks are to be counted. Upon encountering a P_IGNORE()
page, just move to the right without doing anything. Note that the
rightmost page cannot be P_IGNORE().
This scheme will always succeed, no matter the nblocks argument,
provided the initial leaf page is a valid leaf page (and provided the
nblocks arg is >= 1).
I get that this is just a prototype that might not go anywhere, but
the scheme I've described requires few changes.
--
Peter Geoghegan
view thread (348+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: index prefetching
In-Reply-To: <CAH2-WzmER9kc4OtmkDh+h51QV=v6Yc5BGsJikwJHtucf1C1HWw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox