From: Andres Freund <[email protected]>
To: Peter Geoghegan <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Alexandre Felipe <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Nazir Bilal Yavuz <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Tue, 10 Mar 2026 18:47:52 -0400
Message-ID: <y5wp4uxudeajyljuzdm4cmqvwmzlujwzkxbadimoa64cmybgjp@5dd7le2jxc5m> (raw)
In-Reply-To: <CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com>
References: <a67mvhyi2q45eg4eimhpwdg6l3s3dmpahti2svffvmvzwmss27@r4nohusvndbq>
<[email protected]>
<il7jtfowpatrlg33qb5plj7v7pferes4ogerq5fdczszi4kokh@sbwvb2ukfgos>
<[email protected]>
<ws47e3wly6skt36b23zy5qfvcxzueo6od3uicunuodsqnxl7os@7v2qi7qkxzbz>
<CAH2-Wzk-89uCvdJ1Q6NsM6LvDvUEt6Qy66T6A60J=D_voWxZDg@mail.gmail.com>
<64mfcfv7iihc4pmqlxarii4esnmqry52ckz5m7lmwylnfnuxuz@oxh4ioxkjtep>
<CAH2-Wzmy7NMba9k8m_VZ-XNDZJEUQBU8TeLEeL960-rAKb-+tQ@mail.gmail.com>
<d2d4qofb5ajg2ftvm6h56oi4utdwpzkqfjd7z2y4vod5qaub4h@ixyotvfut3mg>
<CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com>

Hi,

On 2026-03-10 16:57:35 -0400, Peter Geoghegan wrote:
> On Fri, Feb 27, 2026 at 6:52 PM Andres Freund <[email protected]> wrote:
> > This is a huge change. Is there a chance we can break it up into more
> > manageable chunks?
>
> Attached is v12, which has revisions that address most of your
> feedback items. It also includes items that address problems that I
> noticed during performance validation work.
>
> Highlights:
>
> * Substantial revisions that give table AMs and index AMs direct
> control over batch layout -- without giving up on batch
> recycling/caching. This is essentially what you (Andres) requested
> because the design from v11 was not sufficiently AM agnostic. In
> particular:
>
> - Table AMs now control the size and layout of visibility information
> (in practice heapam uses this to store per-item visibility state from
> the visibility map).
>
> - Index AMs have their own opaque state for things like sibling link
> block numbers, avoiding the assumption that other index AMs supporting
> amgetbatch will need to work like nbtree and hash as regards how they
> navigate to the next index page/index keyspace associated with each
> batch.

Nice!

> * No more read stream yielding. Numerous new patches from Andres are
> now included, which helps with this. In particular, "WIP: read_stream:
> Only increase distance when waiting for IO" fixes the problematic
> regression in an adversarial query -- the one that prompted me to
> invent yielding in the first place. As a result of all this, the read
> stream callback added by the prefetching commit itself is now
> substantially simpler than it was in v11.
Yay.
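
The patch named in the quoted text is not reproduced in this message, so the following is only a sketch of what its title describes: a look-ahead distance that grows when the consumer actually had to wait for an IO, and stays unchanged otherwise. The doubling policy, the fixed cap, and the names LookaheadState, lookahead_init and lookahead_after_read are all assumptions made for illustration; none of them are read_stream.c APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical look-ahead state; not the real ReadStream struct. */
typedef struct LookaheadState
{
    int distance;       /* how many blocks ahead we currently request */
    int max_distance;   /* assumed fixed upper bound */
} LookaheadState;

static void
lookahead_init(LookaheadState *s, int max_distance)
{
    assert(max_distance >= 1);
    s->distance = 1;
    s->max_distance = max_distance;
}

/*
 * Called after a block has been handed to the consumer.  waited_for_io
 * says whether the consumer had to block on IO completion.  Only in that
 * case is the distance increased (doubling is an assumed policy).
 */
static void
lookahead_after_read(LookaheadState *s, bool waited_for_io)
{
    if (!waited_for_io)
        return;             /* IO was already done: current distance suffices */

    s->distance *= 2;
    if (s->distance > s->max_distance)
        s->distance = s->max_distance;
}
```

Under these assumptions, a scan whose blocks are always ready in time never ramps up its distance, which is the property that matters for a query that would otherwise read far ahead for no benefit.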
> * There are now a couple of extra patches created by breaking things
> into more distinct commits. Namely, there's a new "heapam: Track heap
> block in IndexFetchHeapData using xs_blk" commit, as well as a new
> "Make IndexScanInstrumentation a pointer in executor scan nodes"
> commit.

Yay^2.

> * Moreover, some commits now appear in a slightly different order,
> prioritizing work closer to being committable; those commits now come
> first.

Yay^3.

> * New commit "Use simple hash for PrivateRefCount" addresses some of
> the problems we were seeing with PrivateRefCount performance. This
> generic optimization addresses an existing problem that would
> otherwise be much worse with the index prefetching work in place.

Let's get that in soon.

Alexandre Felipe posted an implementation of this in
https://postgr.es/m/CAE8JnxNTETEUiAOF31%3D_yo%3DpvyAi9npOeJfcTvEJJbi4vomtYA%40mail.gmail.com
I don't agree with many of the other changes, but the simplehash conversion
contains an interesting piece - the ability to avoid the status field. I'd
encourage Alexandre to upstream that separately from this thread (and also
separately from the rest of the patches in the above thread).
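
As a rough illustration of the idea mentioned above (a hash table for per-backend buffer reference counts that needs no per-entry status field), here is a minimal sketch in C. It is not Alexandre's patch, not the v12 commit, and it does not use simplehash.h; every name (RefCountTable, refcount_pin, and so on) and the table size are hypothetical. The assumption is that one key value can be reserved to mean "empty" (0 here, which matches InvalidBuffer in PostgreSQL), so an unused slot is recognized by its key alone, and that deletion uses backward shifting instead of tombstones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REFCOUNT_TABLE_SIZE 64      /* hypothetical; must be a power of two */
#define REFCOUNT_MASK (REFCOUNT_TABLE_SIZE - 1)
#define EMPTY_KEY 0                 /* reserved key: marks an unused slot */

typedef struct RefCountEntry
{
    int32_t buffer;     /* key; EMPTY_KEY means the slot is free */
    int32_t refcount;   /* pins held on that buffer */
} RefCountEntry;        /* no separate status field */

typedef struct RefCountTable
{
    RefCountEntry slots[REFCOUNT_TABLE_SIZE];
    int nused;
} RefCountTable;

static void
refcount_table_init(RefCountTable *t)
{
    memset(t, 0, sizeof(*t));       /* every key becomes EMPTY_KEY */
}

static uint32_t
hash_buffer(int32_t buffer)
{
    uint32_t h = (uint32_t) buffer; /* simple integer mixer */

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Linear probing; returns NULL if the buffer has no entry. */
static RefCountEntry *
refcount_lookup(RefCountTable *t, int32_t buffer)
{
    uint32_t i = hash_buffer(buffer) & REFCOUNT_MASK;

    assert(buffer != EMPTY_KEY);
    for (;; i = (i + 1) & REFCOUNT_MASK)
    {
        if (t->slots[i].buffer == buffer)
            return &t->slots[i];
        if (t->slots[i].buffer == EMPTY_KEY)
            return NULL;
    }
}

/* Increment the count, creating the entry if needed; NULL if full. */
static RefCountEntry *
refcount_pin(RefCountTable *t, int32_t buffer)
{
    uint32_t i = hash_buffer(buffer) & REFCOUNT_MASK;

    assert(buffer != EMPTY_KEY);
    for (;; i = (i + 1) & REFCOUNT_MASK)
    {
        RefCountEntry *e = &t->slots[i];

        if (e->buffer == buffer)
        {
            e->refcount++;
            return e;
        }
        if (e->buffer == EMPTY_KEY)
        {
            /* keep one slot free so that probe loops always terminate */
            if (t->nused >= REFCOUNT_TABLE_SIZE - 1)
                return NULL;
            e->buffer = buffer;
            e->refcount = 1;
            t->nused++;
            return e;
        }
    }
}

/* Decrement the count; at zero, remove the entry by backward shifting. */
static void
refcount_unpin(RefCountTable *t, int32_t buffer)
{
    RefCountEntry *e = refcount_lookup(t, buffer);
    uint32_t i, j;

    if (e == NULL || --e->refcount > 0)
        return;
    i = j = (uint32_t) (e - t->slots);
    for (;;)
    {
        uint32_t k;

        j = (j + 1) & REFCOUNT_MASK;
        if (t->slots[j].buffer == EMPTY_KEY)
            break;
        k = hash_buffer(t->slots[j].buffer) & REFCOUNT_MASK;
        /* leave j in place if its home slot k lies cyclically in (i, j] */
        if (i <= j ? (i < k && k <= j) : (i < k || k <= j))
            continue;
        t->slots[i] = t->slots[j];
        i = j;
    }
    t->slots[i].buffer = EMPTY_KEY;
    t->slots[i].refcount = 0;
    t->nused--;
}
```

In this sketch each entry is two 32-bit integers, so dropping the status field keeps entries at 8 bytes. A real implementation would also need growth and an overflow path, both omitted here.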
> However, I have NOT yet acted on a few feedback items from Andres:
>
> * I still don't know what Andres meant about requiring table AMs to
> free batch index page buffer pins representing a modularity violation.
> I don't see how we can reasonably avoid it while still preserving the
> guarantees needed to safely drop buffer pins eagerly during index-only
> scans that require prefetching.
>
> * I'm also not at all sure what Andres meant about index AMs like hash
> not holding onto their own buffer pins, given that prefetching uses a
> read stream sensitive to the number of buffer pins the backend holds.

I tried to respond in
https://postgr.es/m/vbb4naf2tvm2tm7yoml54pzvrmn77p4nvq4awfa4wufc3hn7qx%40mof5q6li3xzv
to explain my concerns / what I think needs to happen.

Greetings,
Andres Freund