Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7xrG-0004zJ-2X for pgsql-hackers@arkaria.postgresql.org; Wed, 01 Apr 2026 15:51:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w7xrF-001FiD-2C for pgsql-hackers@arkaria.postgresql.org; Wed, 01 Apr 2026 15:51:54 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7xrE-001Fg3-1p for pgsql-hackers@lists.postgresql.org; Wed, 01 Apr 2026 15:51:53 +0000 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w7xrB-00000000288-2Y71 for pgsql-hackers@lists.postgresql.org; Wed, 01 Apr 2026 15:51:51 +0000 Received: by mail-wr1-x432.google.com with SMTP id ffacd0b85a97d-43cf5fbacc9so656962f8f.1 for ; Wed, 01 Apr 2026 08:51:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1775058708; cv=none; d=google.com; s=arc-20240605; b=gvcj9eZADVU5dFVP84/A36eA1+TPsrgQw5oR8k3ZhlKPe5pzeXKjmulGKoxnW6PSJm V7u3zW4Ab9tKQWJmzoBdiGrkQac8dgaE6ulH0JgCt75E0qz2Lxedg7MSLV/skF6tc7Qz y039Eh+jCA31joRrq+qUYCd6/xugNuWAMIzrqFZrGOLGu+/FDC14ZMS7yOguchYq54nu gMKxliQqdMO9ReAiksvumymzZK3Yn138GthczvgxlG/9hXh64UIiL5HIRvaeACXw6UuY AmyymavACLjhmnfIgmzYh8XwrMrNsMXU7APdItgM0PKNmuNmMiUA+JAnQt+apCOXMP7H aNuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=JF83aA2ACp/64j2THCH2m1EeF2p6+Q9NoOYDxHroTXc=; fh=37wOjif1QavHOwa5VMnbJB4+UIkUjwENszRw+WARuCs=; b=U7htBW0/mpbwqKoE5kZspLHo0+APibbZkUWkKI0SRpu/9xw1N3T1s3ZOwf6jk2mgAs 0z2KOxI2YOcSQ8TpufG00Jx0mGngCP5xqUqjxHjSp59axqB1r300oDtrfT3ys+WksQlw dmor8S6dNvHx8nxG1LgK1WPDeNA1vW8BrqTB34uWjmbvonOwDckE2QRxJkAblKtBVs2L VydY28yqo+TlVm3ks/uzx4kWkbgzoRvnR6ZeAtd4hsbKC8RM3GpiNIsa/Lp48bUs+9tN 2yLlODiR/Av+5LT/PQBbTOc2Oo/+Ki/H68lA6Wm7vuG0GCPpUxGzCwXEwGGtl/GZP4YU 8ZSQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bowt-ie.20230601.gappssmtp.com; s=20230601; t=1775058708; x=1775663508; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=JF83aA2ACp/64j2THCH2m1EeF2p6+Q9NoOYDxHroTXc=; b=pTHycMEhaNMtWDfMZ0NWQBwNejSe67WpuAmmmlpbCH3x4lxVLKrxfqp2P7M1+Tb0WB 97IoQu8SrMCeeqKsPGienouZ2z+ixE+YRSGkIYNK+zQGDxyahwop2QD8r1NcZ4FXcKys m3jk5KDsup8Oy5mHs/XutZsv+hstnry9ZfRyxDJ0aHdG1YJnR7RfO8jkd5jW9dZy64nd mCdogUFfwowQIXsM+aCGmoNYR6dIJRckZC4db7rjfMl/MtqeE5lE8DJ56OYrOsy3F/4P zy0qMv882Cxw+1r2xB/P0DQrAI6nOJEUrzkcVlFQAOvDvIwHO9iSUrAZXP3PZILTR9oa uKUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775058708; x=1775663508; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=JF83aA2ACp/64j2THCH2m1EeF2p6+Q9NoOYDxHroTXc=; b=FXtEObTg+FCK7+l+mNYBNSJeBpt+aYrnTsQmfcv99LPT/mp/+ZRRoZfSuKRSF6s0JQ l4VyMTRP0crZ4/ljzUr+2F31AbkPRgHMhHtiV3g7yWqBm6BmMbrNsuOR8wU34Rz67233 rG5NtDZLfJXgwkQWD4+BeykYsDp8wnNV2iOINNSrtMC6XFYPLi3xrdI8xm0SUJvMVrSk Gb1lpqJaFwvC5ThszjZ5uXfrcDeuq2baYWsZlixs1BCUG3m+zhvlO0X+L+Pq7SKY5Bl1 0+vaFGnaCYehJLnAiVfpmqa3NtX1oEzybJgSjzNvqyw7b6aPGTu4ceAEG5BuqDh3nMTb XowA== X-Forwarded-Encrypted: i=1; AJvYcCXL+J7C3KtXsWZg8DdnE4KrIeXd2t6wnx/N2vCcpFHFbU8GxAL4/HuacjyQzGgHULGPBemen8z54rO6Tecd@lists.postgresql.org X-Gm-Message-State: AOJu0Yxw1cSrRxbeq4Z3RbzQWb1YKUWzkP0nDsy+h2U5A364g+MZKI83 k3Y9ELFySiNulq778A9hdtJG7t02XsuQ9Eqx2XucCIy24hhWGR53su9RZc6rw1QVvZECSpHDKQI 7inl/Se+ECej4GzJh360/DND8nikk2XWY51HTbilGAg== X-Gm-Gg: ATEYQzwSy/zwrq+ZNNCTimEN2tlPnIDstAV4JhOw4B7vAWRQJkrjH78XqSg8Gng4UQM ksWn5uckNIMEQqaoskw+R2R49QetmItcVlvKPgk3a5PglJOrFWOl9rAsoZCSgWcyv4tCn7wx8E+ qY06sR6D5MSSgSxNRk6G9OfcaSSvf0IliBi46wcbWJ0GDNPjbxaYqmhZnkPXFm85slWORs8SM2l FI5oqlRGFdGm0JeAnUSltBV69LUxW2SrhOdgBfRW4ZM5jwo/CpG3/5F9KOcgocenld3jBr5dNmy o5VGfWXstdJnGVJeKR36ga5AWdn9D2d4ZHnEbSTXkLIqAo+rW0v/GQ== X-Received: by 2002:a05:6000:410b:b0:43b:4951:6b37 with SMTP id ffacd0b85a97d-43d081f7af4mr10282820f8f.15.1775058707752; Wed, 01 Apr 2026 08:51:47 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Peter Geoghegan Date: Wed, 1 Apr 2026 11:51:19 -0400 X-Gm-Features: AQROBzCTKeV8XcTs19fLdVmcY2J0aBmb22PZJCQQxYYOQiJ0nNBi1dU9ze7HUhU Message-ID: Subject: Re: index prefetching To: Andres Freund Cc: Tomas Vondra , Alexandre Felipe , Thomas Munro , Nazir Bilal Yavuz , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Konstantin Knizhnik , Dilip Kumar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Mar 31, 2026 at 4:22=E2=80=AFPM Andres Freund = wrote: > > From 742dcbea7d184443d2c4ef3c095b60f47f870f72 Mon Sep 17 00:00:00 2001 > > From: Peter Geoghegan > > Date: Wed, 25 Mar 2026 00:22:17 -0400 > > Subject: [PATCH v19 01/18] Move heapam_handler.c index scan code to new= file. > > > > Move the heapam index fetch callbacks (index_fetch_begin, > > index_fetch_reset, index_fetch_end, and index_fetch_tuple) into a new > > dedicated file. Also move heap_hot_search_buffer over. This is a > > purely mechanical move with no functional impact. > > I don't love that this renames arguments (s/call_again/heap_continue/) wh= ile > moving. Makes it harder to verify this is just the mechanical change it c= laims > to be. Will fix. > > + * IDENTIFICATION > > + * src/backend/access/heap/heapam_indexscan.c > Should probably include access/relscan.h for +IndexFetchTableData, rather= than > relying on the indirect include. Will fix. > > Subject: [PATCH v19 02/18] Add slot-based table AM index scan interface= . > > This moves VM checks for index-only scans out of the executor and into > > heapam, enabling batching of visibility map lookups (though for now we > > continue to just perform retail lookups). > > I'd also add that it's architecturally much cleaner this way. Agreed, will fix the commit message. > > A small minority of callers (2 callers in total) continue to use the ol= d > > table_index_fetch_tuple interface. This is still necessary with caller= s > > that fundamentally need to pass a TID to the table AM. Both callers > > perform some kind of constraint enforcement. > > I think we may need to do something about this before all of this over. I= t's > just too confusing to have the generic sounding index_fetch_tuple() inter= faces > that are barely ever used and *shouldn't* be used when, uhm, fetching a t= uple > pointed to by an index. Agreed. > At the very least it seems we should rename the ->index_fetch_tuple() to > ->index_fetch_tid() or something and remove the call_again, all_dead > arguments, since they are never used in this context. I suspect it shoul= d > *not* use the table_index_fetch_begin()/table_index_fetch_end() interface > either. I'm working on a version of this that makes index_fetch_tid into a more specialized thing, purely designed for the requirements of the 2 remaining callers (that just need to do constraint enforcement) that absolutely have to pass a TID/that fundamentally cannot use a slot-based interface. This approach will avoid the need to call table_index_fetch_begin()/table_index_fetch_end() -- because it won't even have an IndexFetchTableData to pass. Something that has long bothered me about the current structure of the patch is that indexam.c sometimes resets the batch structure (e.g., on rescans), while heapam sometimes does very similar things (e.g., when the scan changes direction). This means heapam relies on indexam.c having already reset the batch ring buffer when heapam_index_fetch_reset is called on a rescan; it only needs to reset the read stream state (which is fairly closely related to the batch ring buffer and has invariants involving the ring buffer). In short, the ownership of the batch ring buffer is muddled. I plan on clarifying things in this area for v20. It will fully embrace the idea that IndexFetchTableData is just the table AM specific part of IndexScanDescData. This will allow me to move most of the batch-related calls currently in indexam.c over to heapam/the table AM. That will make it very clear that the table AM fully owns the batch ring buffer (there'll continue to be indexam.c calls to do things like take a mark, since that is implemented in a way that's already table AM agnostic). One problem with the current design is that it still treats IndexFetchTableData as not just the table AM piece of IndexScanDescData. I now realize that it works that way purely so we can avoid changing anything about index_fetch_tuple. But, as you say, we really *should* do that anyway -- it shouldn't need its own IndexFetchTableData (or its own IndexScanDescData), since it literally doesn't have anything to do with index scans. > > This convention lets both plain and index-only scans use the same > > table_index_getnext_slot interface. > > Not convinced that's really a useful goal. I don't think so either. That was just pro forma. The next version won't say this. > Hm. Why is it ok that we now don't have an equivalent of xs_heap_continue= on > the scan level anymore? I see there's a local heap_continue in > heapam_index_getnext_slot(), but isn't that insufficient e.g. for a > SNAPSHOT_ANY snapshot, where all the row versions would be visible? > > Am I misunderstanding how this works? No, I just screwed this up -- not sure how I missed it. It's a pity the tests didn't fail as a result. Will fix. > > +static inline bool > > +table_index_getnext_slot(IndexScanDesc iscan, ScanDirection direction, > > + TupleTableSlot *slot) > > +{ > > + return iscan->xs_getnext_slot(iscan, direction, slot); > > +} > > Hm. Does this actually belong in tableam.h? I'm a bit conflicted. I can see arguments for both. > > + .index_plain_amgettuple_getnext_slot =3D heapam_index_plain_amget= tuple_getnext_slot, > > + .index_only_amgettuple_getnext_slot =3D heapam_index_only_amgettu= ple_getnext_slot, > > .index_fetch_tuple =3D heapam_index_fetch_tuple, > > > > .tuple_insert =3D heapam_tuple_insert, > > Wonder if it's perhaps worth deleting _slot and shortening getnext to > next. These are quite only names... I will shorten these for the next version. > > +static pg_attribute_always_inline bool > > +heapam_index_getnext_slot(IndexScanDesc scan, ScanDirection direction, > > + TupleTableSlot *slot, b= ool index_only) > > For heapam_index_fetch_tuple_impl() you added _impl, but here you didn't. > Mild preference for also adding _impl here. That was when there was a separate heapam_index_fetch_tuple for the callback. But that's going away anyway now. > > + bool all_visible =3D false; > > + BlockNumber last_visited_block =3D InvalidBlockNumber; > > + uint8 n_visited_pages =3D 0, > > + xs_visited_pages_limit =3D 0; > > + > > + if (index_only) > > + xs_visited_pages_limit =3D scan->xs_visited_pages_limit; > > Did you check that it's useful to have xs_visited_pages_limit as a longer > lived variable? I suspect it's just going to add register pressure and wi= ll > need to be saved/restored across other calls, making it just as cheap to > always fetch it from scan. Will fix. > Still find it somewhat weird that we have local 'tid' variable, that we d= o not > use pass to heapam_index_fetch_heap(), because heapam_index_fetch_heap() > relies on index_getnext_tid() having stored the to-be-fetched tid in > xs_heaptid. > > But I guess that's something to change at a later date. Agreed, it is weird. > Previously the ExecClearTuple() here had at least this comment: > /* We don't actually need the heap tuple for anything */ > > Now it looks even more confusing... I will add that comment back. > Orthogonal: I suspect eventually we should pass the table to > RelationGetIndexScan(), have the tableam return how much space it needs, = and > allocate it one chunk. Agreed. Since these two structs (IndexScanDescData and IndexFetchTableData) are essentially 2 different parts of the same thing. That might have been debatable before now, but everything we're doing here is moving in that direction. > > From 85be7bf520fc2746f4ee73a0744c7aa04da55e52 Mon Sep 17 00:00:00 2001 > > From: Peter Geoghegan > > Date: Tue, 10 Mar 2026 14:40:35 -0400 > > Subject: [PATCH v19 03/18] heapam: Track heap block in IndexFetchHeapDa= ta. > > LGTM. Cool. > > --- a/src/backend/access/heap/heapam_indexscan.c > > +++ b/src/backend/access/heap/heapam_indexscan.c > > @@ -59,18 +59,15 @@ heapam_index_fetch_reset(IndexFetchTableData *scan) > > + /* Resets are a no-op (XXX amgetbatch commit resets xs_vm_items h= ere) */ > > + (void) hscan; > > I assume that's not a line you intend to commit? Will fix. > LGTM otherwise. Cool. > Need to do something else for a while. More another time. Thanks for the review! -- Peter Geoghegan