public inbox for [email protected]  
help / color / mirror / Atom feed
From: Tomas Vondra <[email protected]>
To: Robert Haas <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Georgios <[email protected]>
Cc: Thomas Munro <[email protected]>
Cc: Konstantin Knizhnik <[email protected]>
Cc: Dilip Kumar <[email protected]>
Subject: Re: index prefetching
Date: Wed, 14 Feb 2024 15:13:12 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CA+TgmoZemqR60MoiPN+R0qMe2XDYtOe=Tr1+3rX7S1Uua_C6wA@mail.gmail.com>
References: <[email protected]>
	<CA+Tgmob5Nh7sQO8GENzEDSjYJQuKCgfhWtwDLjMtdueFMCh05A@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CA+TgmoacmxNQBZsgMPSn9b7oE6QQLOqr_EP=YZTzdfHrqa5dqw@mail.gmail.com>
	<[email protected]>
	<CAAKRu_ZPDhNwwFxQwS8NdeTFkycM1c=tNLKdU0J-M6KxCjdEmQ@mail.gmail.com>
	<[email protected]>
	<CAAKRu_ad8cywU4_X+5e4A9Wy_PZmrCx2aoLpdHPBRvb9inGDgQ@mail.gmail.com>
	<[email protected]>
	<CAAKRu_Y=go4u99udRaj3k0vuF6EAfWqM86BONhxB4MV9X4FRpQ@mail.gmail.com>
	<CAAKRu_ZPuaO5XwXfM7wowf4sSmPSJ2LT9+zfmD5=LQ=WhV_j=Q@mail.gmail.com>
	<CA+TgmoZemqR60MoiPN+R0qMe2XDYtOe=Tr1+3rX7S1Uua_C6wA@mail.gmail.com>



On 2/14/24 08:10, Robert Haas wrote:
> On Thu, Feb 8, 2024 at 3:18 AM Melanie Plageman
> <[email protected]> wrote:
>> - kill prior tuple
>>
>> This optimization doesn't work with index prefetching with the current
>> design. Kill prior tuple relies on alternating between fetching a
>> single index tuple and visiting the heap. After visiting the heap we
>> can potentially kill the immediately preceding index tuple. Once we
>> fetch multiple index tuples, enqueue their TIDs, and later visit the
>> heap, the next index page we visit may not contain all of the index
>> tuples deemed killable by our visit to the heap.
> 
> Is this maybe just a bookkeeping problem? A Boolean that says "you can
> kill the prior tuple" is well-suited if and only if the prior tuple is
> well-defined. But perhaps it could be replaced with something more
> sophisticated that tells you which tuples are eligible to be killed.
> 

I don't think it's just a bookkeeping problem. In a way, nbtree already
does keep an array of tuples to kill (see btgettuple), but it's always
for the current index page. So it's not that we immediately go and kill
the prior tuple - nbtree already stashes it in an array, and kills all
those tuples when moving to the next index page.

The way I understand the problem is that with prefetching we're bound to
determine the kill_prior_tuple flag with a delay, in which case we might
have already moved to the next index page ...


So to make this work, we'd need to:

1) keep index pages pinned for all "in flight" TIDs (read from the
index, not yet consumed by the index scan)

2) keep a separate array of "to be killed" index tuples for each page

3) have a more sophisticated way to decide when to kill tuples and unpin
the index page (instead of just doing it when moving to the next index page)

Maybe that's what you meant by "more sophisticated bookkeeping", ofc.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






view thread (25+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: index prefetching
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox