public inbox for [email protected]  
help / color / mirror / Atom feed
From: Michail Nikolaev <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Matthias van de Meent <[email protected]>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: Thu, 1 Feb 2024 17:06:28 +0100
Message-ID: <CANtu0oijWPRGRpaRR_OvT2R5YALzscvcOTFh-=uZKUpNJmuZtw@mail.gmail.com> (raw)
In-Reply-To: <CANtu0oirtBK_g4jxtw3jehSop3b0WSQaek5Sv5OGSXwxgcHwZQ@mail.gmail.com>
References: <CANtu0oiLc-+7h9zfzOVy2cv2UuYk_5MUReVLnVbOay6OgD_KGg@mail.gmail.com>
	<CAEze2WgW6pj48xJhG_YLUE1QS+n9Yv0AZQwaWeb-r+X=HAxU_g@mail.gmail.com>
	<CANtu0oizNtPUrPB0Mh+2vyjdijTX=LZvO5_dZN3+NqvE-CFPtw@mail.gmail.com>
	<CAEze2Wi3BFLkFBcZ+Brfbr-mGBCcWXcWuHucnCnw5ZOQotc6Eg@mail.gmail.com>
	<CANtu0ojRX=osoiXL9JJG6g6qOowXVbVYX+mDsN+2jmFVe=eG7w@mail.gmail.com>
	<CAEze2Wg03Ps_StwEhgCdSn7VXY9ZUM=zCrf-m1dRZpTWv6wD_A@mail.gmail.com>
	<CANtu0oj66JjAq8xyRSeO=MuRHYS2XsYbhHRRESHtOcLJs=3+Sw@mail.gmail.com>
	<CANtu0ogT2Qn7-q_jK6+DqBQvFoTt69eQJDKxJARXV9pdWjd0Gg@mail.gmail.com>
	<CANtu0ogXgNkEuxbDRwznAZpxEXRmj3NzOen3y-RGHDwig0YBRw@mail.gmail.com>
	<CANtu0oi+FTMqDb+6Bv8w7VHiTFVMB1uAAip_P841WQH+ktPixw@mail.gmail.com>
	<CAEze2WgeyVnDb_j4gJQYC4+HcSsYQAdeRA1-F0KDnJ=Y0A_TzA@mail.gmail.com>
	<CANtu0oga9zqqEFhdmcWyJTK4d6EGMJsMB_LMgVSE8ar0xVm7Ew@mail.gmail.com>
	<CANtu0oirtBK_g4jxtw3jehSop3b0WSQaek5Sv5OGSXwxgcHwZQ@mail.gmail.com>

> > > I just realised there is one issue with this design: We can't cheaply
> > > reset the snapshot during the second table scan:
> > > It is critically important that the second scan of R/CIC uses an index
> > > contents summary (made with index_bulk_delete) that was created while
> > > the current snapshot was already registered.
> >
> > > So, the "reset the snapshot every so often" trick cannot be applied in
> > > phase 3 (the rescan), or we'd have to do an index_bulk_delete call
> > > every time we reset the snapshot. Rescanning might be worth the cost
> > > (e.g. when using BRIN), but that is very unlikely.
> >
> > Hm, I think it is still possible. We could just manually recheck the
> > tuples we see
> > to the snapshot currently used for the scan. If an "old" snapshot can see
> > the tuple also (HeapTupleSatisfiesHistoricMVCC) then search for it in the
> > index summary.
> That's an interesting method.
>
> How would this deal with tuples not visible to the old snapshot?
> Presumably we can assume they're newer than that snapshot (the old
> snapshot didn't have it, but the new one does, so it's committed after
> the old snapshot, making them newer), so that backend must have
> inserted it into the index already, right?

I made a draft of the patch and this idea is not working.

The problem is generally the same:

* reference snapshot sees tuple X
* reference snapshot is used to create index summary (but there is no
tuple X in the index summary)
* tuple X is updated to Y creating a HOT-chain
* we started scan with new temporary snapshot (it sees Y, X is too old for it)
* tuple X is pruned from HOT-chain because it is not protected by any snapshot
* we see tuple Y in the scan with temporary snapshot
    * it is not in the index summary - so, we need to check if
reference snapshot can see it
    * there is no way to understand if the reference snapshot was able
to see tuple X - because we need the full HOT chain (with X tuple) for
that

Best regards,
Michail.






view thread (26+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
  In-Reply-To: <CANtu0oijWPRGRpaRR_OvT2R5YALzscvcOTFh-=uZKUpNJmuZtw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox