From: Mihail Nikalayeu <[email protected]>
To: Fujii Masao <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Robert Treat <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Antonin Houska <[email protected]>
Subject: Re: Adding REPACK [concurrently]
Date: Sat, 9 Aug 2025 14:55:00 +0200
Message-ID: <CADzfLwXx46j8KwQjjM1ZcqNBsx-k6GxHOzDJkm4SHjh+cv31Rw@mail.gmail.com> (raw)
In-Reply-To: <CADzfLwWnMvnHc1-89woNzOzfK2_wqx7bD66p+RwD6xU2fg6Y_g@mail.gmail.com>
References: <CAHGQGwELP6OAOprxjkS5iYSj1KXKmVPULJyYMddE7-uRxrSP_Q@mail.gmail.com>
	<CABV9wwMjm_8GQ_6A7WZkXzEkbz5ZPurB3ijxu74MbFJ4+XrWUg@mail.gmail.com>
	<[email protected]>
	<CAHGQGwHCeqXUoNaJuftWjOkB8_LSmyH7Md9qkR79sOtW9DJe2Q@mail.gmail.com>
	<CADzfLwWnMvnHc1-89woNzOzfK2_wqx7bD66p+RwD6xU2fg6Y_g@mail.gmail.com>

Hello!

One more thing - I think build_new_indexes and
index_concurrently_create_copy are very close in semantics, so it
might be a good idea to refactor them a bit.

I’m still concerned about MVCC-related issues. For many applications
this is a dealbreaker, because in some cases correctness is a higher
priority than availability.

Possible options:

1) Terminate connections with old snapshots.

Add a flag to terminate all connections with snapshots during the
ExclusiveLock period for the swap. From the application’s perspective,
this is not a big deal - it's similar to a primary switch. We would
also need to prevent new snapshots from being taken during the swap
transaction, so a short exclusive lock on ProcArrayLock would also be
required.
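
As a rough illustration of option 1, here is a toy Python model, not
PostgreSQL code. The class and method names are hypothetical, and a
plain mutex stands in for ProcArrayLock. It shows only the ordering:
block new snapshots, terminate every backend that still holds one,
perform the swap, then allow snapshots again.

```python
import threading


class ToyBackendRegistry:
    """Hypothetical stand-in for the set of backends and their snapshots."""

    def __init__(self):
        # Stand-in for ProcArrayLock: taken briefly to register a snapshot,
        # and held for the whole swap so no new snapshot can appear.
        self.snapshot_lock = threading.Lock()
        self.snapshots = {}    # backend id -> snapshot label
        self.terminated = []   # backend ids terminated by the swap

    def take_snapshot(self, backend_id, label):
        with self.snapshot_lock:
            self.snapshots[backend_id] = label

    def release_snapshot(self, backend_id):
        with self.snapshot_lock:
            self.snapshots.pop(backend_id, None)

    def swap_terminating_snapshot_holders(self, do_swap):
        """Terminate all snapshot holders, then run do_swap under the lock."""
        with self.snapshot_lock:
            victims = sorted(self.snapshots)
            for backend_id in victims:
                del self.snapshots[backend_id]
                self.terminated.append(backend_id)
            do_swap()
        return victims
```

In this sketch a backend that has already released its snapshot
survives the swap; only backends still holding one are terminated.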

2) MVCC-safe two-phase approach (inspired by CREATE INDEX).

- copy the data from T1 to the new table T2.
- apply the log.
- take a table-exclusive lock on T1.
- apply the log again.
- instead of swapping, mark T2 as a kind of shadow table - any
transaction applying changes to T1 must also apply them to T2, while
reads still use T1 as the source of truth.
- commit (and record the transaction ID as XID1).
- at this point, all changes are applied to both tables with the same
XIDs because of the "shadow table" mechanism.
- wait until older snapshots no longer treat XID1 as uncommitted.
- now the tables are identical from the MVCC perspective.
- take an exclusive lock on both T1 and T2.
- perform the swap and drop T1.
- commit.
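
The wait step can be illustrated with a toy Python model, which is a
sketch under strong assumptions and not PostgreSQL code. A snapshot is
reduced to a single horizon that treats every XID below it as
committed (real snapshots also carry a list of in-progress XIDs), and
the rows copied into T2 are assumed to be stamped with XID1. All names
are hypothetical. It shows that T1 and T2 look identical only to
snapshots that already see XID1 as committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    key: str
    value: int
    xmin: int                   # XID that created this version
    xmax: Optional[int] = None  # XID that superseded it, if any


def visible(table, horizon):
    """Rows seen by a toy snapshot that treats every xid < horizon as committed."""
    return {
        r.key: r.value
        for r in table
        if r.xmin < horizon and (r.xmax is None or r.xmax >= horizon)
    }


def write(tables, key, value, xid):
    """Upsert `key` in every table in `tables`, using the same xid in each."""
    for table in tables:
        for r in table:
            if r.key == key and r.xmax is None:
                r.xmax = xid
        table.append(RowVersion(key, value, xmin=xid))


def copy_live_rows(source, xid):
    """Copy the live versions; the copies are stamped with the copier's xid."""
    return [RowVersion(r.key, r.value, xmin=xid) for r in source if r.xmax is None]


def safe_to_swap(snapshot_horizons, xid1):
    """True once no remaining snapshot treats xid1 as uncommitted."""
    return all(h > xid1 for h in snapshot_horizons)


# History of T1 before the repack starts.
t1 = []
write([t1], "a", 1, xid=10)
write([t1], "b", 2, xid=11)

# Copy and log apply collapsed into one step, committed as XID1.
XID1 = 20
t2 = copy_live_rows(t1, XID1)

# Shadow phase: a later writer changes both tables with the same xid.
write([t1, t2], "a", 5, xid=21)

old = 12  # snapshot taken before XID1 committed
new = 22  # snapshot taken after the shadow-phase write
# visible(t1, old) == {"a": 1, "b": 2} but visible(t2, old) == {}
# visible(t1, new) == visible(t2, new) == {"a": 5, "b": 2}
```

While the old snapshot exists, safe_to_swap([old, new], XID1) is False
in this model; once only snapshots above XID1 remain, both tables
return the same rows and the swap is invisible to readers.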

Option 2 is more complex and would require implementing some sort of
"shadow table" mechanism, so it might not be worth the effort. Option
1 feels more appealing to me.

If others think this is a good idea, I might try implementing a proof
of concept.

Best regards,
Mikhail




