public inbox for [email protected]
help / color / mirror / Atom feedFrom: Michail Nikolaev <[email protected]>
To: Matthias van de Meent <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Melanie Plageman <[email protected]>
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date: Wed, 18 Dec 2024 00:29:13 +0100
Message-ID: <CANtu0oi+nbipJUsMZcoUfodCyuTN_DAXD22UstjMTYWG=tJ4jw@mail.gmail.com> (raw)
In-Reply-To: <CANtu0ogTfyng-H4yWr3Pm_+PXX+XvDx1AM1sXTy1V7DM6jJ+Bw@mail.gmail.com>
References: <CANtu0oiLc-+7h9zfzOVy2cv2UuYk_5MUReVLnVbOay6OgD_KGg@mail.gmail.com>
<CAEze2WgW6pj48xJhG_YLUE1QS+n9Yv0AZQwaWeb-r+X=HAxU_g@mail.gmail.com>
<CANtu0oizNtPUrPB0Mh+2vyjdijTX=LZvO5_dZN3+NqvE-CFPtw@mail.gmail.com>
<CAEze2Wi3BFLkFBcZ+Brfbr-mGBCcWXcWuHucnCnw5ZOQotc6Eg@mail.gmail.com>
<CANtu0ojRX=osoiXL9JJG6g6qOowXVbVYX+mDsN+2jmFVe=eG7w@mail.gmail.com>
<CAEze2Wg03Ps_StwEhgCdSn7VXY9ZUM=zCrf-m1dRZpTWv6wD_A@mail.gmail.com>
<CANtu0oj66JjAq8xyRSeO=MuRHYS2XsYbhHRRESHtOcLJs=3+Sw@mail.gmail.com>
<CANtu0ogT2Qn7-q_jK6+DqBQvFoTt69eQJDKxJARXV9pdWjd0Gg@mail.gmail.com>
<CANtu0ogXgNkEuxbDRwznAZpxEXRmj3NzOen3y-RGHDwig0YBRw@mail.gmail.com>
<CANtu0oi+FTMqDb+6Bv8w7VHiTFVMB1uAAip_P841WQH+ktPixw@mail.gmail.com>
<CAEze2WgeyVnDb_j4gJQYC4+HcSsYQAdeRA1-F0KDnJ=Y0A_TzA@mail.gmail.com>
<CANtu0oga9zqqEFhdmcWyJTK4d6EGMJsMB_LMgVSE8ar0xVm7Ew@mail.gmail.com>
<CANtu0oirtBK_g4jxtw3jehSop3b0WSQaek5Sv5OGSXwxgcHwZQ@mail.gmail.com>
<CANtu0oijWPRGRpaRR_OvT2R5YALzscvcOTFh-=uZKUpNJmuZtw@mail.gmail.com>
<CAEze2WgHFnYdxkNUmvqxOc-cFUNEYaTqL7+Pei=CtA-ZrTOFyw@mail.gmail.com>
<CANtu0oipL3e8fLnejbH4HnByMW6G_auR4v+ns8j-UHhuPW=9og@mail.gmail.com>
<CANtu0ojmVw8GW5bJknnqSp7Dp1xEuoBewdu2imtQ2tGnWpiWEg@mail.gmail.com>
<CAEze2WgNHTWfw_bP6O0zW_=vi1D-yi1nh6-JDj9kd=8UaB-zLA@mail.gmail.com>
<CANtu0ojA5=rT8BN5==OAiQJZh8CAxD_U8thFhZ3mwrZQ6roNOA@mail.gmail.com>
<CAEze2Wh3eSAnXFdY_6roNPb3WD-YsKbNLiKf=cPmAGHkPUd22w@mail.gmail.com>
<CANtu0og_=ypCbH2ZFayn44i=CL0HAXKW390LfZhQ1F56HoFXtQ@mail.gmail.com>
<CAEze2WghpUS29bJJh5GCZ+WtpO4qWmxiFF-CTWFiP4Qq62G58w@mail.gmail.com>
<CANtu0oiuuGRvYRsH-y0iQjfc+JpT9o4mPUXVkz97+sW9BXA+FA@mail.gmail.com>
<CANtu0og6P+O10XLm-AsoqOhZYEjr8SEHFETadSJ8ifO01YP1qg@mail.gmail.com>
<CANtu0oj0pakvxXhHJhsiKgk=ywY57m623G=OvhJnLVFWe9JCpg@mail.gmail.com>
<CANtu0oiOj65kZqP8ngnsc9O+gywUJATgOjOP6pUARXWsmS9cBQ@mail.gmail.com>
<CANtu0oiT9SPFhs=h8DR4YVPox7TJ6jkfR9JgqT-0L+=uy=Lxng@mail.gmail.com>
<CANtu0ogBOtd9ravu1CUbuZWgq6qvn1rny38PGKDPk9zzQPH8_A@mail.gmail.com>
<CAEze2Wj9SgwOpe_1CWnS_D-txQaQyXArR=dm4DTnha93=yua4g@mail.gmail.com>
<CANtu0ohFr7OzNSbxqBhUpR0mXDYyt0Xt6+=Tbq0EC7as7kr+Lg@mail.gmail.com>
<CANtu0oh4PwBn_h+4p_MxFigRAyJvF-0nA9Tm5NFRwfsWWjZQiA@mail.gmail.com>
<CANtu0ojHEVU9U_bxgViRmtqNTJ92LnF+76-yzn4axYjGsK2kqQ@mail.gmail.com>
<CANtu0ogS871NkdUnZW9P_LVpLzhSJ1+cETK0b55cYjs=v2qbPA@mail.gmail.com>
<CANtu0ohRVBDf4x7Ge3oVzgf4NzMb_DhmTM1ae0u1WUA+CD0UqA@mail.gmail.com>
<CANtu0ogTfyng-H4yWr3Pm_+PXX+XvDx1AM1sXTy1V7DM6jJ+Bw@mail.gmail.com>
Hello!
After [0] fix, I simplified stress tests to single pgbench run without any
forks.
[0]: https://commitfest.postgresql.org/51/5439/
>
Attachments:
[application/octet-stream] v6-0005-Allow-snapshot-resets-during-parallel-concurrent-.patch (29.2K, 3-v6-0005-Allow-snapshot-resets-during-parallel-concurrent-.patch)
download | inline diff:
From 15d61bbb64e5f8e418594d1ea6b50ceb9c65d9d1 Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Mon, 2 Dec 2024 01:33:21 +0100
Subject: [PATCH v6 5/6] Allow snapshot resets during parallel concurrent index
builds
Previously, non-unique concurrent index builds in parallel mode required a
consistent MVCC snapshot throughout the build, which could hold back the xmin
horizon and prevent dead tuple cleanup. This patch extends the previous work
on snapshot resets (introduced for non-parallel builds) to also support
parallel builds.
Key changes:
- Add infrastructure to track snapshot restoration in parallel workers
- Extend parallel scan initialization to support periodic snapshot resets
- Wait for parallel workers to restore their initial snapshots before
proceeding with scan
- Add regression tests to verify behavior with various index types
The snapshot reset approach is safe for non-unique indexes since they don't
need snapshot consistency across the entire scan. For unique indexes, we
continue to maintain a consistent snapshot to properly enforce uniqueness
constraints.
This helps reduce the xmin horizon impact of long-running concurrent index
builds in parallel mode, improving VACUUM's ability to clean up dead tuples.
---
src/backend/access/brin/brin.c | 43 +++++++++-------
src/backend/access/heap/heapam_handler.c | 12 +++--
src/backend/access/nbtree/nbtsort.c | 38 ++++++++++++--
src/backend/access/table/tableam.c | 37 ++++++++++++--
src/backend/access/transam/parallel.c | 50 +++++++++++++++++--
src/backend/executor/nodeSeqscan.c | 3 +-
src/backend/utils/time/snapmgr.c | 8 ---
src/include/access/parallel.h | 3 +-
src/include/access/relscan.h | 1 +
src/include/access/tableam.h | 9 ++--
.../expected/cic_reset_snapshots.out | 23 ++++++++-
.../sql/cic_reset_snapshots.sql | 7 ++-
12 files changed, 178 insertions(+), 56 deletions(-)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index d69859ac4df..0782bd64a6a 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -143,7 +143,6 @@ typedef struct BrinLeader
*/
BrinShared *brinshared;
Sharedsort *sharedsort;
- Snapshot snapshot;
WalUsage *walusage;
BufferUsage *bufferusage;
} BrinLeader;
@@ -231,7 +230,7 @@ static void brin_fill_empty_ranges(BrinBuildState *state,
static void _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
bool isconcurrent, int request);
static void _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state);
-static Size _brin_parallel_estimate_shared(Relation heap, Snapshot snapshot);
+static Size _brin_parallel_estimate_shared(Relation heap);
static double _brin_parallel_heapscan(BrinBuildState *state);
static double _brin_parallel_merge(BrinBuildState *state);
static void _brin_leader_participate_as_worker(BrinBuildState *buildstate,
@@ -2357,7 +2356,6 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
{
ParallelContext *pcxt;
int scantuplesortstates;
- Snapshot snapshot;
Size estbrinshared;
Size estsort;
BrinShared *brinshared;
@@ -2367,6 +2365,7 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
BufferUsage *bufferusage;
bool leaderparticipates = true;
bool need_pop_active_snapshot = true;
+ bool wait_for_snapshot_attach;
int querylen;
#ifdef DISABLE_LEADER_PARTICIPATION
@@ -2388,25 +2387,25 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
* Prepare for scan of the base relation. In a normal index build, we use
* SnapshotAny because we must retrieve all tuples and do our own time
* qual checks (because we have to index RECENTLY_DEAD tuples). In a
- * concurrent build, we take a regular MVCC snapshot and index whatever's
- * live according to that.
+ * concurrent build, we take a regular MVCC snapshot and push it as active.
+ * Later we index whatever's live according to that snapshot while that
+ * snapshot is reset periodically.
*/
if (!isconcurrent)
{
Assert(ActiveSnapshotSet());
- snapshot = SnapshotAny;
need_pop_active_snapshot = false;
}
else
{
- snapshot = RegisterSnapshot(GetTransactionSnapshot());
+ Assert(!ActiveSnapshotSet());
PushActiveSnapshot(GetTransactionSnapshot());
}
/*
* Estimate size for our own PARALLEL_KEY_BRIN_SHARED workspace.
*/
- estbrinshared = _brin_parallel_estimate_shared(heap, snapshot);
+ estbrinshared = _brin_parallel_estimate_shared(heap);
shm_toc_estimate_chunk(&pcxt->estimator, estbrinshared);
estsort = tuplesort_estimate_shared(scantuplesortstates);
shm_toc_estimate_chunk(&pcxt->estimator, estsort);
@@ -2446,8 +2445,6 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
{
if (need_pop_active_snapshot)
PopActiveSnapshot();
- if (IsMVCCSnapshot(snapshot))
- UnregisterSnapshot(snapshot);
DestroyParallelContext(pcxt);
ExitParallelMode();
return;
@@ -2472,7 +2469,8 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
table_parallelscan_initialize(heap,
ParallelTableScanFromBrinShared(brinshared),
- snapshot);
+ isconcurrent ? InvalidSnapshot : SnapshotAny,
+ isconcurrent);
/*
* Store shared tuplesort-private state, for which we reserved space.
@@ -2518,7 +2516,6 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
brinleader->nparticipanttuplesorts++;
brinleader->brinshared = brinshared;
brinleader->sharedsort = sharedsort;
- brinleader->snapshot = snapshot;
brinleader->walusage = walusage;
brinleader->bufferusage = bufferusage;
@@ -2534,6 +2531,16 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
/* Save leader state now that it's clear build will be parallel */
buildstate->bs_leader = brinleader;
+ /*
+ * In case of concurrent build snapshots are going to be reset periodically.
+ * In case when leader going to reset own active snapshot as well - we need to
+ * wait until all workers imported initial snapshot.
+ */
+ wait_for_snapshot_attach = isconcurrent && leaderparticipates;
+
+ if (wait_for_snapshot_attach)
+ WaitForParallelWorkersToAttach(pcxt, true);
+
/* Join heap scan ourselves */
if (leaderparticipates)
_brin_leader_participate_as_worker(buildstate, heap, index);
@@ -2542,7 +2549,8 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
* Caller needs to wait for all launched workers when we return. Make
* sure that the failure-to-start case will not hang forever.
*/
- WaitForParallelWorkersToAttach(pcxt);
+ if (!wait_for_snapshot_attach)
+ WaitForParallelWorkersToAttach(pcxt, false);
if (need_pop_active_snapshot)
PopActiveSnapshot();
}
@@ -2565,9 +2573,6 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
for (i = 0; i < brinleader->pcxt->nworkers_launched; i++)
InstrAccumParallelQuery(&brinleader->bufferusage[i], &brinleader->walusage[i]);
- /* Free last reference to MVCC snapshot, if one was used */
- if (IsMVCCSnapshot(brinleader->snapshot))
- UnregisterSnapshot(brinleader->snapshot);
DestroyParallelContext(brinleader->pcxt);
ExitParallelMode();
}
@@ -2767,14 +2772,14 @@ _brin_parallel_merge(BrinBuildState *state)
/*
* Returns size of shared memory required to store state for a parallel
- * brin index build based on the snapshot its parallel scan will use.
+ * brin index build.
*/
static Size
-_brin_parallel_estimate_shared(Relation heap, Snapshot snapshot)
+_brin_parallel_estimate_shared(Relation heap)
{
/* c.f. shm_toc_allocate as to why BUFFERALIGN is used */
return add_size(BUFFERALIGN(sizeof(BrinShared)),
- table_parallelscan_estimate(heap, snapshot));
+ table_parallelscan_estimate(heap, InvalidSnapshot));
}
/*
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 980c51e32b9..2e5163609c1 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -1231,14 +1231,13 @@ heapam_index_build_range_scan(Relation heapRelation,
* SnapshotAny because we must retrieve all tuples and do our own time
* qual checks (because we have to index RECENTLY_DEAD tuples). In a
* concurrent build, or during bootstrap, we take a regular MVCC snapshot
- * and index whatever's live according to that.
+ * and index whatever's live according to that while that snapshot is reset
+ * every so often (in case of non-unique index).
*/
OldestXmin = InvalidTransactionId;
/*
* For unique index we need consistent snapshot for the whole scan.
- * In case of parallel scan some additional infrastructure required
- * to perform scan with SO_RESET_SNAPSHOT which is not yet ready.
*/
reset_snapshots = indexInfo->ii_Concurrent &&
!indexInfo->ii_Unique &&
@@ -1300,8 +1299,11 @@ heapam_index_build_range_scan(Relation heapRelation,
Assert(!IsBootstrapProcessingMode());
Assert(allow_sync);
snapshot = scan->rs_snapshot;
- PushActiveSnapshot(snapshot);
- need_pop_active_snapshot = true;
+ if (!reset_snapshots)
+ {
+ PushActiveSnapshot(snapshot);
+ need_pop_active_snapshot = true;
+ }
}
hscan = (HeapScanDesc) scan;
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 5c4581afb1a..2acbf121745 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1411,6 +1411,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
BufferUsage *bufferusage;
bool leaderparticipates = true;
bool need_pop_active_snapshot = true;
+ bool reset_snapshot;
+ bool wait_for_snapshot_attach;
int querylen;
#ifdef DISABLE_LEADER_PARTICIPATION
@@ -1428,12 +1430,21 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
scantuplesortstates = leaderparticipates ? request + 1 : request;
+ /*
+ * For concurrent non-unique index builds, we can periodically reset snapshots
+ * to allow the xmin horizon to advance. This is safe since these builds don't
+ * require a consistent view across the entire scan. Unique indexes still need
+ * a stable snapshot to properly enforce uniqueness constraints.
+ */
+ reset_snapshot = isconcurrent && !btspool->isunique;
+
/*
* Prepare for scan of the base relation. In a normal index build, we use
* SnapshotAny because we must retrieve all tuples and do our own time
* qual checks (because we have to index RECENTLY_DEAD tuples). In a
* concurrent build, we take a regular MVCC snapshot and index whatever's
- * live according to that.
+ * live according to that, while that snapshot may be reset periodically in
+ * case of non-unique index.
*/
if (!isconcurrent)
{
@@ -1441,6 +1452,11 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
snapshot = SnapshotAny;
need_pop_active_snapshot = false;
}
+ else if (reset_snapshot)
+ {
+ snapshot = InvalidSnapshot;
+ PushActiveSnapshot(GetTransactionSnapshot());
+ }
else
{
snapshot = RegisterSnapshot(GetTransactionSnapshot());
@@ -1501,7 +1517,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
{
if (need_pop_active_snapshot)
PopActiveSnapshot();
- if (IsMVCCSnapshot(snapshot))
+ if (snapshot != InvalidSnapshot && IsMVCCSnapshot(snapshot))
UnregisterSnapshot(snapshot);
DestroyParallelContext(pcxt);
ExitParallelMode();
@@ -1528,7 +1544,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
btshared->brokenhotchain = false;
table_parallelscan_initialize(btspool->heap,
ParallelTableScanFromBTShared(btshared),
- snapshot);
+ snapshot,
+ reset_snapshot);
/*
* Store shared tuplesort-private state, for which we reserved space.
@@ -1604,6 +1621,16 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
/* Save leader state now that it's clear build will be parallel */
buildstate->btleader = btleader;
+ /*
+ * In case of concurrent build snapshots are going to be reset periodically.
+ * In case when leader going to reset own active snapshot as well - we need to
+ * wait until all workers imported initial snapshot.
+ */
+ wait_for_snapshot_attach = reset_snapshot && leaderparticipates;
+
+ if (wait_for_snapshot_attach)
+ WaitForParallelWorkersToAttach(pcxt, true);
+
/* Join heap scan ourselves */
if (leaderparticipates)
_bt_leader_participate_as_worker(buildstate);
@@ -1612,7 +1639,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
* Caller needs to wait for all launched workers when we return. Make
* sure that the failure-to-start case will not hang forever.
*/
- WaitForParallelWorkersToAttach(pcxt);
+ if (!wait_for_snapshot_attach)
+ WaitForParallelWorkersToAttach(pcxt, false);
if (need_pop_active_snapshot)
PopActiveSnapshot();
}
@@ -1636,7 +1664,7 @@ _bt_end_parallel(BTLeader *btleader)
InstrAccumParallelQuery(&btleader->bufferusage[i], &btleader->walusage[i]);
/* Free last reference to MVCC snapshot, if one was used */
- if (IsMVCCSnapshot(btleader->snapshot))
+ if (btleader->snapshot != InvalidSnapshot && IsMVCCSnapshot(btleader->snapshot))
UnregisterSnapshot(btleader->snapshot);
DestroyParallelContext(btleader->pcxt);
ExitParallelMode();
diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c
index bd8715b6797..cac7a9ea88a 100644
--- a/src/backend/access/table/tableam.c
+++ b/src/backend/access/table/tableam.c
@@ -131,10 +131,10 @@ table_parallelscan_estimate(Relation rel, Snapshot snapshot)
{
Size sz = 0;
- if (IsMVCCSnapshot(snapshot))
+ if (snapshot != InvalidSnapshot && IsMVCCSnapshot(snapshot))
sz = add_size(sz, EstimateSnapshotSpace(snapshot));
else
- Assert(snapshot == SnapshotAny);
+ Assert(snapshot == SnapshotAny || snapshot == InvalidSnapshot);
sz = add_size(sz, rel->rd_tableam->parallelscan_estimate(rel));
@@ -143,21 +143,36 @@ table_parallelscan_estimate(Relation rel, Snapshot snapshot)
void
table_parallelscan_initialize(Relation rel, ParallelTableScanDesc pscan,
- Snapshot snapshot)
+ Snapshot snapshot, bool reset_snapshot)
{
Size snapshot_off = rel->rd_tableam->parallelscan_initialize(rel, pscan);
pscan->phs_snapshot_off = snapshot_off;
- if (IsMVCCSnapshot(snapshot))
+ /*
+ * Initialize parallel scan description. For normal scans with a regular
+ * MVCC snapshot, serialize the snapshot info. For scans that use periodic
+ * snapshot resets, mark the scan accordingly.
+ */
+ if (reset_snapshot)
+ {
+ Assert(snapshot == InvalidSnapshot);
+ pscan->phs_snapshot_any = false;
+ pscan->phs_reset_snapshot = true;
+ INJECTION_POINT("table_parallelscan_initialize");
+ }
+ else if (IsMVCCSnapshot(snapshot))
{
SerializeSnapshot(snapshot, (char *) pscan + pscan->phs_snapshot_off);
pscan->phs_snapshot_any = false;
+ pscan->phs_reset_snapshot = false;
}
else
{
Assert(snapshot == SnapshotAny);
+ Assert(!reset_snapshot);
pscan->phs_snapshot_any = true;
+ pscan->phs_reset_snapshot = false;
}
}
@@ -170,7 +185,19 @@ table_beginscan_parallel(Relation relation, ParallelTableScanDesc pscan)
Assert(RelFileLocatorEquals(relation->rd_locator, pscan->phs_locator));
- if (!pscan->phs_snapshot_any)
+ /*
+ * For scans that
+ * use periodic snapshot resets, mark the scan accordingly and use the active
+ * snapshot as the initial state.
+ */
+ if (pscan->phs_reset_snapshot)
+ {
+ Assert(ActiveSnapshotSet());
+ flags |= SO_RESET_SNAPSHOT;
+ /* Start with current active snapshot. */
+ snapshot = GetActiveSnapshot();
+ }
+ else if (!pscan->phs_snapshot_any)
{
/* Snapshot was serialized -- restore it */
snapshot = RestoreSnapshot((char *) pscan + pscan->phs_snapshot_off);
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 0a1e089ec1d..d49c6ee410f 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
+#define PARALLEL_KEY_SNAPSHOT_RESTORED UINT64CONST(0xFFFFFFFFFFFF0010)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -301,6 +302,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
pcxt->nworkers));
shm_toc_estimate_keys(&pcxt->estimator, 1);
+ shm_toc_estimate_chunk(&pcxt->estimator, mul_size(sizeof(bool),
+ pcxt->nworkers));
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
/* Estimate how much we'll need for the entrypoint info. */
shm_toc_estimate_chunk(&pcxt->estimator, strlen(pcxt->library_name) +
strlen(pcxt->function_name) + 2);
@@ -372,6 +377,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *entrypointstate;
char *uncommittedenumsspace;
char *clientconninfospace;
+ bool *snapshot_set_flag_space;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -487,6 +493,19 @@ InitializeParallelDSM(ParallelContext *pcxt)
strcpy(entrypointstate, pcxt->library_name);
strcpy(entrypointstate + lnamelen + 1, pcxt->function_name);
shm_toc_insert(pcxt->toc, PARALLEL_KEY_ENTRYPOINT, entrypointstate);
+
+ /*
+ * Establish dynamic shared memory to pass information about importing
+ * of snapshot.
+ */
+ snapshot_set_flag_space =
+ shm_toc_allocate(pcxt->toc, mul_size(sizeof(bool), pcxt->nworkers));
+ for (i = 0; i < pcxt->nworkers; ++i)
+ {
+ pcxt->worker[i].snapshot_restored = snapshot_set_flag_space + i * sizeof(bool);
+ *pcxt->worker[i].snapshot_restored = false;
+ }
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_SNAPSHOT_RESTORED, snapshot_set_flag_space);
}
/* Update nworkers_to_launch, in case we changed nworkers above. */
@@ -542,6 +561,17 @@ ReinitializeParallelDSM(ParallelContext *pcxt)
pcxt->worker[i].error_mqh = shm_mq_attach(mq, pcxt->seg, NULL);
}
}
+
+ /* Set snapshot restored flag to false. */
+ if (pcxt->nworkers > 0)
+ {
+ bool *snapshot_restored_space;
+ int i;
+ snapshot_restored_space =
+ shm_toc_lookup(pcxt->toc, PARALLEL_KEY_SNAPSHOT_RESTORED, false);
+ for (i = 0; i < pcxt->nworkers; ++i)
+ snapshot_restored_space[i] = false;
+ }
}
/*
@@ -657,6 +687,10 @@ LaunchParallelWorkers(ParallelContext *pcxt)
* Wait for all workers to attach to their error queues, and throw an error if
* any worker fails to do this.
*
+ * wait_for_snapshot: track whether each parallel worker has successfully restored
+ * its snapshot. This is needed when using periodic snapshot resets to ensure all
+ * workers have a valid initial snapshot before proceeding with the scan.
+ *
* Callers can assume that if this function returns successfully, then the
* number of workers given by pcxt->nworkers_launched have initialized and
* attached to their error queues. Whether or not these workers are guaranteed
@@ -686,7 +720,7 @@ LaunchParallelWorkers(ParallelContext *pcxt)
* call this function at all.
*/
void
-WaitForParallelWorkersToAttach(ParallelContext *pcxt)
+WaitForParallelWorkersToAttach(ParallelContext *pcxt, bool wait_for_snapshot)
{
int i;
@@ -730,9 +764,12 @@ WaitForParallelWorkersToAttach(ParallelContext *pcxt)
mq = shm_mq_get_queue(pcxt->worker[i].error_mqh);
if (shm_mq_get_sender(mq) != NULL)
{
- /* Yes, so it is known to be attached. */
- pcxt->known_attached_workers[i] = true;
- ++pcxt->nknown_attached_workers;
+ if (!wait_for_snapshot || *(pcxt->worker[i].snapshot_restored))
+ {
+ /* Yes, so it is known to be attached. */
+ pcxt->known_attached_workers[i] = true;
+ ++pcxt->nknown_attached_workers;
+ }
}
}
else if (status == BGWH_STOPPED)
@@ -1291,6 +1328,7 @@ ParallelWorkerMain(Datum main_arg)
shm_toc *toc;
FixedParallelState *fps;
char *error_queue_space;
+ bool *snapshot_restored_space;
shm_mq *mq;
shm_mq_handle *mqh;
char *libraryspace;
@@ -1489,6 +1527,10 @@ ParallelWorkerMain(Datum main_arg)
fps->parallel_leader_pgproc);
PushActiveSnapshot(asnapshot);
+ /* Snapshot is restored, set flag to make leader know about it. */
+ snapshot_restored_space = shm_toc_lookup(toc, PARALLEL_KEY_SNAPSHOT_RESTORED, false);
+ snapshot_restored_space[ParallelWorkerNumber] = true;
+
/*
* We've changed which tuples we can see, and must therefore invalidate
* system caches.
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7cb12a11c2d..2907b366791 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -262,7 +262,8 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
table_parallelscan_initialize(node->ss.ss_currentRelation,
pscan,
- estate->es_snapshot);
+ estate->es_snapshot,
+ false);
shm_toc_insert(pcxt->toc, node->ss.ps.plan->plan_node_id, pscan);
node->ss.ss_currentScanDesc =
table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 2189bf0d9ae..b3cc7a2c150 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -287,14 +287,6 @@ GetTransactionSnapshot(void)
Snapshot
GetLatestSnapshot(void)
{
- /*
- * We might be able to relax this, but nothing that could otherwise work
- * needs it.
- */
- if (IsInParallelMode())
- elog(ERROR,
- "cannot update SecondarySnapshot during a parallel operation");
-
/*
* So far there are no cases requiring support for GetLatestSnapshot()
* during logical decoding, but it wouldn't be hard to add if required.
diff --git a/src/include/access/parallel.h b/src/include/access/parallel.h
index 69ffe5498f9..964a7e945be 100644
--- a/src/include/access/parallel.h
+++ b/src/include/access/parallel.h
@@ -26,6 +26,7 @@ typedef struct ParallelWorkerInfo
{
BackgroundWorkerHandle *bgwhandle;
shm_mq_handle *error_mqh;
+ bool *snapshot_restored;
} ParallelWorkerInfo;
typedef struct ParallelContext
@@ -65,7 +66,7 @@ extern void InitializeParallelDSM(ParallelContext *pcxt);
extern void ReinitializeParallelDSM(ParallelContext *pcxt);
extern void ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch);
extern void LaunchParallelWorkers(ParallelContext *pcxt);
-extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt);
+extern void WaitForParallelWorkersToAttach(ParallelContext *pcxt, bool wait_for_snapshot);
extern void WaitForParallelWorkersToFinish(ParallelContext *pcxt);
extern void DestroyParallelContext(ParallelContext *pcxt);
extern bool ParallelContextActive(void);
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index e1884acf493..a9603084aeb 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -88,6 +88,7 @@ typedef struct ParallelTableScanDescData
RelFileLocator phs_locator; /* physical relation to scan */
bool phs_syncscan; /* report location to syncscan logic? */
bool phs_snapshot_any; /* SnapshotAny, not phs_snapshot_data? */
+ bool phs_reset_snapshot; /* use SO_RESET_SNAPSHOT? */
Size phs_snapshot_off; /* data for snapshot */
} ParallelTableScanDescData;
typedef struct ParallelTableScanDescData *ParallelTableScanDesc;
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index f4c7d2a92bf..9ee5ea15fd4 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -1184,7 +1184,8 @@ extern Size table_parallelscan_estimate(Relation rel, Snapshot snapshot);
*/
extern void table_parallelscan_initialize(Relation rel,
ParallelTableScanDesc pscan,
- Snapshot snapshot);
+ Snapshot snapshot,
+ bool reset_snapshot);
/*
* Begin a parallel scan. `pscan` needs to have been initialized with
@@ -1802,9 +1803,9 @@ table_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
* This only really makes sense for heap AM, it might need to be generalized
* for other AMs later.
*
- * In case of non-unique index and non-parallel concurrent build
- * SO_RESET_SNAPSHOT is applied for the scan. That leads for changing snapshots
- * on the fly to allow xmin horizon propagate.
+ * In case of non-unique concurrent index build SO_RESET_SNAPSHOT is applied
+ * for the scan. That leads for changing snapshots on the fly to allow xmin
+ * horizon propagate.
*/
static inline double
table_index_build_scan(Relation table_rel,
diff --git a/src/test/modules/injection_points/expected/cic_reset_snapshots.out b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
index 4cfbbb05923..49ef68d9071 100644
--- a/src/test/modules/injection_points/expected/cic_reset_snapshots.out
+++ b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
@@ -17,6 +17,12 @@ SELECT injection_points_attach('table_beginscan_strat_reset_snapshots', 'notice'
(1 row)
+SELECT injection_points_attach('table_parallelscan_initialize', 'notice');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
CREATE SCHEMA cic_reset_snap;
CREATE TABLE cic_reset_snap.tbl(i int primary key, j int);
INSERT INTO cic_reset_snap.tbl SELECT i, i * I FROM generate_series(1, 200) s(i);
@@ -72,27 +78,40 @@ NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
-- The same in parallel mode
ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=2);
+-- Detach to keep test stable, since parallel worker may complete scan before leader
+SELECT injection_points_detach('heap_reset_scan_snapshot_effective');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+NOTICE: notice triggered for injection point table_parallelscan_initialize
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(MOD(i, 2), j) WHERE MOD(i, 2) = 0;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable(i);
NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
-NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
-NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable_no_param();
+NOTICE: notice triggered for injection point table_parallelscan_initialize
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl USING BRIN(i);
+NOTICE: notice triggered for injection point table_parallelscan_initialize
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
DROP SCHEMA cic_reset_snap CASCADE;
NOTICE: drop cascades to 3 other objects
diff --git a/src/test/modules/injection_points/sql/cic_reset_snapshots.sql b/src/test/modules/injection_points/sql/cic_reset_snapshots.sql
index 4fef5a47431..5d1c31493f0 100644
--- a/src/test/modules/injection_points/sql/cic_reset_snapshots.sql
+++ b/src/test/modules/injection_points/sql/cic_reset_snapshots.sql
@@ -3,7 +3,7 @@ CREATE EXTENSION injection_points;
SELECT injection_points_set_local();
SELECT injection_points_attach('heap_reset_scan_snapshot_effective', 'notice');
SELECT injection_points_attach('table_beginscan_strat_reset_snapshots', 'notice');
-
+SELECT injection_points_attach('table_parallelscan_initialize', 'notice');
CREATE SCHEMA cic_reset_snap;
CREATE TABLE cic_reset_snap.tbl(i int primary key, j int);
@@ -53,6 +53,9 @@ DROP INDEX CONCURRENTLY cic_reset_snap.idx;
-- The same in parallel mode
ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=2);
+-- Detach to keep test stable, since parallel worker may complete scan before leader
+SELECT injection_points_detach('heap_reset_scan_snapshot_effective');
+
CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
@@ -79,4 +82,4 @@ DROP INDEX CONCURRENTLY cic_reset_snap.idx;
DROP SCHEMA cic_reset_snap CASCADE;
-DROP EXTENSION injection_points;
+DROP EXTENSION injection_points;
\ No newline at end of file
--
2.43.0
[application/octet-stream] v6-0004-Allow-advancing-xmin-during-non-unique-non-parall.patch (35.8K, 4-v6-0004-Allow-advancing-xmin-during-non-unique-non-parall.patch)
download | inline diff:
From e85b568a1a8d39ab24bd21bef90d546fce61a726 Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Sat, 30 Nov 2024 17:41:29 +0100
Subject: [PATCH v6 4/6] Allow advancing xmin during non-unique, non-parallel
concurrent index builds by periodically resetting snapshots
Long-running transactions like those used by CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY can hold back the global xmin horizon, preventing VACUUM from cleaning up dead tuples and potentially leading to transaction ID wraparound issues. In PostgreSQL 14, commit d9d076222f5b attempted to allow VACUUM to ignore indexing transactions with CONCURRENTLY to mitigate this problem. However, this was reverted in commit e28bb8851969 because it could cause indexes to miss heap tuples that were HOT-updated and HOT-pruned during the index creation, leading to index corruption.
This patch introduces a safe alternative by periodically resetting the snapshot used during non-unique, non-parallel concurrent index builds. By resetting the snapshot every N pages during the heap scan, we allow the xmin horizon to advance without risking index corruption. This approach is safe for non-unique index builds because they do not enforce uniqueness constraints that require a consistent snapshot across the entire scan.
Currently, this technique is applied to:
Non-parallel index builds: Parallel index builds are not yet supported and will be addressed in a future commit.
Non-unique indexes: Unique index builds still require a consistent snapshot to enforce uniqueness constraints, and support for them may be added in the future.
Only during the first scan of the heap: The second scan during index validation still uses a single snapshot to ensure index correctness.
To implement this, a new scan option SO_RESET_SNAPSHOT is introduced. When set, it causes the snapshot to be reset every SO_RESET_SNAPSHOT_EACH_N_PAGE pages during the scan. The heap scan code is adjusted to support this option, and the index build code is modified to use it for applicable concurrent index builds that are not on system catalogs and not using parallel workers.
This addresses the issues that led to the reversion of commit d9d076222f5b, providing a safe way to allow xmin advancement during long-running non-unique, non-parallel concurrent index builds while ensuring index correctness.
Regression tests are added to verify the behavior.
---
contrib/amcheck/verify_nbtree.c | 3 +-
contrib/pgstattuple/pgstattuple.c | 2 +-
src/backend/access/brin/brin.c | 14 +++
src/backend/access/heap/heapam.c | 46 ++++++++
src/backend/access/heap/heapam_handler.c | 57 ++++++++--
src/backend/access/index/genam.c | 2 +-
src/backend/access/nbtree/nbtsort.c | 14 +++
src/backend/catalog/index.c | 30 +++++-
src/backend/commands/indexcmds.c | 14 +--
src/backend/optimizer/plan/planner.c | 9 ++
src/include/access/tableam.h | 28 ++++-
src/test/modules/injection_points/Makefile | 2 +-
.../expected/cic_reset_snapshots.out | 102 ++++++++++++++++++
src/test/modules/injection_points/meson.build | 1 +
.../sql/cic_reset_snapshots.sql | 82 ++++++++++++++
15 files changed, 375 insertions(+), 31 deletions(-)
create mode 100644 src/test/modules/injection_points/expected/cic_reset_snapshots.out
create mode 100644 src/test/modules/injection_points/sql/cic_reset_snapshots.sql
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..7fb052ce3de 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -689,7 +689,8 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
0, /* number of keys */
NULL, /* scan key */
true, /* buffer access strategy OK */
- true); /* syncscan OK? */
+ true, /* syncscan OK? */
+ false);
/*
* Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 48cb8f59c4f..ff7cc07df99 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -332,7 +332,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
errmsg("only heap AM is supported")));
/* Disable syncscan because we assume we scan from block zero upwards */
- scan = table_beginscan_strat(rel, SnapshotAny, 0, NULL, true, false);
+ scan = table_beginscan_strat(rel, SnapshotAny, 0, NULL, true, false, false);
hscan = (HeapScanDesc) scan;
InitDirtySnapshot(SnapshotDirty);
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 3aedec882cd..d69859ac4df 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2366,6 +2366,7 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
WalUsage *walusage;
BufferUsage *bufferusage;
bool leaderparticipates = true;
+ bool need_pop_active_snapshot = true;
int querylen;
#ifdef DISABLE_LEADER_PARTICIPATION
@@ -2391,9 +2392,16 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
* live according to that.
*/
if (!isconcurrent)
+ {
+ Assert(ActiveSnapshotSet());
snapshot = SnapshotAny;
+ need_pop_active_snapshot = false;
+ }
else
+ {
snapshot = RegisterSnapshot(GetTransactionSnapshot());
+ PushActiveSnapshot(GetTransactionSnapshot());
+ }
/*
* Estimate size for our own PARALLEL_KEY_BRIN_SHARED workspace.
@@ -2436,6 +2444,8 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
/* If no DSM segment was available, back out (do serial build) */
if (pcxt->seg == NULL)
{
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
if (IsMVCCSnapshot(snapshot))
UnregisterSnapshot(snapshot);
DestroyParallelContext(pcxt);
@@ -2515,6 +2525,8 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
/* If no workers were successfully launched, back out (do serial build) */
if (pcxt->nworkers_launched == 0)
{
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
_brin_end_parallel(brinleader, NULL);
return;
}
@@ -2531,6 +2543,8 @@ _brin_begin_parallel(BrinBuildState *buildstate, Relation heap, Relation index,
* sure that the failure-to-start case will not hang forever.
*/
WaitForParallelWorkersToAttach(pcxt);
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
}
/*
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index d00300c5dcb..1fdfdf96482 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -51,6 +51,7 @@
#include "utils/datum.h"
#include "utils/inval.h"
#include "utils/spccache.h"
+#include "utils/injection_point.h"
static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
@@ -566,6 +567,36 @@ heap_prepare_pagescan(TableScanDesc sscan)
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
}
+/*
+ * Reset the active snapshot during a scan.
+ * This ensures the xmin horizon can advance while maintaining safe tuple visibility.
+ * Note: No other snapshot should be active during this operation.
+ */
+static inline void
+heap_reset_scan_snapshot(TableScanDesc sscan)
+{
+ /* Make sure no other snapshot was set as active. */
+ Assert(GetActiveSnapshot() == sscan->rs_snapshot);
+ /* And make sure active snapshot is not registered. */
+ Assert(GetActiveSnapshot()->regd_count == 0);
+ PopActiveSnapshot();
+
+ sscan->rs_snapshot = InvalidSnapshot; /* just ot be tidy */
+ Assert(!HaveRegisteredOrActiveSnapshot());
+ InvalidateCatalogSnapshot();
+
+ /* Goal of snapshot reset is to allow horizon to advance. */
+ Assert(!TransactionIdIsValid(MyProc->xmin));
+#if USE_INJECTION_POINTS
+ /* In some cases it is still not possible due xid assign. */
+ if (!TransactionIdIsValid(MyProc->xid))
+ INJECTION_POINT("heap_reset_scan_snapshot_effective");
+#endif
+
+ PushActiveSnapshot(GetLatestSnapshot());
+ sscan->rs_snapshot = GetActiveSnapshot();
+}
+
/*
* heap_fetch_next_buffer - read and pin the next block from MAIN_FORKNUM.
*
@@ -607,7 +638,13 @@ heap_fetch_next_buffer(HeapScanDesc scan, ScanDirection dir)
scan->rs_cbuf = read_stream_next_buffer(scan->rs_read_stream, NULL);
if (BufferIsValid(scan->rs_cbuf))
+ {
scan->rs_cblock = BufferGetBlockNumber(scan->rs_cbuf);
+#define SO_RESET_SNAPSHOT_EACH_N_PAGE 64
+ if ((scan->rs_base.rs_flags & SO_RESET_SNAPSHOT) &&
+ (scan->rs_cblock % SO_RESET_SNAPSHOT_EACH_N_PAGE == 0))
+ heap_reset_scan_snapshot((TableScanDesc) scan);
+ }
}
/*
@@ -1233,6 +1270,15 @@ heap_endscan(TableScanDesc sscan)
if (scan->rs_parallelworkerdata != NULL)
pfree(scan->rs_parallelworkerdata);
+ if (scan->rs_base.rs_flags & SO_RESET_SNAPSHOT)
+ {
+ Assert(!(scan->rs_base.rs_flags & SO_TEMP_SNAPSHOT));
+ /* Make sure no other snapshot was set as active. */
+ Assert(GetActiveSnapshot() == sscan->rs_snapshot);
+ /* And make sure snapshot is not registered. */
+ Assert(GetActiveSnapshot()->regd_count == 0);
+ }
+
if (scan->rs_base.rs_flags & SO_TEMP_SNAPSHOT)
UnregisterSnapshot(scan->rs_base.rs_snapshot);
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index a8d95e0f1c1..980c51e32b9 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -1190,6 +1190,8 @@ heapam_index_build_range_scan(Relation heapRelation,
ExprContext *econtext;
Snapshot snapshot;
bool need_unregister_snapshot = false;
+ bool need_pop_active_snapshot = false;
+ bool reset_snapshots = false;
TransactionId OldestXmin;
BlockNumber previous_blkno = InvalidBlockNumber;
BlockNumber root_blkno = InvalidBlockNumber;
@@ -1224,9 +1226,6 @@ heapam_index_build_range_scan(Relation heapRelation,
/* Arrange for econtext's scan tuple to be the tuple under test */
econtext->ecxt_scantuple = slot;
- /* Set up execution state for predicate, if any. */
- predicate = ExecPrepareQual(indexInfo->ii_Predicate, estate);
-
/*
* Prepare for scan of the base relation. In a normal index build, we use
* SnapshotAny because we must retrieve all tuples and do our own time
@@ -1236,6 +1235,15 @@ heapam_index_build_range_scan(Relation heapRelation,
*/
OldestXmin = InvalidTransactionId;
+ /*
+ * For unique index we need consistent snapshot for the whole scan.
+ * In case of parallel scan some additional infrastructure required
+ * to perform scan with SO_RESET_SNAPSHOT which is not yet ready.
+ */
+ reset_snapshots = indexInfo->ii_Concurrent &&
+ !indexInfo->ii_Unique &&
+ !is_system_catalog; /* just for the case */
+
/* okay to ignore lazy VACUUMs here */
if (!IsBootstrapProcessingMode() && !indexInfo->ii_Concurrent)
OldestXmin = GetOldestNonRemovableTransactionId(heapRelation);
@@ -1244,24 +1252,41 @@ heapam_index_build_range_scan(Relation heapRelation,
{
/*
* Serial index build.
- *
- * Must begin our own heap scan in this case. We may also need to
- * register a snapshot whose lifetime is under our direct control.
*/
if (!TransactionIdIsValid(OldestXmin))
{
- snapshot = RegisterSnapshot(GetTransactionSnapshot());
- need_unregister_snapshot = true;
+ snapshot = GetTransactionSnapshot();
+ /*
+ * Must begin our own heap scan in this case. We may also need to
+ * register a snapshot whose lifetime is under our direct control.
+ * In case of resetting of snapshot during the scan registration is
+ * not allowed because snapshot is going to be changed every so
+ * often.
+ */
+ if (!reset_snapshots)
+ {
+ snapshot = RegisterSnapshot(snapshot);
+ need_unregister_snapshot = true;
+ }
+ Assert(!ActiveSnapshotSet());
+ PushActiveSnapshot(snapshot);
+ /* store link to snapshot because it may be copied */
+ snapshot = GetActiveSnapshot();
+ need_pop_active_snapshot = true;
}
else
+ {
+ Assert(!indexInfo->ii_Concurrent);
snapshot = SnapshotAny;
+ }
scan = table_beginscan_strat(heapRelation, /* relation */
snapshot, /* snapshot */
0, /* number of keys */
NULL, /* scan key */
true, /* buffer access strategy OK */
- allow_sync); /* syncscan OK? */
+ allow_sync, /* syncscan OK? */
+ reset_snapshots /* reset snapshots? */);
}
else
{
@@ -1275,6 +1300,8 @@ heapam_index_build_range_scan(Relation heapRelation,
Assert(!IsBootstrapProcessingMode());
Assert(allow_sync);
snapshot = scan->rs_snapshot;
+ PushActiveSnapshot(snapshot);
+ need_pop_active_snapshot = true;
}
hscan = (HeapScanDesc) scan;
@@ -1289,6 +1316,13 @@ heapam_index_build_range_scan(Relation heapRelation,
Assert(snapshot == SnapshotAny ? TransactionIdIsValid(OldestXmin) :
!TransactionIdIsValid(OldestXmin));
Assert(snapshot == SnapshotAny || !anyvisible);
+ Assert(snapshot == SnapshotAny || ActiveSnapshotSet());
+
+ /* Set up execution state for predicate, if any. */
+ predicate = ExecPrepareQual(indexInfo->ii_Predicate, estate);
+ /* Clear reference to snapshot since it may be changed by the scan itself. */
+ if (reset_snapshots)
+ snapshot = InvalidSnapshot;
/* Publish number of blocks to scan */
if (progress)
@@ -1724,6 +1758,8 @@ heapam_index_build_range_scan(Relation heapRelation,
table_endscan(scan);
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
/* we can now forget our snapshot, if set and registered by us */
if (need_unregister_snapshot)
UnregisterSnapshot(snapshot);
@@ -1796,7 +1832,8 @@ heapam_index_validate_scan(Relation heapRelation,
0, /* number of keys */
NULL, /* scan key */
true, /* buffer access strategy OK */
- false); /* syncscan not OK */
+ false, /* syncscan not OK */
+ false);
hscan = (HeapScanDesc) scan;
pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 4b4ebff6a17..a104ba9df74 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -463,7 +463,7 @@ systable_beginscan(Relation heapRelation,
*/
sysscan->scan = table_beginscan_strat(heapRelation, snapshot,
nkeys, key,
- true, false);
+ true, false, false);
sysscan->iscan = NULL;
}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 17a352d040c..5c4581afb1a 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1410,6 +1410,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
WalUsage *walusage;
BufferUsage *bufferusage;
bool leaderparticipates = true;
+ bool need_pop_active_snapshot = true;
int querylen;
#ifdef DISABLE_LEADER_PARTICIPATION
@@ -1435,9 +1436,16 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
* live according to that.
*/
if (!isconcurrent)
+ {
+ Assert(ActiveSnapshotSet());
snapshot = SnapshotAny;
+ need_pop_active_snapshot = false;
+ }
else
+ {
snapshot = RegisterSnapshot(GetTransactionSnapshot());
+ PushActiveSnapshot(snapshot);
+ }
/*
* Estimate size for our own PARALLEL_KEY_BTREE_SHARED workspace, and
@@ -1491,6 +1499,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
/* If no DSM segment was available, back out (do serial build) */
if (pcxt->seg == NULL)
{
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
if (IsMVCCSnapshot(snapshot))
UnregisterSnapshot(snapshot);
DestroyParallelContext(pcxt);
@@ -1585,6 +1595,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
/* If no workers were successfully launched, back out (do serial build) */
if (pcxt->nworkers_launched == 0)
{
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
_bt_end_parallel(btleader);
return;
}
@@ -1601,6 +1613,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
* sure that the failure-to-start case will not hang forever.
*/
WaitForParallelWorkersToAttach(pcxt);
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
}
/*
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 05dc6add7eb..e0ada5ce159 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -79,6 +79,7 @@
#include "utils/snapmgr.h"
#include "utils/syscache.h"
#include "utils/tuplesort.h"
+#include "storage/proc.h"
/* Potentially set by pg_upgrade_support functions */
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
@@ -1490,8 +1491,8 @@ index_concurrently_build(Oid heapRelationId,
Relation indexRelation;
IndexInfo *indexInfo;
- /* This had better make sure that a snapshot is active */
- Assert(ActiveSnapshotSet());
+ Assert(!TransactionIdIsValid(MyProc->xmin));
+ Assert(!TransactionIdIsValid(MyProc->xid));
/* Open and lock the parent heap relation */
heapRel = table_open(heapRelationId, ShareUpdateExclusiveLock);
@@ -1509,19 +1510,28 @@ index_concurrently_build(Oid heapRelationId,
indexRelation = index_open(indexRelationId, RowExclusiveLock);
+ /* BuildIndexInfo may require as snapshot for expressions and predicates */
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* We have to re-build the IndexInfo struct, since it was lost in the
* commit of the transaction where this concurrent index was created at
* the catalog level.
*/
indexInfo = BuildIndexInfo(indexRelation);
+ /* Done with snapshot */
+ PopActiveSnapshot();
Assert(!indexInfo->ii_ReadyForInserts);
indexInfo->ii_Concurrent = true;
indexInfo->ii_BrokenHotChain = false;
+ Assert(!TransactionIdIsValid(MyProc->xmin));
/* Now build the index */
index_build(heapRel, indexRelation, indexInfo, false, true);
+ /* Invalidate catalog snapshot just for assert */
+ InvalidateCatalogSnapshot();
+ Assert((indexInfo->ii_ParallelWorkers || indexInfo->ii_Unique) || !TransactionIdIsValid(MyProc->xmin));
+
/* Roll back any GUC changes executed by index functions */
AtEOXact_GUC(false, save_nestlevel);
@@ -1532,12 +1542,19 @@ index_concurrently_build(Oid heapRelationId,
table_close(heapRel, NoLock);
index_close(indexRelation, NoLock);
+ /*
+ * Updating pg_index might involve TOAST table access, so ensure we
+ * have a valid snapshot.
+ */
+ PushActiveSnapshot(GetTransactionSnapshot());
/*
* Update the pg_index row to mark the index as ready for inserts. Once we
* commit this transaction, any new transactions that open the table must
* insert new entries into the index for insertions and non-HOT updates.
*/
index_set_state_flags(indexRelationId, INDEX_CREATE_SET_READY);
+ /* we can do away with our snapshot */
+ PopActiveSnapshot();
}
/*
@@ -3205,7 +3222,8 @@ IndexCheckExclusion(Relation heapRelation,
0, /* number of keys */
NULL, /* scan key */
true, /* buffer access strategy OK */
- true); /* syncscan OK */
+ true, /* syncscan OK */
+ false);
while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
{
@@ -3268,12 +3286,16 @@ IndexCheckExclusion(Relation heapRelation,
* as of the start of the scan (see table_index_build_scan), whereas a normal
* build takes care to include recently-dead tuples. This is OK because
* we won't mark the index valid until all transactions that might be able
- * to see those tuples are gone. The reason for doing that is to avoid
+ * to see those tuples are gone. One of reasons for doing that is to avoid
* bogus unique-index failures due to concurrent UPDATEs (we might see
* different versions of the same row as being valid when we pass over them,
* if we used HeapTupleSatisfiesVacuum). This leaves us with an index that
* does not contain any tuples added to the table while we built the index.
*
+ * Furthermore, in case of non-unique index we set SO_RESET_SNAPSHOT for the
+ * scan, which causes new snapshot to be set as active every so often. The reason
+ * for that is to propagate the xmin horizon forward.
+ *
* Next, we mark the index "indisready" (but still not "indisvalid") and
* commit the second transaction and start a third. Again we wait for all
* transactions that could have been modifying the table to terminate. Now
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 932854d6c60..6c1fce8ed25 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1670,23 +1670,17 @@ DefineIndex(Oid tableId,
* chains can be created where the new tuple and the old tuple in the
* chain have different index keys.
*
- * We now take a new snapshot, and build the index using all tuples that
- * are visible in this snapshot. We can be sure that any HOT updates to
+ * We build the index using all tuples that are visible using single or
+ * multiple refreshing snapshots. We can be sure that any HOT updates to
* these tuples will be compatible with the index, since any updates made
* by transactions that didn't know about the index are now committed or
* rolled back. Thus, each visible tuple is either the end of its
* HOT-chain or the extension of the chain is HOT-safe for this index.
*/
- /* Set ActiveSnapshot since functions in the indexes may need it */
- PushActiveSnapshot(GetTransactionSnapshot());
-
/* Perform concurrent build of index */
index_concurrently_build(tableId, indexRelationId);
- /* we can do away with our snapshot */
- PopActiveSnapshot();
-
/*
* Commit this transaction to make the indisready update visible.
*/
@@ -4084,9 +4078,6 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
if (newidx->safe)
set_indexsafe_procflags();
- /* Set ActiveSnapshot since functions in the indexes may need it */
- PushActiveSnapshot(GetTransactionSnapshot());
-
/*
* Update progress for the index to build, with the correct parent
* table involved.
@@ -4101,7 +4092,6 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
/* Perform concurrent build of new index */
index_concurrently_build(newidx->tableId, newidx->indexId);
- PopActiveSnapshot();
CommitTransactionCommand();
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f3856c519f6..5c7514c96ac 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -61,6 +61,7 @@
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/selfuncs.h"
+#include "utils/snapmgr.h"
/* GUC parameters */
double cursor_tuple_fraction = DEFAULT_CURSOR_TUPLE_FRACTION;
@@ -6779,6 +6780,7 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
Relation heap;
Relation index;
RelOptInfo *rel;
+ bool need_pop_active_snapshot = false;
int parallel_workers;
BlockNumber heap_blocks;
double reltuples;
@@ -6834,6 +6836,11 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
heap = table_open(tableOid, NoLock);
index = index_open(indexOid, NoLock);
+ /* Set ActiveSnapshot since functions in the indexes may need it */
+ if (!ActiveSnapshotSet()) {
+ PushActiveSnapshot(GetTransactionSnapshot());
+ need_pop_active_snapshot = true;
+ }
/*
* Determine if it's safe to proceed.
*
@@ -6891,6 +6898,8 @@ plan_create_index_workers(Oid tableOid, Oid indexOid)
parallel_workers--;
done:
+ if (need_pop_active_snapshot)
+ PopActiveSnapshot();
index_close(index, NoLock);
table_close(heap, NoLock);
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index adb478a93ca..f4c7d2a92bf 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -24,6 +24,7 @@
#include "storage/read_stream.h"
#include "utils/rel.h"
#include "utils/snapshot.h"
+#include "utils/injection_point.h"
#define DEFAULT_TABLE_ACCESS_METHOD "heap"
@@ -69,6 +70,17 @@ typedef enum ScanOptions
* needed. If table data may be needed, set SO_NEED_TUPLES.
*/
SO_NEED_TUPLES = 1 << 10,
+ /*
+ * Reset scan and catalog snapshot every so often? If so, each
+ * SO_RESET_SNAPSHOT_EACH_N_PAGE pages active snapshot is popped,
+ * catalog snapshot invalidated, latest snapshot pushed as active.
+ *
+ * At the end of the scan snapshot is not popped.
+ * Goal of such mode is keep xmin propagating horizon forward.
+ *
+ * see heap_reset_scan_snapshot for details.
+ */
+ SO_RESET_SNAPSHOT = 1 << 11,
} ScanOptions;
/*
@@ -935,7 +947,8 @@ extern TableScanDesc table_beginscan_catalog(Relation relation, int nkeys,
static inline TableScanDesc
table_beginscan_strat(Relation rel, Snapshot snapshot,
int nkeys, struct ScanKeyData *key,
- bool allow_strat, bool allow_sync)
+ bool allow_strat, bool allow_sync,
+ bool reset_snapshot)
{
uint32 flags = SO_TYPE_SEQSCAN | SO_ALLOW_PAGEMODE;
@@ -943,6 +956,15 @@ table_beginscan_strat(Relation rel, Snapshot snapshot,
flags |= SO_ALLOW_STRAT;
if (allow_sync)
flags |= SO_ALLOW_SYNC;
+ if (reset_snapshot)
+ {
+ INJECTION_POINT("table_beginscan_strat_reset_snapshots");
+ /* Active snapshot is required on start. */
+ Assert(GetActiveSnapshot() == snapshot);
+ /* Active snapshot should not be registered to keep xmin propagating. */
+ Assert(GetActiveSnapshot()->regd_count == 0);
+ flags |= (SO_RESET_SNAPSHOT);
+ }
return rel->rd_tableam->scan_begin(rel, snapshot, nkeys, key, NULL, flags);
}
@@ -1779,6 +1801,10 @@ table_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
* very hard to detect whether they're really incompatible with the chain tip.
* This only really makes sense for heap AM, it might need to be generalized
* for other AMs later.
+ *
+ * In case of non-unique index and non-parallel concurrent build
+ * SO_RESET_SNAPSHOT is applied for the scan. That leads for changing snapshots
+ * on the fly to allow xmin horizon propagate.
*/
static inline double
table_index_build_scan(Relation table_rel,
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index f8f86e8f3b6..73893d351bb 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -10,7 +10,7 @@ EXTENSION = injection_points
DATA = injection_points--1.0.sql
PGFILEDESC = "injection_points - facility for injection points"
-REGRESS = injection_points reindex_conc
+REGRESS = injection_points reindex_conc cic_reset_snapshots
REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
ISOLATION = basic inplace \
diff --git a/src/test/modules/injection_points/expected/cic_reset_snapshots.out b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
new file mode 100644
index 00000000000..4cfbbb05923
--- /dev/null
+++ b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
@@ -0,0 +1,102 @@
+CREATE EXTENSION injection_points;
+SELECT injection_points_set_local();
+ injection_points_set_local
+----------------------------
+
+(1 row)
+
+SELECT injection_points_attach('heap_reset_scan_snapshot_effective', 'notice');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT injection_points_attach('table_beginscan_strat_reset_snapshots', 'notice');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+CREATE SCHEMA cic_reset_snap;
+CREATE TABLE cic_reset_snap.tbl(i int primary key, j int);
+INSERT INTO cic_reset_snap.tbl SELECT i, i * I FROM generate_series(1, 200) s(i);
+CREATE FUNCTION cic_reset_snap.predicate_stable(integer) RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+BEGIN
+ EXECUTE 'SELECT txid_current()';
+ RETURN MOD($1, 2) = 0;
+END; $$;
+CREATE FUNCTION cic_reset_snap.predicate_stable_no_param() RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+BEGIN
+ EXECUTE 'SELECT txid_current()';
+ RETURN false;
+END; $$;
+----------------
+ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=0);
+CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(MOD(i, 2), j) WHERE MOD(i, 2) = 0;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable(i);
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable_no_param();
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl USING BRIN(i);
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+-- The same in parallel mode
+ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=2);
+CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(MOD(i, 2), j) WHERE MOD(i, 2) = 0;
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable(i);
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable_no_param();
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl USING BRIN(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP SCHEMA cic_reset_snap CASCADE;
+NOTICE: drop cascades to 3 other objects
+DETAIL: drop cascades to table cic_reset_snap.tbl
+drop cascades to function cic_reset_snap.predicate_stable(integer)
+drop cascades to function cic_reset_snap.predicate_stable_no_param()
+DROP EXTENSION injection_points;
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index 91fc8ce687f..f288633da4f 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -35,6 +35,7 @@ tests += {
'sql': [
'injection_points',
'reindex_conc',
+ 'cic_reset_snapshots',
],
'regress_args': ['--dlpath', meson.build_root() / 'src/test/regress'],
# The injection points are cluster-wide, so disable installcheck
diff --git a/src/test/modules/injection_points/sql/cic_reset_snapshots.sql b/src/test/modules/injection_points/sql/cic_reset_snapshots.sql
new file mode 100644
index 00000000000..4fef5a47431
--- /dev/null
+++ b/src/test/modules/injection_points/sql/cic_reset_snapshots.sql
@@ -0,0 +1,82 @@
+CREATE EXTENSION injection_points;
+
+SELECT injection_points_set_local();
+SELECT injection_points_attach('heap_reset_scan_snapshot_effective', 'notice');
+SELECT injection_points_attach('table_beginscan_strat_reset_snapshots', 'notice');
+
+
+CREATE SCHEMA cic_reset_snap;
+CREATE TABLE cic_reset_snap.tbl(i int primary key, j int);
+INSERT INTO cic_reset_snap.tbl SELECT i, i * I FROM generate_series(1, 200) s(i);
+
+CREATE FUNCTION cic_reset_snap.predicate_stable(integer) RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+BEGIN
+ EXECUTE 'SELECT txid_current()';
+ RETURN MOD($1, 2) = 0;
+END; $$;
+
+CREATE FUNCTION cic_reset_snap.predicate_stable_no_param() RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+BEGIN
+ EXECUTE 'SELECT txid_current()';
+ RETURN false;
+END; $$;
+
+----------------
+ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=0);
+
+CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(MOD(i, 2), j) WHERE MOD(i, 2) = 0;
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable_no_param();
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl USING BRIN(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+-- The same in parallel mode
+ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=2);
+
+CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(MOD(i, 2), j) WHERE MOD(i, 2) = 0;
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i, j) WHERE cic_reset_snap.predicate_stable_no_param();
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl USING BRIN(i);
+REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+DROP INDEX CONCURRENTLY cic_reset_snap.idx;
+
+DROP SCHEMA cic_reset_snap CASCADE;
+
+DROP EXTENSION injection_points;
--
2.43.0
[application/octet-stream] v6-0002-this-is-https-commitfest.postgresql.org-50-5160-m.patch (61.5K, 5-v6-0002-this-is-https-commitfest.postgresql.org-50-5160-m.patch)
download | inline diff:
From 12efb82206cee7843bf17ccabacc91435d0bac5a Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Sat, 30 Nov 2024 11:36:28 +0100
Subject: [PATCH v6 2/6] this is https://commitfest.postgresql.org/50/5160/
merged in single commit. it is required for stability of stress tests.
---
src/backend/commands/indexcmds.c | 4 +-
src/backend/executor/execIndexing.c | 3 +
src/backend/executor/execPartition.c | 119 ++++++++-
src/backend/executor/nodeModifyTable.c | 2 +
src/backend/optimizer/util/plancat.c | 135 +++++++---
src/backend/utils/time/snapmgr.c | 2 +
src/test/modules/injection_points/Makefile | 7 +-
.../expected/index_concurrently_upsert.out | 80 ++++++
.../index_concurrently_upsert_predicate.out | 80 ++++++
.../expected/reindex_concurrently_upsert.out | 238 ++++++++++++++++++
...ndex_concurrently_upsert_on_constraint.out | 238 ++++++++++++++++++
...eindex_concurrently_upsert_partitioned.out | 238 ++++++++++++++++++
src/test/modules/injection_points/meson.build | 11 +
.../specs/index_concurrently_upsert.spec | 68 +++++
.../index_concurrently_upsert_predicate.spec | 70 ++++++
.../specs/reindex_concurrently_upsert.spec | 86 +++++++
...dex_concurrently_upsert_on_constraint.spec | 86 +++++++
...index_concurrently_upsert_partitioned.spec | 88 +++++++
18 files changed, 1505 insertions(+), 50 deletions(-)
create mode 100644 src/test/modules/injection_points/expected/index_concurrently_upsert.out
create mode 100644 src/test/modules/injection_points/expected/index_concurrently_upsert_predicate.out
create mode 100644 src/test/modules/injection_points/expected/reindex_concurrently_upsert.out
create mode 100644 src/test/modules/injection_points/expected/reindex_concurrently_upsert_on_constraint.out
create mode 100644 src/test/modules/injection_points/expected/reindex_concurrently_upsert_partitioned.out
create mode 100644 src/test/modules/injection_points/specs/index_concurrently_upsert.spec
create mode 100644 src/test/modules/injection_points/specs/index_concurrently_upsert_predicate.spec
create mode 100644 src/test/modules/injection_points/specs/reindex_concurrently_upsert.spec
create mode 100644 src/test/modules/injection_points/specs/reindex_concurrently_upsert_on_constraint.spec
create mode 100644 src/test/modules/injection_points/specs/reindex_concurrently_upsert_partitioned.spec
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 4049ce1a10f..932854d6c60 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1766,6 +1766,7 @@ DefineIndex(Oid tableId,
* before the reference snap was taken, we have to wait out any
* transactions that might have older snapshots.
*/
+ INJECTION_POINT("define_index_before_set_valid");
pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
PROGRESS_CREATEIDX_PHASE_WAIT_3);
WaitForOlderSnapshots(limitXmin, true);
@@ -4206,7 +4207,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
* the same time to make sure we only get constraint violations from the
* indexes with the correct names.
*/
-
+ INJECTION_POINT("reindex_relation_concurrently_before_swap");
StartTransactionCommand();
/*
@@ -4285,6 +4286,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein
* index_drop() for more details.
*/
+ INJECTION_POINT("reindex_relation_concurrently_before_set_dead");
pgstat_progress_update_param(PROGRESS_CREATEIDX_PHASE,
PROGRESS_CREATEIDX_PHASE_WAIT_4);
WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index f0a5f8879a9..820749239ca 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -117,6 +117,7 @@
#include "utils/multirangetypes.h"
#include "utils/rangetypes.h"
#include "utils/snapmgr.h"
+#include "utils/injection_point.h"
/* waitMode argument to check_exclusion_or_unique_constraint() */
typedef enum
@@ -936,6 +937,8 @@ retry:
econtext->ecxt_scantuple = save_scantuple;
ExecDropSingleTupleTableSlot(existing_slot);
+ if (!conflict)
+ INJECTION_POINT("check_exclusion_or_unique_constraint_no_conflict");
return !conflict;
}
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 76518862291..aeeee41d5f1 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -483,6 +483,48 @@ ExecFindPartition(ModifyTableState *mtstate,
return rri;
}
+/*
+ * IsIndexCompatibleAsArbiter
+ * Checks if the indexes are identical in terms of being used
+ * as arbiters for the INSERT ON CONFLICT operation by comparing
+ * them to the provided arbiter index.
+ *
+ * Returns the true if indexes are compatible.
+ */
+static bool
+IsIndexCompatibleAsArbiter(Relation arbiterIndexRelation,
+ IndexInfo *arbiterIndexInfo,
+ Relation indexRelation,
+ IndexInfo *indexInfo)
+{
+ int i;
+
+ if (arbiterIndexInfo->ii_Unique != indexInfo->ii_Unique)
+ return false;
+ /* it is not supported for cases of exclusion constraints. */
+ if (arbiterIndexInfo->ii_ExclusionOps != NULL || indexInfo->ii_ExclusionOps != NULL)
+ return false;
+ if (arbiterIndexRelation->rd_index->indnkeyatts != indexRelation->rd_index->indnkeyatts)
+ return false;
+
+ for (i = 0; i < indexRelation->rd_index->indnkeyatts; i++)
+ {
+ int arbiterAttoNo = arbiterIndexRelation->rd_index->indkey.values[i];
+ int attoNo = indexRelation->rd_index->indkey.values[i];
+ if (arbiterAttoNo != attoNo)
+ return false;
+ }
+
+ if (list_difference(RelationGetIndexExpressions(arbiterIndexRelation),
+ RelationGetIndexExpressions(indexRelation)) != NIL)
+ return false;
+
+ if (list_difference(RelationGetIndexPredicate(arbiterIndexRelation),
+ RelationGetIndexPredicate(indexRelation)) != NIL)
+ return false;
+ return true;
+}
+
/*
* ExecInitPartitionInfo
* Lock the partition and initialize ResultRelInfo. Also setup other
@@ -693,6 +735,8 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
if (rootResultRelInfo->ri_onConflictArbiterIndexes != NIL)
{
List *childIdxs;
+ List *nonAncestorIdxs = NIL;
+ int i, j, additional_arbiters = 0;
childIdxs = RelationGetIndexList(leaf_part_rri->ri_RelationDesc);
@@ -703,23 +747,74 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
ListCell *lc2;
ancestors = get_partition_ancestors(childIdx);
- foreach(lc2, rootResultRelInfo->ri_onConflictArbiterIndexes)
+ if (ancestors)
{
- if (list_member_oid(ancestors, lfirst_oid(lc2)))
- arbiterIndexes = lappend_oid(arbiterIndexes, childIdx);
+ foreach(lc2, rootResultRelInfo->ri_onConflictArbiterIndexes)
+ {
+ if (list_member_oid(ancestors, lfirst_oid(lc2)))
+ arbiterIndexes = lappend_oid(arbiterIndexes, childIdx);
+ }
}
+ else /* No ancestor was found for that index. Save it for rechecking later. */
+ nonAncestorIdxs = lappend_oid(nonAncestorIdxs, childIdx);
list_free(ancestors);
}
+
+ /*
+ * If any non-ancestor indexes are found, we need to compare them with other
+ * indexes of the relation that will be used as arbiters. This is necessary
+ * when a partitioned index is processed by REINDEX CONCURRENTLY. Both indexes
+ * must be considered as arbiters to ensure that all concurrent transactions
+ * use the same set of arbiters.
+ */
+ if (nonAncestorIdxs)
+ {
+ for (i = 0; i < leaf_part_rri->ri_NumIndices; i++)
+ {
+ if (list_member_oid(nonAncestorIdxs, leaf_part_rri->ri_IndexRelationDescs[i]->rd_index->indexrelid))
+ {
+ Relation nonAncestorIndexRelation = leaf_part_rri->ri_IndexRelationDescs[i];
+ IndexInfo *nonAncestorIndexInfo = leaf_part_rri->ri_IndexRelationInfo[i];
+ Assert(!list_member_oid(arbiterIndexes, nonAncestorIndexRelation->rd_index->indexrelid));
+
+ /* It is too early to us non-ready indexes as arbiters */
+ if (!nonAncestorIndexInfo->ii_ReadyForInserts)
+ continue;
+
+ for (j = 0; j < leaf_part_rri->ri_NumIndices; j++)
+ {
+ if (list_member_oid(arbiterIndexes,
+ leaf_part_rri->ri_IndexRelationDescs[j]->rd_index->indexrelid))
+ {
+ Relation arbiterIndexRelation = leaf_part_rri->ri_IndexRelationDescs[j];
+ IndexInfo *arbiterIndexInfo = leaf_part_rri->ri_IndexRelationInfo[j];
+
+ /* If non-ancestor index are compatible to arbiter - use it as arbiter too. */
+ if (IsIndexCompatibleAsArbiter(arbiterIndexRelation, arbiterIndexInfo,
+ nonAncestorIndexRelation, nonAncestorIndexInfo))
+ {
+ arbiterIndexes = lappend_oid(arbiterIndexes,
+ nonAncestorIndexRelation->rd_index->indexrelid);
+ additional_arbiters++;
+ }
+ }
+ }
+ }
+ }
+ }
+ list_free(nonAncestorIdxs);
+
+ /*
+ * If the resulting lists are of inequal length, something is wrong.
+ * (This shouldn't happen, since arbiter index selection should not
+ * pick up a non-ready index.)
+ *
+ * But we need to consider an additional arbiter indexes also.
+ */
+ if (list_length(rootResultRelInfo->ri_onConflictArbiterIndexes) !=
+ list_length(arbiterIndexes) - additional_arbiters)
+ elog(ERROR, "invalid arbiter index list");
}
-
- /*
- * If the resulting lists are of inequal length, something is wrong.
- * (This shouldn't happen, since arbiter index selection should not
- * pick up an invalid index.)
- */
- if (list_length(rootResultRelInfo->ri_onConflictArbiterIndexes) !=
- list_length(arbiterIndexes))
- elog(ERROR, "invalid arbiter index list");
leaf_part_rri->ri_onConflictArbiterIndexes = arbiterIndexes;
/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1161520f76b..23cf4c6b540 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -69,6 +69,7 @@
#include "utils/datum.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
+#include "utils/injection_point.h"
typedef struct MTTargetRelLookup
@@ -1087,6 +1088,7 @@ ExecInsert(ModifyTableContext *context,
return NULL;
}
}
+ INJECTION_POINT("exec_insert_before_insert_speculative");
/*
* Before we start insertion proper, acquire our "speculative
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 153390f2dc9..56b58d1ed74 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -714,12 +714,14 @@ infer_arbiter_indexes(PlannerInfo *root)
List *indexList;
ListCell *l;
- /* Normalized inference attributes and inference expressions: */
- Bitmapset *inferAttrs = NULL;
- List *inferElems = NIL;
+ /* Normalized required attributes and expressions: */
+ Bitmapset *requiredArbiterAttrs = NULL;
+ List *requiredArbiterElems = NIL;
+ List *requiredIndexPredExprs = (List *) onconflict->arbiterWhere;
/* Results */
List *results = NIL;
+ bool foundValid = false;
/*
* Quickly return NIL for ON CONFLICT DO NOTHING without an inference
@@ -754,8 +756,8 @@ infer_arbiter_indexes(PlannerInfo *root)
if (!IsA(elem->expr, Var))
{
- /* If not a plain Var, just shove it in inferElems for now */
- inferElems = lappend(inferElems, elem->expr);
+ /* If not a plain Var, just shove it in requiredArbiterElems for now */
+ requiredArbiterElems = lappend(requiredArbiterElems, elem->expr);
continue;
}
@@ -767,30 +769,76 @@ infer_arbiter_indexes(PlannerInfo *root)
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("whole row unique index inference specifications are not supported")));
- inferAttrs = bms_add_member(inferAttrs,
+ requiredArbiterAttrs = bms_add_member(requiredArbiterAttrs,
attno - FirstLowInvalidHeapAttributeNumber);
}
+ indexList = RelationGetIndexList(relation);
+
/*
* Lookup named constraint's index. This is not immediately returned
- * because some additional sanity checks are required.
+ * because some additional sanity checks are required. Additionally, we
+ * need to process other indexes as potential arbiters to account for
+ * cases where REINDEX CONCURRENTLY is processing an index used as a
+ * named constraint.
*/
if (onconflict->constraint != InvalidOid)
{
indexOidFromConstraint = get_constraint_index(onconflict->constraint);
if (indexOidFromConstraint == InvalidOid)
+ {
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("constraint in ON CONFLICT clause has no associated index")));
+ errmsg("constraint in ON CONFLICT clause has no associated index")));
+ }
+
+ /*
+ * Find the named constraint index to extract its attributes and predicates.
+ * We open all indexes in the loop to avoid deadlock of changed order of locks.
+ * */
+ foreach(l, indexList)
+ {
+ Oid indexoid = lfirst_oid(l);
+ Relation idxRel;
+ Form_pg_index idxForm;
+ AttrNumber natt;
+
+ idxRel = index_open(indexoid, rte->rellockmode);
+ idxForm = idxRel->rd_index;
+
+ if (idxForm->indisready)
+ {
+ if (indexOidFromConstraint == idxForm->indexrelid)
+ {
+ /*
+ * Prepare requirements for other indexes to be used as arbiter together
+ * with indexOidFromConstraint. It is required to involve both equals indexes
+ * in case of REINDEX CONCURRENTLY.
+ */
+ for (natt = 0; natt < idxForm->indnkeyatts; natt++)
+ {
+ int attno = idxRel->rd_index->indkey.values[natt];
+
+ if (attno != 0)
+ requiredArbiterAttrs = bms_add_member(requiredArbiterAttrs,
+ attno - FirstLowInvalidHeapAttributeNumber);
+ }
+ requiredArbiterElems = RelationGetIndexExpressions(idxRel);
+ requiredIndexPredExprs = RelationGetIndexPredicate(idxRel);
+ /* We are done, so, quite the loop. */
+ index_close(idxRel, NoLock);
+ break;
+ }
+ }
+ index_close(idxRel, NoLock);
+ }
}
/*
* Using that representation, iterate through the list of indexes on the
* target relation to try and find a match
*/
- indexList = RelationGetIndexList(relation);
-
foreach(l, indexList)
{
Oid indexoid = lfirst_oid(l);
@@ -813,7 +861,13 @@ infer_arbiter_indexes(PlannerInfo *root)
idxRel = index_open(indexoid, rte->rellockmode);
idxForm = idxRel->rd_index;
- if (!idxForm->indisvalid)
+ /*
+ * We need to consider both indisvalid and indisready indexes because
+ * them may become indisvalid before execution phase. It is required
+ * to keep set of indexes used as arbiter to be the same for all
+ * concurrent transactions.
+ */
+ if (!idxForm->indisready)
goto next;
/*
@@ -833,27 +887,23 @@ infer_arbiter_indexes(PlannerInfo *root)
ereport(ERROR,
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
-
- results = lappend_oid(results, idxForm->indexrelid);
- list_free(indexList);
- index_close(idxRel, NoLock);
- table_close(relation, NoLock);
- return results;
+ goto found;
}
else if (indexOidFromConstraint != InvalidOid)
{
- /* No point in further work for index in named constraint case */
- goto next;
+ /* In the case of "ON constraint_name DO UPDATE" we need to skip non-unique candidates. */
+ if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
+ goto next;
+ } else {
+ /*
+ * Only considering conventional inference at this point (not named
+ * constraints), so index under consideration can be immediately
+ * skipped if it's not unique
+ */
+ if (!idxForm->indisunique)
+ goto next;
}
- /*
- * Only considering conventional inference at this point (not named
- * constraints), so index under consideration can be immediately
- * skipped if it's not unique
- */
- if (!idxForm->indisunique)
- goto next;
-
/*
* So-called unique constraints with WITHOUT OVERLAPS are really
* exclusion constraints, so skip those too.
@@ -873,7 +923,7 @@ infer_arbiter_indexes(PlannerInfo *root)
}
/* Non-expression attributes (if any) must match */
- if (!bms_equal(indexedAttrs, inferAttrs))
+ if (!bms_equal(indexedAttrs, requiredArbiterAttrs))
goto next;
/* Expression attributes (if any) must match */
@@ -881,6 +931,10 @@ infer_arbiter_indexes(PlannerInfo *root)
if (idxExprs && varno != 1)
ChangeVarNodes((Node *) idxExprs, 1, varno, 0);
+ /*
+ * If arbiterElems are present, check them. If name >constraint is
+ * present arbiterElems == NIL.
+ */
foreach(el, onconflict->arbiterElems)
{
InferenceElem *elem = (InferenceElem *) lfirst(el);
@@ -918,27 +972,35 @@ infer_arbiter_indexes(PlannerInfo *root)
}
/*
- * Now that all inference elements were matched, ensure that the
+ * In case of the conventional inference involved ensure that the
* expression elements from inference clause are not missing any
* cataloged expressions. This does the right thing when unique
* indexes redundantly repeat the same attribute, or if attributes
* redundantly appear multiple times within an inference clause.
+ *
+ * In the case of named constraint ensure candidate has equal set
+ * of expressions as the named constraint index.
*/
- if (list_difference(idxExprs, inferElems) != NIL)
+ if (list_difference(idxExprs, requiredArbiterElems) != NIL)
goto next;
- /*
- * If it's a partial index, its predicate must be implied by the ON
- * CONFLICT's WHERE clause.
- */
predExprs = RelationGetIndexPredicate(idxRel);
if (predExprs && varno != 1)
ChangeVarNodes((Node *) predExprs, 1, varno, 0);
- if (!predicate_implied_by(predExprs, (List *) onconflict->arbiterWhere, false))
+ /*
+ * If it's a partial index and conventional inference, its predicate must be implied
+ * by the ON CONFLICT's WHERE clause.
+ */
+ if (indexOidFromConstraint == InvalidOid && !predicate_implied_by(predExprs, requiredIndexPredExprs, false))
+ goto next;
+ /* If it's a partial index and named constraint predicates must be equal. */
+ if (indexOidFromConstraint != InvalidOid && list_difference(predExprs, requiredIndexPredExprs) != NIL)
goto next;
+found:
results = lappend_oid(results, idxForm->indexrelid);
+ foundValid |= idxForm->indisvalid;
next:
index_close(idxRel, NoLock);
}
@@ -946,7 +1008,8 @@ next:
list_free(indexList);
table_close(relation, NoLock);
- if (results == NIL)
+ /* It is required to have at least one indisvalid index during the planning. */
+ if (results == NIL || !foundValid)
ereport(ERROR,
(errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
errmsg("there is no unique or exclusion constraint matching the ON CONFLICT specification")));
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index a1a0c2adeb6..2189bf0d9ae 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -64,6 +64,7 @@
#include "utils/resowner.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"
+#include "utils/injection_point.h"
/*
@@ -392,6 +393,7 @@ InvalidateCatalogSnapshot(void)
pairingheap_remove(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
CatalogSnapshot = NULL;
SnapshotResetXmin();
+ INJECTION_POINT("invalidate_catalog_snapshot_end");
}
}
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index 0753a9df58c..f8f86e8f3b6 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -13,7 +13,12 @@ PGFILEDESC = "injection_points - facility for injection points"
REGRESS = injection_points reindex_conc
REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
-ISOLATION = basic inplace
+ISOLATION = basic inplace \
+ reindex_concurrently_upsert \
+ index_concurrently_upsert \
+ reindex_concurrently_upsert_partitioned \
+ reindex_concurrently_upsert_on_constraint \
+ index_concurrently_upsert_predicate
TAP_TESTS = 1
diff --git a/src/test/modules/injection_points/expected/index_concurrently_upsert.out b/src/test/modules/injection_points/expected/index_concurrently_upsert.out
new file mode 100644
index 00000000000..7f0659e8369
--- /dev/null
+++ b/src/test/modules/injection_points/expected/index_concurrently_upsert.out
@@ -0,0 +1,80 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s3_start_create_index s1_start_upsert s4_wakeup_define_index_before_set_valid s2_start_upsert s4_wakeup_s1_from_invalidate_catalog_snapshot s4_wakeup_s2 s4_wakeup_s1
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_create_index: CREATE UNIQUE INDEX CONCURRENTLY tbl_pkey_duplicate ON test.tbl(i); <waiting ...>
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_define_index_before_set_valid:
+ SELECT injection_points_detach('define_index_before_set_valid');
+ SELECT injection_points_wakeup('define_index_before_set_valid');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_create_index: <... completed>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1_from_invalidate_catalog_snapshot:
+ SELECT injection_points_detach('invalidate_catalog_snapshot_end');
+ SELECT injection_points_wakeup('invalidate_catalog_snapshot_end');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
diff --git a/src/test/modules/injection_points/expected/index_concurrently_upsert_predicate.out b/src/test/modules/injection_points/expected/index_concurrently_upsert_predicate.out
new file mode 100644
index 00000000000..2300d5165e9
--- /dev/null
+++ b/src/test/modules/injection_points/expected/index_concurrently_upsert_predicate.out
@@ -0,0 +1,80 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s3_start_create_index s1_start_upsert s4_wakeup_define_index_before_set_valid s2_start_upsert s4_wakeup_s1_from_invalidate_catalog_snapshot s4_wakeup_s2 s4_wakeup_s1
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_create_index: CREATE UNIQUE INDEX CONCURRENTLY tbl_pkey_special_duplicate ON test.tbl(abs(i)) WHERE i < 10000; <waiting ...>
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(abs(i)) where i < 100 do update set updated_at = now(); <waiting ...>
+step s4_wakeup_define_index_before_set_valid:
+ SELECT injection_points_detach('define_index_before_set_valid');
+ SELECT injection_points_wakeup('define_index_before_set_valid');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_create_index: <... completed>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(abs(i)) where i < 100 do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1_from_invalidate_catalog_snapshot:
+ SELECT injection_points_detach('invalidate_catalog_snapshot_end');
+ SELECT injection_points_wakeup('invalidate_catalog_snapshot_end');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
diff --git a/src/test/modules/injection_points/expected/reindex_concurrently_upsert.out b/src/test/modules/injection_points/expected/reindex_concurrently_upsert.out
new file mode 100644
index 00000000000..24bbbcbdd88
--- /dev/null
+++ b/src/test/modules/injection_points/expected/reindex_concurrently_upsert.out
@@ -0,0 +1,238 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s3_start_reindex s1_start_upsert s4_wakeup_to_swap s2_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s2_start_upsert s4_wakeup_to_swap s1_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s4_wakeup_to_swap s1_start_upsert s2_start_upsert s4_wakeup_s1 s4_wakeup_to_set_dead s4_wakeup_s2
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+step s2_start_upsert: <... completed>
diff --git a/src/test/modules/injection_points/expected/reindex_concurrently_upsert_on_constraint.out b/src/test/modules/injection_points/expected/reindex_concurrently_upsert_on_constraint.out
new file mode 100644
index 00000000000..d1cfd1731c8
--- /dev/null
+++ b/src/test/modules/injection_points/expected/reindex_concurrently_upsert_on_constraint.out
@@ -0,0 +1,238 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s3_start_reindex s1_start_upsert s4_wakeup_to_swap s2_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s2_start_upsert s4_wakeup_to_swap s1_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s4_wakeup_to_swap s1_start_upsert s2_start_upsert s4_wakeup_s1 s4_wakeup_to_set_dead s4_wakeup_s2
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_pkey; <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+step s2_start_upsert: <... completed>
diff --git a/src/test/modules/injection_points/expected/reindex_concurrently_upsert_partitioned.out b/src/test/modules/injection_points/expected/reindex_concurrently_upsert_partitioned.out
new file mode 100644
index 00000000000..c95ff264f12
--- /dev/null
+++ b/src/test/modules/injection_points/expected/reindex_concurrently_upsert_partitioned.out
@@ -0,0 +1,238 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s3_start_reindex s1_start_upsert s4_wakeup_to_swap s2_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_partition_pkey; <waiting ...>
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s2_start_upsert s4_wakeup_to_swap s1_start_upsert s4_wakeup_s1 s4_wakeup_s2 s4_wakeup_to_set_dead
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_partition_pkey; <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s2_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+
+starting permutation: s3_start_reindex s4_wakeup_to_swap s1_start_upsert s2_start_upsert s4_wakeup_s1 s4_wakeup_to_set_dead s4_wakeup_s2
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+injection_points_attach
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: REINDEX INDEX CONCURRENTLY test.tbl_partition_pkey; <waiting ...>
+step s4_wakeup_to_swap:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s2_start_upsert: INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); <waiting ...>
+step s4_wakeup_s1:
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s1_start_upsert: <... completed>
+step s4_wakeup_to_set_dead:
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s4_wakeup_s2:
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+step s3_start_reindex: <... completed>
+step s2_start_upsert: <... completed>
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index 58f19001157..91fc8ce687f 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -44,7 +44,16 @@ tests += {
'specs': [
'basic',
'inplace',
+ 'reindex_concurrently_upsert',
+ 'index_concurrently_upsert',
+ 'reindex_concurrently_upsert_partitioned',
+ 'reindex_concurrently_upsert_on_constraint',
+ 'index_concurrently_upsert_predicate',
],
+ # The injection points are cluster-wide, so disable installcheck
+ 'runningcheck': false,
+ # We waiting for all snapshots, so, avoid parallel test executions
+ 'runningcheck-parallel': false,
},
'tap': {
'env': {
@@ -53,5 +62,7 @@ tests += {
'tests': [
't/001_stats.pl',
],
+ # The injection points are cluster-wide, so disable installcheck
+ 'runningcheck': false,
},
}
diff --git a/src/test/modules/injection_points/specs/index_concurrently_upsert.spec b/src/test/modules/injection_points/specs/index_concurrently_upsert.spec
new file mode 100644
index 00000000000..075450935b6
--- /dev/null
+++ b/src/test/modules/injection_points/specs/index_concurrently_upsert.spec
@@ -0,0 +1,68 @@
+# Test race conditions involving:
+# - s1: UPSERT a tuple
+# - s2: UPSERT the same tuple
+# - s3: CREATE UNIQUE INDEX CONCURRENTLY
+# - s4: operations with injection points
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE SCHEMA test;
+ CREATE UNLOGGED TABLE test.tbl(i int primary key, updated_at timestamp);
+ ALTER TABLE test.tbl SET (parallel_workers=0);
+}
+
+teardown
+{
+ DROP SCHEMA test CASCADE;
+ DROP EXTENSION injection_points;
+}
+
+session s1
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'wait');
+ SELECT injection_points_attach('invalidate_catalog_snapshot_end', 'wait');
+}
+step s1_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s2
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('exec_insert_before_insert_speculative', 'wait');
+}
+step s2_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s3
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('define_index_before_set_valid', 'wait');
+}
+step s3_start_create_index { CREATE UNIQUE INDEX CONCURRENTLY tbl_pkey_duplicate ON test.tbl(i); }
+
+session s4
+step s4_wakeup_s1 {
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+}
+step s4_wakeup_s1_from_invalidate_catalog_snapshot {
+ SELECT injection_points_detach('invalidate_catalog_snapshot_end');
+ SELECT injection_points_wakeup('invalidate_catalog_snapshot_end');
+}
+step s4_wakeup_s2 {
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+}
+step s4_wakeup_define_index_before_set_valid {
+ SELECT injection_points_detach('define_index_before_set_valid');
+ SELECT injection_points_wakeup('define_index_before_set_valid');
+}
+
+permutation
+ s3_start_create_index
+ s1_start_upsert
+ s4_wakeup_define_index_before_set_valid
+ s2_start_upsert
+ s4_wakeup_s1_from_invalidate_catalog_snapshot
+ s4_wakeup_s2
+ s4_wakeup_s1
\ No newline at end of file
diff --git a/src/test/modules/injection_points/specs/index_concurrently_upsert_predicate.spec b/src/test/modules/injection_points/specs/index_concurrently_upsert_predicate.spec
new file mode 100644
index 00000000000..70a27475e10
--- /dev/null
+++ b/src/test/modules/injection_points/specs/index_concurrently_upsert_predicate.spec
@@ -0,0 +1,70 @@
+# Test race conditions involving:
+# - s1: UPSERT a tuple
+# - s2: UPSERT the same tuple
+# - s3: CREATE UNIQUE INDEX CONCURRENTLY
+# - s4: operations with injection points
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE SCHEMA test;
+ CREATE UNLOGGED TABLE test.tbl(i int, updated_at timestamp);
+
+ CREATE UNIQUE INDEX tbl_pkey_special ON test.tbl(abs(i)) WHERE i < 1000;
+ ALTER TABLE test.tbl SET (parallel_workers=0);
+}
+
+teardown
+{
+ DROP SCHEMA test CASCADE;
+ DROP EXTENSION injection_points;
+}
+
+session s1
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'wait');
+ SELECT injection_points_attach('invalidate_catalog_snapshot_end', 'wait');
+}
+step s1_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(abs(i)) where i < 100 do update set updated_at = now(); }
+
+session s2
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('exec_insert_before_insert_speculative', 'wait');
+}
+step s2_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(abs(i)) where i < 100 do update set updated_at = now(); }
+
+session s3
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('define_index_before_set_valid', 'wait');
+}
+step s3_start_create_index { CREATE UNIQUE INDEX CONCURRENTLY tbl_pkey_special_duplicate ON test.tbl(abs(i)) WHERE i < 10000;}
+
+session s4
+step s4_wakeup_s1 {
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+}
+step s4_wakeup_s1_from_invalidate_catalog_snapshot {
+ SELECT injection_points_detach('invalidate_catalog_snapshot_end');
+ SELECT injection_points_wakeup('invalidate_catalog_snapshot_end');
+}
+step s4_wakeup_s2 {
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+}
+step s4_wakeup_define_index_before_set_valid {
+ SELECT injection_points_detach('define_index_before_set_valid');
+ SELECT injection_points_wakeup('define_index_before_set_valid');
+}
+
+permutation
+ s3_start_create_index
+ s1_start_upsert
+ s4_wakeup_define_index_before_set_valid
+ s2_start_upsert
+ s4_wakeup_s1_from_invalidate_catalog_snapshot
+ s4_wakeup_s2
+ s4_wakeup_s1
\ No newline at end of file
diff --git a/src/test/modules/injection_points/specs/reindex_concurrently_upsert.spec b/src/test/modules/injection_points/specs/reindex_concurrently_upsert.spec
new file mode 100644
index 00000000000..38b86d84345
--- /dev/null
+++ b/src/test/modules/injection_points/specs/reindex_concurrently_upsert.spec
@@ -0,0 +1,86 @@
+# Test race conditions involving:
+# - s1: UPSERT a tuple
+# - s2: UPSERT the same tuple
+# - s3: REINDEX concurrent primary key index
+# - s4: operations with injection points
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE SCHEMA test;
+ CREATE UNLOGGED TABLE test.tbl(i int primary key, updated_at timestamp);
+ ALTER TABLE test.tbl SET (parallel_workers=0);
+}
+
+teardown
+{
+ DROP SCHEMA test CASCADE;
+ DROP EXTENSION injection_points;
+}
+
+session s1
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'wait');
+}
+step s1_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s2
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('exec_insert_before_insert_speculative', 'wait');
+}
+step s2_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s3
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('reindex_relation_concurrently_before_set_dead', 'wait');
+ SELECT injection_points_attach('reindex_relation_concurrently_before_swap', 'wait');
+}
+step s3_start_reindex { REINDEX INDEX CONCURRENTLY test.tbl_pkey; }
+
+session s4
+step s4_wakeup_to_swap {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+}
+step s4_wakeup_s1 {
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+}
+step s4_wakeup_s2 {
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+}
+step s4_wakeup_to_set_dead {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+}
+
+permutation
+ s3_start_reindex
+ s1_start_upsert
+ s4_wakeup_to_swap
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s2_start_upsert
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_to_set_dead
+ s4_wakeup_s2
\ No newline at end of file
diff --git a/src/test/modules/injection_points/specs/reindex_concurrently_upsert_on_constraint.spec b/src/test/modules/injection_points/specs/reindex_concurrently_upsert_on_constraint.spec
new file mode 100644
index 00000000000..7d8e371bb0a
--- /dev/null
+++ b/src/test/modules/injection_points/specs/reindex_concurrently_upsert_on_constraint.spec
@@ -0,0 +1,86 @@
+# Test race conditions involving:
+# - s1: UPSERT a tuple
+# - s2: UPSERT the same tuple
+# - s3: REINDEX concurrent primary key index
+# - s4: operations with injection points
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE SCHEMA test;
+ CREATE UNLOGGED TABLE test.tbl(i int primary key, updated_at timestamp);
+ ALTER TABLE test.tbl SET (parallel_workers=0);
+}
+
+teardown
+{
+ DROP SCHEMA test CASCADE;
+ DROP EXTENSION injection_points;
+}
+
+session s1
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'wait');
+}
+step s1_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); }
+
+session s2
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('exec_insert_before_insert_speculative', 'wait');
+}
+step s2_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict on constraint tbl_pkey do update set updated_at = now(); }
+
+session s3
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('reindex_relation_concurrently_before_set_dead', 'wait');
+ SELECT injection_points_attach('reindex_relation_concurrently_before_swap', 'wait');
+}
+step s3_start_reindex { REINDEX INDEX CONCURRENTLY test.tbl_pkey; }
+
+session s4
+step s4_wakeup_to_swap {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+}
+step s4_wakeup_s1 {
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+}
+step s4_wakeup_s2 {
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+}
+step s4_wakeup_to_set_dead {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+}
+
+permutation
+ s3_start_reindex
+ s1_start_upsert
+ s4_wakeup_to_swap
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s2_start_upsert
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_to_set_dead
+ s4_wakeup_s2
\ No newline at end of file
diff --git a/src/test/modules/injection_points/specs/reindex_concurrently_upsert_partitioned.spec b/src/test/modules/injection_points/specs/reindex_concurrently_upsert_partitioned.spec
new file mode 100644
index 00000000000..b9253463039
--- /dev/null
+++ b/src/test/modules/injection_points/specs/reindex_concurrently_upsert_partitioned.spec
@@ -0,0 +1,88 @@
+# Test race conditions involving:
+# - s1: UPSERT a tuple
+# - s2: UPSERT the same tuple
+# - s3: REINDEX concurrent primary key index
+# - s4: operations with injection points
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE SCHEMA test;
+ CREATE TABLE test.tbl(i int primary key, updated_at timestamp) PARTITION BY RANGE (i);
+ CREATE TABLE test.tbl_partition PARTITION OF test.tbl
+ FOR VALUES FROM (0) TO (10000)
+ WITH (parallel_workers = 0);
+}
+
+teardown
+{
+ DROP SCHEMA test CASCADE;
+ DROP EXTENSION injection_points;
+}
+
+session s1
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'wait');
+}
+step s1_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s2
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('exec_insert_before_insert_speculative', 'wait');
+}
+step s2_start_upsert { INSERT INTO test.tbl VALUES(13,now()) on conflict(i) do update set updated_at = now(); }
+
+session s3
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('reindex_relation_concurrently_before_set_dead', 'wait');
+ SELECT injection_points_attach('reindex_relation_concurrently_before_swap', 'wait');
+}
+step s3_start_reindex { REINDEX INDEX CONCURRENTLY test.tbl_partition_pkey; }
+
+session s4
+step s4_wakeup_to_swap {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_swap');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_swap');
+}
+step s4_wakeup_s1 {
+ SELECT injection_points_detach('check_exclusion_or_unique_constraint_no_conflict');
+ SELECT injection_points_wakeup('check_exclusion_or_unique_constraint_no_conflict');
+}
+step s4_wakeup_s2 {
+ SELECT injection_points_detach('exec_insert_before_insert_speculative');
+ SELECT injection_points_wakeup('exec_insert_before_insert_speculative');
+}
+step s4_wakeup_to_set_dead {
+ SELECT injection_points_detach('reindex_relation_concurrently_before_set_dead');
+ SELECT injection_points_wakeup('reindex_relation_concurrently_before_set_dead');
+}
+
+permutation
+ s3_start_reindex
+ s1_start_upsert
+ s4_wakeup_to_swap
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s2_start_upsert
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_s2
+ s4_wakeup_to_set_dead
+
+permutation
+ s3_start_reindex
+ s4_wakeup_to_swap
+ s1_start_upsert
+ s2_start_upsert
+ s4_wakeup_s1
+ s4_wakeup_to_set_dead
+ s4_wakeup_s2
\ No newline at end of file
--
2.43.0
[application/octet-stream] v6-0003-Add-stress-tests-for-concurrent-index-operations.patch (6.5K, 6-v6-0003-Add-stress-tests-for-concurrent-index-operations.patch)
download | inline diff:
From 212a59c454c7584f1b020e9b847da5bd86e22f56 Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Sat, 30 Nov 2024 16:24:20 +0100
Subject: [PATCH v6 3/6] Add stress tests for concurrent index operations
Add comprehensive stress tests for concurrent index operations, focusing on:
* Testing CREATE/REINDEX/DROP INDEX CONCURRENTLY under heavy write load
* Verifying index integrity during concurrent HOT updates
* Testing various index types including unique and partial indexes
* Validating index correctness using amcheck
* Exercising parallel worker configurations
These stress tests help ensure reliability of concurrent index operations
under heavy load conditions.
---
src/bin/pg_amcheck/meson.build | 1 +
src/bin/pg_amcheck/t/006_cic.pl | 144 ++++++++++++++++++++++++++++++++
2 files changed, 145 insertions(+)
create mode 100644 src/bin/pg_amcheck/t/006_cic.pl
diff --git a/src/bin/pg_amcheck/meson.build b/src/bin/pg_amcheck/meson.build
index 292b33eb094..4a8f4fbc8b0 100644
--- a/src/bin/pg_amcheck/meson.build
+++ b/src/bin/pg_amcheck/meson.build
@@ -28,6 +28,7 @@ tests += {
't/003_check.pl',
't/004_verify_heapam.pl',
't/005_opclass_damage.pl',
+ 't/006_cic.pl',
],
},
}
diff --git a/src/bin/pg_amcheck/t/006_cic.pl b/src/bin/pg_amcheck/t/006_cic.pl
new file mode 100644
index 00000000000..002348b8366
--- /dev/null
+++ b/src/bin/pg_amcheck/t/006_cic.pl
@@ -0,0 +1,144 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+# Test REINDEX CONCURRENTLY with concurrent modifications and HOT updates
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+Test::More->builder->todo_start('filesystem bug')
+ if PostgreSQL::Test::Utils::has_wal_read_bug;
+
+my ($node, $result);
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('RC_test');
+$node->init;
+$node->append_conf('postgresql.conf',
+ 'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
+$node->append_conf('postgresql.conf', 'fsync = off');
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int primary key,
+ c1 money default 0, c2 money default 0,
+ c3 money default 0, updated_at timestamp)));
+$node->safe_psql('postgres', q(CREATE INDEX CONCURRENTLY idx ON tbl(i, updated_at);));
+# create sequence
+$node->safe_psql('postgres', q(CREATE UNLOGGED SEQUENCE in_row_rebuild START 1 INCREMENT 1;));
+$node->safe_psql('postgres', q(SELECT nextval('in_row_rebuild');));
+
+# Create helper functions for predicate tests
+$node->safe_psql('postgres', q(
+ CREATE FUNCTION predicate_stable() RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+ BEGIN
+ EXECUTE 'SELECT txid_current()';
+ RETURN true;
+ END; $$;
+));
+
+$node->safe_psql('postgres', q(
+ CREATE FUNCTION predicate_const(integer) RETURNS bool IMMUTABLE
+ LANGUAGE plpgsql AS $$
+ BEGIN
+ RETURN MOD($1, 2) = 0;
+ END; $$;
+));
+
+# Run CIC/RIC in different options concurrently with upserts
+$node->pgbench(
+ '--no-vacuum --client=30 --jobs=4 --exit-on-abort --transactions=2500',
+ 0,
+ [qr{actually processed}],
+ [qr{^$}],
+ 'concurrent operations with REINDEX/CREATE INDEX CONCURRENTLY',
+ {
+ 'concurrent_ops' => q(
+ SELECT pg_try_advisory_lock(42)::integer AS gotlock \gset
+ \if :gotlock
+ SELECT nextval('in_row_rebuild') AS last_value \gset
+ \set variant random(0, 5)
+ \set parallels random(0, 4)
+ \if :last_value < 3
+ ALTER TABLE tbl SET (parallel_workers=:parallels);
+ \if :variant = 0
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(i, updated_at);
+ \elif :variant = 1
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(i, updated_at) WHERE predicate_stable();
+ \elif :variant = 2
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(i, updated_at) WHERE MOD(i, 2) = 0;
+ \elif :variant = 3
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(i, updated_at) WHERE predicate_const(i);
+ \elif :variant = 4
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(predicate_const(i));
+ \elif :variant = 5
+ CREATE INDEX CONCURRENTLY idx_2 ON tbl(i, predicate_const(i), updated_at) WHERE predicate_const(i);
+ \endif
+ --\sleep 200 ms
+ SELECT bt_index_check('idx_2', heapallindexed => true, checkunique => true);
+ REINDEX INDEX CONCURRENTLY idx_2;
+ --\sleep 200 ms
+ SELECT bt_index_check('idx_2', heapallindexed => true, checkunique => true);
+ DROP INDEX CONCURRENTLY idx_2;
+ \endif
+ SELECT pg_advisory_unlock(42);
+ \else
+ \set num random(1000, 100000)
+ BEGIN;
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ SELECT setval('in_row_rebuild', 1);
+ COMMIT;
+ \endif
+ )
+ });
+
+$node->safe_psql('postgres', q(TRUNCATE TABLE tbl;));
+
+# Run CIC/RIC for unique index concurrently with upserts
+$node->pgbench(
+ '--no-vacuum --client=30 --jobs=4 --exit-on-abort --transactions=2500',
+ 0,
+ [qr{actually processed}],
+ [qr{^$}],
+ 'concurrent operations with REINDEX/CREATE INDEX CONCURRENTLY',
+ {
+ 'concurrent_ops_unique_idx' => q(
+ SELECT pg_try_advisory_lock(42)::integer AS gotlock \gset
+ \if :gotlock
+ SELECT nextval('in_row_rebuild') AS last_value \gset
+ \set parallels random(0, 4)
+ \if :last_value < 3
+ ALTER TABLE tbl SET (parallel_workers=:parallels);
+ CREATE UNIQUE INDEX CONCURRENTLY idx_2 ON tbl(i);
+ --\sleep 200 ms
+ SELECT bt_index_check('idx_2', heapallindexed => true, checkunique => true);
+ REINDEX INDEX CONCURRENTLY idx_2;
+ --\sleep 200 ms
+ SELECT bt_index_check('idx_2', heapallindexed => true, checkunique => true);
+ DROP INDEX CONCURRENTLY idx_2;
+ \endif
+ SELECT pg_advisory_unlock(42);
+ \else
+ \set num random(1, power(10, random(1, 5)))
+ INSERT INTO tbl VALUES(floor(random()*:num),0,0,0,now())
+ ON CONFLICT(i) DO UPDATE SET updated_at = now();
+ SELECT setval('in_row_rebuild', 1);
+ \endif
+ )
+ });
+
+$node->stop;
+done_testing();
\ No newline at end of file
--
2.43.0
[application/octet-stream] v6-0006-Allow-snapshot-resets-in-concurrent-unique-index-.patch (32.5K, 7-v6-0006-Allow-snapshot-resets-in-concurrent-unique-index-.patch)
download | inline diff:
From dc8447015383a3c38c71570749b697b25c7aceb7 Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Sat, 7 Dec 2024 23:27:34 +0100
Subject: [PATCH v6 6/6] Allow snapshot resets in concurrent unique index
builds
Previously, concurrent unique index builds used a fixed snapshot for the entire
scan to ensure proper uniqueness checks. This could delay vacuum's ability to
clean up dead tuples.
Now reset snapshots periodically during concurrent unique index builds, while
still maintaining uniqueness by:
1. Ignoring dead tuples during uniqueness checks in tuplesort
2. Adding a uniqueness check in _bt_load that detects multiple alive tuples with the same key values
This improves vacuum effectiveness during long-running index builds without
compromising index uniqueness enforcement.
---
src/backend/access/heap/heapam_handler.c | 6 +-
src/backend/access/nbtree/nbtdedup.c | 8 +-
src/backend/access/nbtree/nbtsort.c | 173 ++++++++++++++----
src/backend/access/nbtree/nbtsplitloc.c | 12 +-
src/backend/access/nbtree/nbtutils.c | 29 ++-
src/backend/catalog/index.c | 6 +-
src/backend/utils/sort/tuplesortvariants.c | 67 +++++--
src/include/access/nbtree.h | 4 +-
src/include/access/tableam.h | 5 +-
src/include/utils/tuplesort.h | 1 +
.../expected/cic_reset_snapshots.out | 6 +
11 files changed, 242 insertions(+), 75 deletions(-)
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2e5163609c1..921b806642a 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -1232,15 +1232,15 @@ heapam_index_build_range_scan(Relation heapRelation,
* qual checks (because we have to index RECENTLY_DEAD tuples). In a
* concurrent build, or during bootstrap, we take a regular MVCC snapshot
* and index whatever's live according to that while that snapshot is reset
- * every so often (in case of non-unique index).
+ * every so often.
*/
OldestXmin = InvalidTransactionId;
/*
- * For unique index we need consistent snapshot for the whole scan.
+ * For concurrent builds of non-system indexes, we may want to periodically
+ * reset snapshots to allow vacuum to clean up tuples.
*/
reset_snapshots = indexInfo->ii_Concurrent &&
- !indexInfo->ii_Unique &&
!is_system_catalog; /* just for the case */
/* okay to ignore lazy VACUUMs here */
diff --git a/src/backend/access/nbtree/nbtdedup.c b/src/backend/access/nbtree/nbtdedup.c
index 456d86b51c9..31b59265a29 100644
--- a/src/backend/access/nbtree/nbtdedup.c
+++ b/src/backend/access/nbtree/nbtdedup.c
@@ -148,7 +148,7 @@ _bt_dedup_pass(Relation rel, Buffer buf, IndexTuple newitem, Size newitemsz,
_bt_dedup_start_pending(state, itup, offnum);
}
else if (state->deduplicate &&
- _bt_keep_natts_fast(rel, state->base, itup) > nkeyatts &&
+ _bt_keep_natts_fast(rel, state->base, itup, NULL) > nkeyatts &&
_bt_dedup_save_htid(state, itup))
{
/*
@@ -374,7 +374,7 @@ _bt_bottomupdel_pass(Relation rel, Buffer buf, Relation heapRel,
/* itup starts first pending interval */
_bt_dedup_start_pending(state, itup, offnum);
}
- else if (_bt_keep_natts_fast(rel, state->base, itup) > nkeyatts &&
+ else if (_bt_keep_natts_fast(rel, state->base, itup, NULL) > nkeyatts &&
_bt_dedup_save_htid(state, itup))
{
/* Tuple is equal; just added its TIDs to pending interval */
@@ -789,12 +789,12 @@ _bt_do_singleval(Relation rel, Page page, BTDedupState state,
itemid = PageGetItemId(page, minoff);
itup = (IndexTuple) PageGetItem(page, itemid);
- if (_bt_keep_natts_fast(rel, newitem, itup) > nkeyatts)
+ if (_bt_keep_natts_fast(rel, newitem, itup, NULL) > nkeyatts)
{
itemid = PageGetItemId(page, PageGetMaxOffsetNumber(page));
itup = (IndexTuple) PageGetItem(page, itemid);
- if (_bt_keep_natts_fast(rel, newitem, itup) > nkeyatts)
+ if (_bt_keep_natts_fast(rel, newitem, itup, NULL) > nkeyatts)
return true;
}
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index 2acbf121745..ac9e5acfc53 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -83,6 +83,7 @@ typedef struct BTSpool
Relation index;
bool isunique;
bool nulls_not_distinct;
+ bool unique_dead_ignored;
} BTSpool;
/*
@@ -101,6 +102,7 @@ typedef struct BTShared
Oid indexrelid;
bool isunique;
bool nulls_not_distinct;
+ bool unique_dead_ignored;
bool isconcurrent;
int scantuplesortstates;
@@ -203,15 +205,13 @@ typedef struct BTLeader
*/
typedef struct BTBuildState
{
- bool isunique;
- bool nulls_not_distinct;
bool havedead;
Relation heap;
BTSpool *spool;
/*
- * spool2 is needed only when the index is a unique index. Dead tuples are
- * put into spool2 instead of spool in order to avoid uniqueness check.
+ * spool2 is needed only when the index is a unique index and build non-concurrently.
+ * Dead tuples are put into spool2 instead of spool in order to avoid uniqueness check.
*/
BTSpool *spool2;
double indtuples;
@@ -303,8 +303,6 @@ btbuild(Relation heap, Relation index, IndexInfo *indexInfo)
ResetUsage();
#endif /* BTREE_BUILD_STATS */
- buildstate.isunique = indexInfo->ii_Unique;
- buildstate.nulls_not_distinct = indexInfo->ii_NullsNotDistinct;
buildstate.havedead = false;
buildstate.heap = heap;
buildstate.spool = NULL;
@@ -379,6 +377,11 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
btspool->index = index;
btspool->isunique = indexInfo->ii_Unique;
btspool->nulls_not_distinct = indexInfo->ii_NullsNotDistinct;
+ /*
+ * We need to ignore dead tuples for unique checks in case of concurrent build.
+ * It is required because or periodic reset of snapshot.
+ */
+ btspool->unique_dead_ignored = indexInfo->ii_Concurrent && indexInfo->ii_Unique;
/* Save as primary spool */
buildstate->spool = btspool;
@@ -427,8 +430,9 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
* the use of parallelism or any other factor.
*/
buildstate->spool->sortstate =
- tuplesort_begin_index_btree(heap, index, buildstate->isunique,
- buildstate->nulls_not_distinct,
+ tuplesort_begin_index_btree(heap, index, btspool->isunique,
+ btspool->nulls_not_distinct,
+ btspool->unique_dead_ignored,
maintenance_work_mem, coordinate,
TUPLESORT_NONE);
@@ -436,8 +440,12 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
* If building a unique index, put dead tuples in a second spool to keep
* them out of the uniqueness check. We expect that the second spool (for
* dead tuples) won't get very full, so we give it only work_mem.
+ *
+ * In case of concurrent build dead tuples are not need to be put into index
+ * since we wait for all snapshots older than reference snapshot during the
+ * validation phase.
*/
- if (indexInfo->ii_Unique)
+ if (indexInfo->ii_Unique && !indexInfo->ii_Concurrent)
{
BTSpool *btspool2 = (BTSpool *) palloc0(sizeof(BTSpool));
SortCoordinate coordinate2 = NULL;
@@ -468,7 +476,7 @@ _bt_spools_heapscan(Relation heap, Relation index, BTBuildState *buildstate,
* full, so we give it only work_mem
*/
buildstate->spool2->sortstate =
- tuplesort_begin_index_btree(heap, index, false, false, work_mem,
+ tuplesort_begin_index_btree(heap, index, false, false, false, work_mem,
coordinate2, TUPLESORT_NONE);
}
@@ -1147,13 +1155,116 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
SortSupport sortKeys;
int64 tuples_done = 0;
bool deduplicate;
+ bool fail_on_alive_duplicate;
wstate->bulkstate = smgr_bulk_start_rel(wstate->index, MAIN_FORKNUM);
deduplicate = wstate->inskey->allequalimage && !btspool->isunique &&
BTGetDeduplicateItems(wstate->index);
+ /*
+ * The unique_dead_ignored does not guarantee absence of multiple alive
+ * tuples with same values exists in the spool. Such thing may happen if
+ * alive tuples are located between a few dead tuples, like this: addda.
+ */
+ fail_on_alive_duplicate = btspool->unique_dead_ignored;
- if (merge)
+ if (fail_on_alive_duplicate)
+ {
+ bool seen_alive = false,
+ prev_tested = false;
+ IndexTuple prev = NULL;
+ TupleTableSlot *slot = MakeSingleTupleTableSlot(RelationGetDescr(wstate->heap),
+ &TTSOpsBufferHeapTuple);
+ IndexFetchTableData *fetch = table_index_fetch_begin(wstate->heap);
+
+ Assert(btspool->isunique);
+ Assert(!btspool2);
+
+ while ((itup = tuplesort_getindextuple(btspool->sortstate, true)) != NULL)
+ {
+ bool tuples_equal = false;
+
+ /* When we see first tuple, create first index page */
+ if (state == NULL)
+ state = _bt_pagestate(wstate, 0);
+
+ if (prev != NULL) /* if is not the first tuple */
+ {
+ bool has_nulls = false,
+ call_again, /* just to pass something */
+ ignored, /* just to pass something */
+ now_alive;
+ ItemPointerData tid;
+
+ /* if this tuples equal to previouse one? */
+ if (wstate->inskey->allequalimage)
+ tuples_equal = _bt_keep_natts_fast(wstate->index, prev, itup, &has_nulls) > keysz;
+ else
+ tuples_equal = _bt_keep_natts(wstate->index, prev, itup,wstate->inskey, &has_nulls) > keysz;
+
+ /* handle null values correctly */
+ if (has_nulls && !btspool->nulls_not_distinct)
+ tuples_equal = false;
+
+ if (tuples_equal)
+ {
+ /* check previous tuple if not yet */
+ if (!prev_tested)
+ {
+ call_again = false;
+ tid = prev->t_tid;
+ seen_alive = table_index_fetch_tuple(fetch, &tid, SnapshotSelf, slot, &call_again, &ignored);
+ prev_tested = true;
+ }
+
+ call_again = false;
+ tid = itup->t_tid;
+ now_alive = table_index_fetch_tuple(fetch, &tid, SnapshotSelf, slot, &call_again, &ignored);
+
+ /* are multiple alive tuples detected in equal group? */
+ if (seen_alive && now_alive)
+ {
+ char *key_desc;
+ TupleDesc tupDes = RelationGetDescr(wstate->index);
+ bool isnull[INDEX_MAX_KEYS];
+ Datum values[INDEX_MAX_KEYS];
+
+ index_deform_tuple(itup, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(wstate->index, values, isnull);
+
+ /* keep this message in sync with the same in comparetup_index_btree_tiebreak */
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(wstate->index)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(wstate->heap,
+ RelationGetRelationName(wstate->index))));
+ }
+ seen_alive |= now_alive;
+ }
+ }
+
+ if (!tuples_equal)
+ {
+ seen_alive = false;
+ prev_tested = false;
+ }
+
+ _bt_buildadd(wstate, state, itup, 0);
+ if (prev) pfree(prev);
+ prev = CopyIndexTuple(itup);
+
+ /* Report progress */
+ pgstat_progress_update_param(PROGRESS_CREATEIDX_TUPLES_DONE,
+ ++tuples_done);
+ }
+ ExecDropSingleTupleTableSlot(slot);
+ table_index_fetch_end(fetch);
+ }
+ else if (merge)
{
/*
* Another BTSpool for dead tuples exists. Now we have to merge
@@ -1314,7 +1425,7 @@ _bt_load(BTWriteState *wstate, BTSpool *btspool, BTSpool *btspool2)
InvalidOffsetNumber);
}
else if (_bt_keep_natts_fast(wstate->index, dstate->base,
- itup) > keysz &&
+ itup, NULL) > keysz &&
_bt_dedup_save_htid(dstate, itup))
{
/*
@@ -1411,7 +1522,6 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
BufferUsage *bufferusage;
bool leaderparticipates = true;
bool need_pop_active_snapshot = true;
- bool reset_snapshot;
bool wait_for_snapshot_attach;
int querylen;
@@ -1430,21 +1540,12 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
scantuplesortstates = leaderparticipates ? request + 1 : request;
- /*
- * For concurrent non-unique index builds, we can periodically reset snapshots
- * to allow the xmin horizon to advance. This is safe since these builds don't
- * require a consistent view across the entire scan. Unique indexes still need
- * a stable snapshot to properly enforce uniqueness constraints.
- */
- reset_snapshot = isconcurrent && !btspool->isunique;
-
/*
* Prepare for scan of the base relation. In a normal index build, we use
* SnapshotAny because we must retrieve all tuples and do our own time
* qual checks (because we have to index RECENTLY_DEAD tuples). In a
* concurrent build, we take a regular MVCC snapshot and index whatever's
- * live according to that, while that snapshot may be reset periodically in
- * case of non-unique index.
+ * live according to that, while that snapshot may be reset periodically.
*/
if (!isconcurrent)
{
@@ -1452,16 +1553,16 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
snapshot = SnapshotAny;
need_pop_active_snapshot = false;
}
- else if (reset_snapshot)
+ else
{
+ /*
+ * For concurrent index builds, we can periodically reset snapshots to allow
+ * the xmin horizon to advance. This is safe since these builds don't
+ * require a consistent view across the entire scan.
+ */
snapshot = InvalidSnapshot;
PushActiveSnapshot(GetTransactionSnapshot());
}
- else
- {
- snapshot = RegisterSnapshot(GetTransactionSnapshot());
- PushActiveSnapshot(snapshot);
- }
/*
* Estimate size for our own PARALLEL_KEY_BTREE_SHARED workspace, and
@@ -1531,6 +1632,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
btshared->indexrelid = RelationGetRelid(btspool->index);
btshared->isunique = btspool->isunique;
btshared->nulls_not_distinct = btspool->nulls_not_distinct;
+ btshared->unique_dead_ignored = btspool->unique_dead_ignored;
btshared->isconcurrent = isconcurrent;
btshared->scantuplesortstates = scantuplesortstates;
btshared->queryid = pgstat_get_my_query_id();
@@ -1545,7 +1647,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
table_parallelscan_initialize(btspool->heap,
ParallelTableScanFromBTShared(btshared),
snapshot,
- reset_snapshot);
+ isconcurrent);
/*
* Store shared tuplesort-private state, for which we reserved space.
@@ -1626,7 +1728,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
* In case when leader going to reset own active snapshot as well - we need to
* wait until all workers imported initial snapshot.
*/
- wait_for_snapshot_attach = reset_snapshot && leaderparticipates;
+ wait_for_snapshot_attach = isconcurrent && leaderparticipates;
if (wait_for_snapshot_attach)
WaitForParallelWorkersToAttach(pcxt, true);
@@ -1742,6 +1844,7 @@ _bt_leader_participate_as_worker(BTBuildState *buildstate)
leaderworker->index = buildstate->spool->index;
leaderworker->isunique = buildstate->spool->isunique;
leaderworker->nulls_not_distinct = buildstate->spool->nulls_not_distinct;
+ leaderworker->unique_dead_ignored = buildstate->spool->unique_dead_ignored;
/* Initialize second spool, if required */
if (!btleader->btshared->isunique)
@@ -1845,11 +1948,12 @@ _bt_parallel_build_main(dsm_segment *seg, shm_toc *toc)
btspool->index = indexRel;
btspool->isunique = btshared->isunique;
btspool->nulls_not_distinct = btshared->nulls_not_distinct;
+ btspool->unique_dead_ignored = btshared->unique_dead_ignored;
/* Look up shared state private to tuplesort.c */
sharedsort = shm_toc_lookup(toc, PARALLEL_KEY_TUPLESORT, false);
tuplesort_attach_shared(sharedsort, seg);
- if (!btshared->isunique)
+ if (!btshared->isunique || btshared->isconcurrent)
{
btspool2 = NULL;
sharedsort2 = NULL;
@@ -1928,6 +2032,7 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
btspool->index,
btspool->isunique,
btspool->nulls_not_distinct,
+ btspool->unique_dead_ignored,
sortmem, coordinate,
TUPLESORT_NONE);
@@ -1950,14 +2055,12 @@ _bt_parallel_scan_and_sort(BTSpool *btspool, BTSpool *btspool2,
coordinate2->nParticipants = -1;
coordinate2->sharedsort = sharedsort2;
btspool2->sortstate =
- tuplesort_begin_index_btree(btspool->heap, btspool->index, false, false,
+ tuplesort_begin_index_btree(btspool->heap, btspool->index, false, false, false,
Min(sortmem, work_mem), coordinate2,
false);
}
/* Fill in buildstate for _bt_build_callback() */
- buildstate.isunique = btshared->isunique;
- buildstate.nulls_not_distinct = btshared->nulls_not_distinct;
buildstate.havedead = false;
buildstate.heap = btspool->heap;
buildstate.spool = btspool;
diff --git a/src/backend/access/nbtree/nbtsplitloc.c b/src/backend/access/nbtree/nbtsplitloc.c
index 1f40d40263e..e2ed4537026 100644
--- a/src/backend/access/nbtree/nbtsplitloc.c
+++ b/src/backend/access/nbtree/nbtsplitloc.c
@@ -687,7 +687,7 @@ _bt_afternewitemoff(FindSplitData *state, OffsetNumber maxoff,
{
itemid = PageGetItemId(state->origpage, maxoff);
tup = (IndexTuple) PageGetItem(state->origpage, itemid);
- keepnatts = _bt_keep_natts_fast(state->rel, tup, state->newitem);
+ keepnatts = _bt_keep_natts_fast(state->rel, tup, state->newitem, NULL);
if (keepnatts > 1 && keepnatts <= nkeyatts)
{
@@ -718,7 +718,7 @@ _bt_afternewitemoff(FindSplitData *state, OffsetNumber maxoff,
!_bt_adjacenthtid(&tup->t_tid, &state->newitem->t_tid))
return false;
/* Check same conditions as rightmost item case, too */
- keepnatts = _bt_keep_natts_fast(state->rel, tup, state->newitem);
+ keepnatts = _bt_keep_natts_fast(state->rel, tup, state->newitem, NULL);
if (keepnatts > 1 && keepnatts <= nkeyatts)
{
@@ -967,7 +967,7 @@ _bt_strategy(FindSplitData *state, SplitPoint *leftpage,
* avoid appending a heap TID in new high key, we're done. Finish split
* with default strategy and initial split interval.
*/
- perfectpenalty = _bt_keep_natts_fast(state->rel, leftmost, rightmost);
+ perfectpenalty = _bt_keep_natts_fast(state->rel, leftmost, rightmost, NULL);
if (perfectpenalty <= indnkeyatts)
return perfectpenalty;
@@ -988,7 +988,7 @@ _bt_strategy(FindSplitData *state, SplitPoint *leftpage,
* If page is entirely full of duplicates, a single value strategy split
* will be performed.
*/
- perfectpenalty = _bt_keep_natts_fast(state->rel, leftmost, rightmost);
+ perfectpenalty = _bt_keep_natts_fast(state->rel, leftmost, rightmost, NULL);
if (perfectpenalty <= indnkeyatts)
{
*strategy = SPLIT_MANY_DUPLICATES;
@@ -1027,7 +1027,7 @@ _bt_strategy(FindSplitData *state, SplitPoint *leftpage,
itemid = PageGetItemId(state->origpage, P_HIKEY);
hikey = (IndexTuple) PageGetItem(state->origpage, itemid);
perfectpenalty = _bt_keep_natts_fast(state->rel, hikey,
- state->newitem);
+ state->newitem, NULL);
if (perfectpenalty <= indnkeyatts)
*strategy = SPLIT_SINGLE_VALUE;
else
@@ -1149,7 +1149,7 @@ _bt_split_penalty(FindSplitData *state, SplitPoint *split)
lastleft = _bt_split_lastleft(state, split);
firstright = _bt_split_firstright(state, split);
- return _bt_keep_natts_fast(state->rel, lastleft, firstright);
+ return _bt_keep_natts_fast(state->rel, lastleft, firstright, NULL);
}
/*
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index 50cbf06cb45..3d6dda4ace8 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -100,8 +100,6 @@ static bool _bt_check_rowcompare(ScanKey skey,
ScanDirection dir, bool *continuescan);
static void _bt_checkkeys_look_ahead(IndexScanDesc scan, BTReadPageState *pstate,
int tupnatts, TupleDesc tupdesc);
-static int _bt_keep_natts(Relation rel, IndexTuple lastleft,
- IndexTuple firstright, BTScanInsert itup_key);
/*
@@ -4672,7 +4670,7 @@ _bt_truncate(Relation rel, IndexTuple lastleft, IndexTuple firstright,
Assert(!BTreeTupleIsPivot(lastleft) && !BTreeTupleIsPivot(firstright));
/* Determine how many attributes must be kept in truncated tuple */
- keepnatts = _bt_keep_natts(rel, lastleft, firstright, itup_key);
+ keepnatts = _bt_keep_natts(rel, lastleft, firstright, itup_key, NULL);
#ifdef DEBUG_NO_TRUNCATE
/* Force truncation to be ineffective for testing purposes */
@@ -4790,17 +4788,24 @@ _bt_truncate(Relation rel, IndexTuple lastleft, IndexTuple firstright,
/*
* _bt_keep_natts - how many key attributes to keep when truncating.
*
+ * This is exported to be used as comparison function during concurrent
+ * unique index build in case _bt_keep_natts_fast is not suitable because
+ * collation is not "allequalimage"/deduplication-safe.
+ *
* Caller provides two tuples that enclose a split point. Caller's insertion
* scankey is used to compare the tuples; the scankey's argument values are
* not considered here.
*
+ * hasnulls value set to true in case of any null column in any tuple.
+ *
* This can return a number of attributes that is one greater than the
* number of key attributes for the index relation. This indicates that the
* caller must use a heap TID as a unique-ifier in new pivot tuple.
*/
-static int
+int
_bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
- BTScanInsert itup_key)
+ BTScanInsert itup_key,
+ bool *hasnulls)
{
int nkeyatts = IndexRelationGetNumberOfKeyAttributes(rel);
TupleDesc itupdesc = RelationGetDescr(rel);
@@ -4826,6 +4831,8 @@ _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
datum1 = index_getattr(lastleft, attnum, itupdesc, &isNull1);
datum2 = index_getattr(firstright, attnum, itupdesc, &isNull2);
+ if (hasnulls)
+ (*hasnulls) |= (isNull1 || isNull2);
if (isNull1 != isNull2)
break;
@@ -4845,7 +4852,7 @@ _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
* expected in an allequalimage index.
*/
Assert(!itup_key->allequalimage ||
- keepnatts == _bt_keep_natts_fast(rel, lastleft, firstright));
+ keepnatts == _bt_keep_natts_fast(rel, lastleft, firstright, NULL));
return keepnatts;
}
@@ -4856,7 +4863,8 @@ _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
* This is exported so that a candidate split point can have its effect on
* suffix truncation inexpensively evaluated ahead of time when finding a
* split location. A naive bitwise approach to datum comparisons is used to
- * save cycles.
+ * save cycles. Also, it may be used as comparison function during concurrent
+ * build of unique index.
*
* The approach taken here usually provides the same answer as _bt_keep_natts
* will (for the same pair of tuples from a heapkeyspace index), since the
@@ -4865,6 +4873,8 @@ _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
* "equal image" columns, routine is guaranteed to give the same result as
* _bt_keep_natts would.
*
+ * hasnulls value set to true in case of any null column in any tuple.
+ *
* Callers can rely on the fact that attributes considered equal here are
* definitely also equal according to _bt_keep_natts, even when the index uses
* an opclass or collation that is not "allequalimage"/deduplication-safe.
@@ -4873,7 +4883,8 @@ _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
* more balanced split point.
*/
int
-_bt_keep_natts_fast(Relation rel, IndexTuple lastleft, IndexTuple firstright)
+_bt_keep_natts_fast(Relation rel, IndexTuple lastleft, IndexTuple firstright,
+ bool *hasnulls)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int keysz = IndexRelationGetNumberOfKeyAttributes(rel);
@@ -4890,6 +4901,8 @@ _bt_keep_natts_fast(Relation rel, IndexTuple lastleft, IndexTuple firstright)
datum1 = index_getattr(lastleft, attnum, itupdesc, &isNull1);
datum2 = index_getattr(firstright, attnum, itupdesc, &isNull2);
+ if (hasnulls)
+ *hasnulls |= (isNull1 | isNull2);
att = TupleDescAttr(itupdesc, attnum - 1);
if (isNull1 != isNull2)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index e0ada5ce159..f6a1a2f3f90 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3292,9 +3292,9 @@ IndexCheckExclusion(Relation heapRelation,
* if we used HeapTupleSatisfiesVacuum). This leaves us with an index that
* does not contain any tuples added to the table while we built the index.
*
- * Furthermore, in case of non-unique index we set SO_RESET_SNAPSHOT for the
- * scan, which causes new snapshot to be set as active every so often. The reason
- * for that is to propagate the xmin horizon forward.
+ * Furthermore, we set SO_RESET_SNAPSHOT for the scan, which causes new
+ * snapshot to be set as active every so often. The reason for that is to
+ * propagate the xmin horizon forward.
*
* Next, we mark the index "indisready" (but still not "indisvalid") and
* commit the second transaction and start a third. Again we wait for all
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index e07ba4ea4b1..aa4fcaac9a0 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -123,6 +123,7 @@ typedef struct
bool enforceUnique; /* complain if we find duplicate tuples */
bool uniqueNullsNotDistinct; /* unique constraint null treatment */
+ bool uniqueDeadIgnored; /* ignore dead tuples in unique check */
} TuplesortIndexBTreeArg;
/*
@@ -349,6 +350,7 @@ tuplesort_begin_index_btree(Relation heapRel,
Relation indexRel,
bool enforceUnique,
bool uniqueNullsNotDistinct,
+ bool uniqueDeadIgnored,
int workMem,
SortCoordinate coordinate,
int sortopt)
@@ -391,6 +393,7 @@ tuplesort_begin_index_btree(Relation heapRel,
arg->index.indexRel = indexRel;
arg->enforceUnique = enforceUnique;
arg->uniqueNullsNotDistinct = uniqueNullsNotDistinct;
+ arg->uniqueDeadIgnored = uniqueDeadIgnored;
indexScanKey = _bt_mkscankey(indexRel, NULL);
@@ -1520,6 +1523,7 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
Datum values[INDEX_MAX_KEYS];
bool isnull[INDEX_MAX_KEYS];
char *key_desc;
+ bool uniqueCheckFail = true;
/*
* Some rather brain-dead implementations of qsort (such as the one in
@@ -1529,18 +1533,57 @@ comparetup_index_btree_tiebreak(const SortTuple *a, const SortTuple *b,
*/
Assert(tuple1 != tuple2);
- index_deform_tuple(tuple1, tupDes, values, isnull);
-
- key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
-
- ereport(ERROR,
- (errcode(ERRCODE_UNIQUE_VIOLATION),
- errmsg("could not create unique index \"%s\"",
- RelationGetRelationName(arg->index.indexRel)),
- key_desc ? errdetail("Key %s is duplicated.", key_desc) :
- errdetail("Duplicate keys exist."),
- errtableconstraint(arg->index.heapRel,
- RelationGetRelationName(arg->index.indexRel))));
+ /* This is fail-fast check, see _bt_load for details. */
+ if (arg->uniqueDeadIgnored)
+ {
+ bool any_tuple_dead,
+ call_again = false,
+ ignored;
+
+ TupleTableSlot *slot = MakeSingleTupleTableSlot(RelationGetDescr(arg->index.heapRel),
+ &TTSOpsBufferHeapTuple);
+ ItemPointerData tid = tuple1->t_tid;
+
+ IndexFetchTableData *fetch = table_index_fetch_begin(arg->index.heapRel);
+ any_tuple_dead = !table_index_fetch_tuple(fetch, &tid, SnapshotSelf, slot, &call_again, &ignored);
+
+ if (!any_tuple_dead)
+ {
+ call_again = false;
+ tid = tuple2->t_tid;
+ any_tuple_dead = !table_index_fetch_tuple(fetch, &tuple2->t_tid, SnapshotSelf, slot, &call_again,
+ &ignored);
+ }
+
+ if (any_tuple_dead)
+ {
+ elog(DEBUG5, "skipping duplicate values because some of them are dead: (%u,%u) vs (%u,%u)",
+ ItemPointerGetBlockNumber(&tuple1->t_tid),
+ ItemPointerGetOffsetNumber(&tuple1->t_tid),
+ ItemPointerGetBlockNumber(&tuple2->t_tid),
+ ItemPointerGetOffsetNumber(&tuple2->t_tid));
+
+ uniqueCheckFail = false;
+ }
+ ExecDropSingleTupleTableSlot(slot);
+ table_index_fetch_end(fetch);
+ }
+ if (uniqueCheckFail)
+ {
+ index_deform_tuple(tuple1, tupDes, values, isnull);
+
+ key_desc = BuildIndexValueDescription(arg->index.indexRel, values, isnull);
+
+ /* keep this error message in sync with the same in _bt_load */
+ ereport(ERROR,
+ (errcode(ERRCODE_UNIQUE_VIOLATION),
+ errmsg("could not create unique index \"%s\"",
+ RelationGetRelationName(arg->index.indexRel)),
+ key_desc ? errdetail("Key %s is duplicated.", key_desc) :
+ errdetail("Duplicate keys exist."),
+ errtableconstraint(arg->index.heapRel,
+ RelationGetRelationName(arg->index.indexRel))));
+ }
}
/*
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index 123fba624db..4200d2bd20e 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1297,8 +1297,10 @@ extern bool btproperty(Oid index_oid, int attno,
extern char *btbuildphasename(int64 phasenum);
extern IndexTuple _bt_truncate(Relation rel, IndexTuple lastleft,
IndexTuple firstright, BTScanInsert itup_key);
+extern int _bt_keep_natts(Relation rel, IndexTuple lastleft, IndexTuple firstright,
+ BTScanInsert itup_key, bool *hasnulls);
extern int _bt_keep_natts_fast(Relation rel, IndexTuple lastleft,
- IndexTuple firstright);
+ IndexTuple firstright, bool *hasnulls);
extern bool _bt_check_natts(Relation rel, bool heapkeyspace, Page page,
OffsetNumber offnum);
extern void _bt_check_third_page(Relation rel, Relation heap,
diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h
index 9ee5ea15fd4..ec3769585c3 100644
--- a/src/include/access/tableam.h
+++ b/src/include/access/tableam.h
@@ -1803,9 +1803,8 @@ table_scan_analyze_next_tuple(TableScanDesc scan, TransactionId OldestXmin,
* This only really makes sense for heap AM, it might need to be generalized
* for other AMs later.
*
- * In case of non-unique concurrent index build SO_RESET_SNAPSHOT is applied
- * for the scan. That leads for changing snapshots on the fly to allow xmin
- * horizon propagate.
+ * In case of concurrent index build SO_RESET_SNAPSHOT is applied for the scan.
+ * That leads for changing snapshots on the fly to allow xmin horizon propagate.
*/
static inline double
table_index_build_scan(Relation table_rel,
diff --git a/src/include/utils/tuplesort.h b/src/include/utils/tuplesort.h
index cde83f62015..ae5f4d28fdc 100644
--- a/src/include/utils/tuplesort.h
+++ b/src/include/utils/tuplesort.h
@@ -428,6 +428,7 @@ extern Tuplesortstate *tuplesort_begin_index_btree(Relation heapRel,
Relation indexRel,
bool enforceUnique,
bool uniqueNullsNotDistinct,
+ bool uniqueDeadIgnored,
int workMem, SortCoordinate coordinate,
int sortopt);
extern Tuplesortstate *tuplesort_begin_index_hash(Relation heapRel,
diff --git a/src/test/modules/injection_points/expected/cic_reset_snapshots.out b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
index 49ef68d9071..c8e4683ad6d 100644
--- a/src/test/modules/injection_points/expected/cic_reset_snapshots.out
+++ b/src/test/modules/injection_points/expected/cic_reset_snapshots.out
@@ -41,7 +41,11 @@ END; $$;
----------------
ALTER TABLE cic_reset_snap.tbl SET (parallel_workers=0);
CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
+NOTICE: notice triggered for injection point heap_reset_scan_snapshot_effective
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
NOTICE: notice triggered for injection point table_beginscan_strat_reset_snapshots
@@ -86,7 +90,9 @@ SELECT injection_points_detach('heap_reset_scan_snapshot_effective');
(1 row)
CREATE UNIQUE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
+NOTICE: notice triggered for injection point table_parallelscan_initialize
REINDEX INDEX CONCURRENTLY cic_reset_snap.idx;
+NOTICE: notice triggered for injection point table_parallelscan_initialize
DROP INDEX CONCURRENTLY cic_reset_snap.idx;
CREATE INDEX CONCURRENTLY idx ON cic_reset_snap.tbl(i);
NOTICE: notice triggered for injection point table_parallelscan_initialize
--
2.43.0
view thread (33+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
In-Reply-To: <CANtu0oi+nbipJUsMZcoUfodCyuTN_DAXD22UstjMTYWG=tJ4jw@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox