public inbox for [email protected]  
help / color / mirror / Atom feed
Fix REPACK with WITHOUT OVERLAPS replica identity indexes
10+ messages / 4 participants
[nested] [flat]

* Fix REPACK with WITHOUT OVERLAPS replica identity indexes
@ 2026-05-08 04:21 Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Chao Li @ 2026-05-08 04:21 UTC (permalink / raw)
  To: pgsql-hackers

Hi,

While testing UPDATE FOR PORTION OF, I started wondering whether REPACK supports temporal tables. In theory, it should, because temporal WITHOUT OVERLAPS indexes can be used as replica identity indexes. So I created a test script, repack_temporal.spec, which is included in the attached patch, and it failed.

I found that REPACK hard-codes BTEqualStrategyNumber when calling get_opfamily_member(). That seems wrong, because build_replindex_scan_key() uses IndexAmTranslateCompareType() to get the equality strategy for COMPARE_EQ.

After fixing the hard-coded BTEqualStrategyNumber, the temporal test passed. Then I added another test for multirange, repack_temporal_multirange.spec, which also failed. The reason is that find_target_tuple() uses the identity index to find the first tuple and returns it directly, but a lossy index scan may return false positives and require recheck.

Please see the attached patch for the fix details and test scripts.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/



Attachments:

  [application/octet-stream] v1-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch (16.3K, 2-v1-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch)
  download | inline diff:
From 2a5d930499ba65964bdf095218373f85093d5dda Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Fri, 8 May 2026 11:40:18 +0800
Subject: [PATCH v1] Fix REPACK with WITHOUT OVERLAPS replica identity indexes

REPACK replay builds scan keys for the replica identity index, but it
hard-coded BTEqualStrategyNumber when looking up the equality operator.
That is not correct for non-btree identity indexes, such as the GiST
indexes created for WITHOUT OVERLAPS primary keys.  In addition,
find_target_tuple() accepted the first tuple returned by the identity
index scan, which is unsafe for lossy index scans because the index AM may
return false positives with xs_recheck set.

Fix this by using IndexAmTranslateCompareType() to translate COMPARE_EQ
to the equality strategy number for the index AM, and by continuing the
scan when recheck is required until a candidate tuple matches the locator
tuple on all replica identity key columns.

The recheck uses the same equality operator functions as the identity
index scan keys, preserving ScanKey argument ordering.

Author: Chao Li <[email protected]>
Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/commands/repack.c                 |  63 ++++++++++-
 src/test/modules/injection_points/Makefile    |   2 +
 .../expected/repack_temporal.out              |  68 ++++++++++++
 .../expected/repack_temporal_multirange.out   |  74 +++++++++++++
 src/test/modules/injection_points/meson.build |   2 +
 .../specs/repack_temporal.spec                |  90 ++++++++++++++++
 .../specs/repack_temporal_multirange.spec     | 102 ++++++++++++++++++
 7 files changed, 397 insertions(+), 4 deletions(-)
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal.out
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal_multirange.out
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal.spec
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal_multirange.spec

diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index 9a199dd9bfb..359109fe470 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -182,6 +182,9 @@ static void adjust_toast_pointers(Relation relation, TupleTableSlot *dest,
 static bool find_target_tuple(Relation rel, ChangeContext *chgcxt,
 							  TupleTableSlot *locator,
 							  TupleTableSlot *retrieved);
+static bool identity_key_equal(ChangeContext *chgcxt,
+							   TupleTableSlot *locator,
+							   TupleTableSlot *candidate);
 static void process_concurrent_changes(XLogRecPtr end_of_wal,
 									   ChangeContext *chgcxt,
 									   bool done);
@@ -2807,7 +2810,7 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 {
 	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
 	IndexScanDesc scan;
-	bool		retval;
+	bool		retval = false;
 
 	/*
 	 * Scan key is passed by caller, so it does not have to be constructed
@@ -2829,12 +2832,57 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 	scan = index_beginscan(rel, chgcxt->cc_ident_index, GetActiveSnapshot(),
 						   NULL, chgcxt->cc_ident_key_nentries, 0, 0);
 	index_rescan(scan, chgcxt->cc_ident_key, chgcxt->cc_ident_key_nentries, NULL, 0);
-	retval = index_getnext_slot(scan, ForwardScanDirection, retrieved);
+	while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
+	{
+		if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
+			continue;
+
+		retval = true;
+		break;
+	}
 	index_endscan(scan);
 
 	return retval;
 }
 
+/*
+ * Check whether the candidate tuple matches the locator tuple on all replica
+ * identity key columns, using the same equality operators as the identity
+ * index scan.  This is needed to filter lossy index matches, such as GiST
+ * multirange scans.
+ */
+static bool
+identity_key_equal(ChangeContext *chgcxt, TupleTableSlot *locator,
+				   TupleTableSlot *candidate)
+{
+	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
+
+	slot_getallattrs(locator);
+	slot_getallattrs(candidate);
+
+	for (int i = 0; i < chgcxt->cc_ident_key_nentries; i++)
+	{
+		ScanKey		entry = &chgcxt->cc_ident_key[i];
+		AttrNumber	attno = idx->indkey.values[i];
+
+		Assert(attno > 0);
+
+		if (locator->tts_isnull[attno - 1] != candidate->tts_isnull[attno - 1])
+			return false;
+
+		if (locator->tts_isnull[attno - 1])
+			continue;
+
+		if (!DatumGetBool(FunctionCall2Coll(&entry->sk_func,
+											entry->sk_collation,
+											candidate->tts_values[attno - 1],
+											entry->sk_argument)))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Decode and apply concurrent changes, up to (and including) the record whose
  * LSN is 'end_of_wal'.
@@ -2944,13 +2992,20 @@ initialize_change_context(ChangeContext *chgcxt,
 						opcintype,
 						opno,
 						opcode;
+			StrategyNumber eq_strategy;
 
 			entry = &chgcxt->cc_ident_key[i];
 
 			opfamily = chgcxt->cc_ident_index->rd_opfamily[i];
 			opcintype = chgcxt->cc_ident_index->rd_opcintype[i];
+			eq_strategy = IndexAmTranslateCompareType(COMPARE_EQ,
+													  chgcxt->cc_ident_index->rd_rel->relam,
+													  opfamily, false);
+			if (eq_strategy == InvalidStrategy)
+				elog(ERROR, "failed to find equality strategy for index operator family %u for type %u",
+					 opfamily, opcintype);
 			opno = get_opfamily_member(opfamily, opcintype, opcintype,
-									   BTEqualStrategyNumber);
+									   eq_strategy);
 			if (!OidIsValid(opno))
 				elog(ERROR, "failed to find = operator for type %u", opcintype);
 			opcode = get_opcode(opno);
@@ -2960,7 +3015,7 @@ initialize_change_context(ChangeContext *chgcxt,
 			/* Initialize everything but argument. */
 			ScanKeyInit(entry,
 						i + 1,
-						BTEqualStrategyNumber, opcode,
+						eq_strategy, opcode,
 						(Datum) 0);
 			entry->sk_collation = chgcxt->cc_ident_index->rd_indcollation[i];
 		}
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index f057d143d1a..c01d2fb095c 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -15,6 +15,8 @@ REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
 ISOLATION = basic \
 	    inplace \
 	    repack \
+	    repack_temporal \
+	    repack_temporal_multirange \
 	    repack_toast \
 	    syscache-update-pruned \
 	    heap_lock_update
diff --git a/src/test/modules/injection_points/expected/repack_temporal.out b/src/test/modules/injection_points/expected/repack_temporal.out
new file mode 100644
index 00000000000..e6b06c00cec
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal.out
@@ -0,0 +1,68 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock check_after_repack
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step check_after_repack: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/expected/repack_temporal_multirange.out b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
new file mode 100644
index 00000000000..3f5ff8cfb0f
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
@@ -0,0 +1,74 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock final_check
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step final_check: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index fb1418e2caa..59dba1cb023 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -46,6 +46,8 @@ tests += {
       'basic',
       'inplace',
       'repack',
+      'repack_temporal',
+      'repack_temporal_multirange',
       'repack_toast',
       'syscache-update-pruned',
       'heap_lock_update',
diff --git a/src/test/modules/injection_points/specs/repack_temporal.spec b/src/test/modules/injection_points/specs/repack_temporal.spec
new file mode 100644
index 00000000000..9629d502ec1
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal.spec
@@ -0,0 +1,90 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index.
+#
+# The table's replica identity is a GiST index created by a temporal primary
+# key.  A concurrent UPDATE changes a non-key column of one row, while another
+# row overlaps it on all indexed columns.  Replay must still find the exact
+# target row.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal (
+		id int4range,
+		valid_at daterange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal REPLICA IDENTITY USING INDEX repack_temporal_pkey;
+
+	INSERT INTO repack_temporal(id, valid_at, label)
+	VALUES
+		('[1,10)', '[2000-01-01,2000-02-01)', 'other'),
+		('[2,3)',  '[2000-01-10,2000-01-20)', 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+}
+step check_after_repack
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	check_after_repack
diff --git a/src/test/modules/injection_points/specs/repack_temporal_multirange.spec b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
new file mode 100644
index 00000000000..dfff1f2234d
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
@@ -0,0 +1,102 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index with lossy
+# multirange equality.
+#
+# The leading identity column is an int4multirange. Two distinct rows have
+# different multirange values but the same union range, so GiST equality can
+# produce both as candidates and requires exact recheck.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal_multirange (
+		id int4multirange,
+		valid_at datemultirange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal_multirange
+		REPLICA IDENTITY USING INDEX repack_temporal_multirange_pkey;
+
+	-- (1,3)+(5+7) is the same uninon range of (1-7), but needs recheck
+	INSERT INTO repack_temporal_multirange(id, valid_at, label)
+	VALUES
+		(int4multirange(int4range(1, 3), int4range(5, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'other'),
+		(int4multirange(int4range(1, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal_multirange;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+}
+step final_check
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	final_check
-- 
2.50.1 (Apple Git-155)



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
@ 2026-05-08 17:47 ` Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Kirill Reshke @ 2026-05-08 17:47 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: pgsql-hackers

On Fri, 8 May 2026 at 09:22, Chao Li <[email protected]> wrote:
>
> Hi,
>
> While testing UPDATE FOR PORTION OF, I started wondering whether REPACK supports temporal tables. In theory, it should, because temporal WITHOUT OVERLAPS indexes can be used as replica identity indexes. So I created a test script, repack_temporal.spec, which is included in the attached patch, and it failed.
>
> I found that REPACK hard-codes BTEqualStrategyNumber when calling get_opfamily_member(). That seems wrong, because build_replindex_scan_key() uses IndexAmTranslateCompareType() to get the equality strategy for COMPARE_EQ.
>
> After fixing the hard-coded BTEqualStrategyNumber, the temporal test passed. Then I added another test for multirange, repack_temporal_multirange.spec, which also failed. The reason is that find_target_tuple() uses the identity index to find the first tuple and returns it directly, but a lossy index scan may return false positives and require recheck.
>
> Please see the attached patch for the fix details and test scripts.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>

your analysis appears correct to me

>  + while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
> + {
> + if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
> + continue;
> +
> + retval = true;
> + break;
> + }

Should we add CFI() ?


Also, do we really need isolation tests and inj points here? Doesn't a
simple regression test for REPACK execute the same code?


-- 
Best regards,
Kirill Reshke





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
@ 2026-05-09 08:36   ` Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Chao Li @ 2026-05-09 08:36 UTC (permalink / raw)
  To: Kirill Reshke <[email protected]>; +Cc: pgsql-hackers



> On May 9, 2026, at 01:47, Kirill Reshke <[email protected]> wrote:
> 
> On Fri, 8 May 2026 at 09:22, Chao Li <[email protected]> wrote:
>> 
>> Hi,
>> 
>> While testing UPDATE FOR PORTION OF, I started wondering whether REPACK supports temporal tables. In theory, it should, because temporal WITHOUT OVERLAPS indexes can be used as replica identity indexes. So I created a test script, repack_temporal.spec, which is included in the attached patch, and it failed.
>> 
>> I found that REPACK hard-codes BTEqualStrategyNumber when calling get_opfamily_member(). That seems wrong, because build_replindex_scan_key() uses IndexAmTranslateCompareType() to get the equality strategy for COMPARE_EQ.
>> 
>> After fixing the hard-coded BTEqualStrategyNumber, the temporal test passed. Then I added another test for multirange, repack_temporal_multirange.spec, which also failed. The reason is that find_target_tuple() uses the identity index to find the first tuple and returns it directly, but a lossy index scan may return false positives and require recheck.
>> 
>> Please see the attached patch for the fix details and test scripts.
>> 
>> Best regards,
>> --
>> Chao Li (Evan)
>> HighGo Software Co., Ltd.
>> https://www.highgo.com/
>> 
> 
> your analysis appears correct to me

Hi Kirill, thanks for your review.

> 
>> + while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
>> + {
>> + if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
>> + continue;
>> +
>> + retval = true;
>> + break;
>> + }
> 
> Should we add CFI() ?
> 

Oh, I didn’t consider that at all, because I thought there should not be a lot of candidate rows needing recheck. I am okay to add that.

> 
> Also, do we really need isolation tests and inj points here?

I think so. Without the injection point, the first phase of copying a new heap would be very fast, it would be hard to run an update in the second session. I think that’s way the repack code intentionally added an injection point before the first round of replay:

```
    /*
* During testing, wait for another backend to perform concurrent data
* changes which we will process below.
*/
INJECTION_POINT("repack-concurrently-before-lock", NULL);
```

> Doesn't a
> simple regression test for REPACK execute the same code?
> 

It seems we intentionally avoid to run repack test in the regress test, see [1] and [2].

PFA v2: added the CFI as Kirill suggested.

[1] https://postgr.es/m/[email protected]
[2] https://git.postgresql.org/cgit/postgresql.git/commit/?id=2fd787d0aac1cb00a42ebce92ebb1d7534035ee3

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Attachments:

  [application/octet-stream] v2-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch (16.4K, 2-v2-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch)
  download | inline diff:
From d480e5d6ee5bfd05ba59642fe62a25b1e7e66da6 Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Fri, 8 May 2026 11:40:18 +0800
Subject: [PATCH v2] Fix REPACK with WITHOUT OVERLAPS replica identity indexes

REPACK replay builds scan keys for the replica identity index, but it
hard-coded BTEqualStrategyNumber when looking up the equality operator.
That is not correct for non-btree identity indexes, such as the GiST
indexes created for WITHOUT OVERLAPS primary keys.  In addition,
find_target_tuple() accepted the first tuple returned by the identity
index scan, which is unsafe for lossy index scans because the index AM may
return false positives with xs_recheck set.

Fix this by using IndexAmTranslateCompareType() to translate COMPARE_EQ
to the equality strategy number for the index AM, and by continuing the
scan when recheck is required until a candidate tuple matches the locator
tuple on all replica identity key columns.

The recheck uses the same equality operator functions as the identity
index scan keys, preserving ScanKey argument ordering.

Author: Chao Li <[email protected]>
Reviewed-by: Kirill Reshke <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 src/backend/commands/repack.c                 |  66 +++++++++++-
 src/test/modules/injection_points/Makefile    |   2 +
 .../expected/repack_temporal.out              |  68 ++++++++++++
 .../expected/repack_temporal_multirange.out   |  74 +++++++++++++
 src/test/modules/injection_points/meson.build |   2 +
 .../specs/repack_temporal.spec                |  90 ++++++++++++++++
 .../specs/repack_temporal_multirange.spec     | 102 ++++++++++++++++++
 7 files changed, 400 insertions(+), 4 deletions(-)
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal.out
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal_multirange.out
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal.spec
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal_multirange.spec

diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index 9a199dd9bfb..42e5d9e7558 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -182,6 +182,9 @@ static void adjust_toast_pointers(Relation relation, TupleTableSlot *dest,
 static bool find_target_tuple(Relation rel, ChangeContext *chgcxt,
 							  TupleTableSlot *locator,
 							  TupleTableSlot *retrieved);
+static bool identity_key_equal(ChangeContext *chgcxt,
+							   TupleTableSlot *locator,
+							   TupleTableSlot *candidate);
 static void process_concurrent_changes(XLogRecPtr end_of_wal,
 									   ChangeContext *chgcxt,
 									   bool done);
@@ -2807,7 +2810,7 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 {
 	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
 	IndexScanDesc scan;
-	bool		retval;
+	bool		retval = false;
 
 	/*
 	 * Scan key is passed by caller, so it does not have to be constructed
@@ -2829,12 +2832,60 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 	scan = index_beginscan(rel, chgcxt->cc_ident_index, GetActiveSnapshot(),
 						   NULL, chgcxt->cc_ident_key_nentries, 0, 0);
 	index_rescan(scan, chgcxt->cc_ident_key, chgcxt->cc_ident_key_nentries, NULL, 0);
-	retval = index_getnext_slot(scan, ForwardScanDirection, retrieved);
+	while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
+	{
+		if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
+		{
+			CHECK_FOR_INTERRUPTS();
+			continue;
+		}
+
+		retval = true;
+		break;
+	}
 	index_endscan(scan);
 
 	return retval;
 }
 
+/*
+ * Check whether the candidate tuple matches the locator tuple on all replica
+ * identity key columns, using the same equality operators as the identity
+ * index scan.  This is needed to filter lossy index matches, such as GiST
+ * multirange scans.
+ */
+static bool
+identity_key_equal(ChangeContext *chgcxt, TupleTableSlot *locator,
+				   TupleTableSlot *candidate)
+{
+	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
+
+	slot_getallattrs(locator);
+	slot_getallattrs(candidate);
+
+	for (int i = 0; i < chgcxt->cc_ident_key_nentries; i++)
+	{
+		ScanKey		entry = &chgcxt->cc_ident_key[i];
+		AttrNumber	attno = idx->indkey.values[i];
+
+		Assert(attno > 0);
+
+		if (locator->tts_isnull[attno - 1] != candidate->tts_isnull[attno - 1])
+			return false;
+
+		if (locator->tts_isnull[attno - 1])
+			continue;
+
+		if (!DatumGetBool(FunctionCall2Coll(&entry->sk_func,
+											entry->sk_collation,
+											candidate->tts_values[attno - 1],
+											entry->sk_argument)))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Decode and apply concurrent changes, up to (and including) the record whose
  * LSN is 'end_of_wal'.
@@ -2944,13 +2995,20 @@ initialize_change_context(ChangeContext *chgcxt,
 						opcintype,
 						opno,
 						opcode;
+			StrategyNumber eq_strategy;
 
 			entry = &chgcxt->cc_ident_key[i];
 
 			opfamily = chgcxt->cc_ident_index->rd_opfamily[i];
 			opcintype = chgcxt->cc_ident_index->rd_opcintype[i];
+			eq_strategy = IndexAmTranslateCompareType(COMPARE_EQ,
+													  chgcxt->cc_ident_index->rd_rel->relam,
+													  opfamily, false);
+			if (eq_strategy == InvalidStrategy)
+				elog(ERROR, "failed to find equality strategy for index operator family %u for type %u",
+					 opfamily, opcintype);
 			opno = get_opfamily_member(opfamily, opcintype, opcintype,
-									   BTEqualStrategyNumber);
+									   eq_strategy);
 			if (!OidIsValid(opno))
 				elog(ERROR, "failed to find = operator for type %u", opcintype);
 			opcode = get_opcode(opno);
@@ -2960,7 +3018,7 @@ initialize_change_context(ChangeContext *chgcxt,
 			/* Initialize everything but argument. */
 			ScanKeyInit(entry,
 						i + 1,
-						BTEqualStrategyNumber, opcode,
+						eq_strategy, opcode,
 						(Datum) 0);
 			entry->sk_collation = chgcxt->cc_ident_index->rd_indcollation[i];
 		}
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index f057d143d1a..c01d2fb095c 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -15,6 +15,8 @@ REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
 ISOLATION = basic \
 	    inplace \
 	    repack \
+	    repack_temporal \
+	    repack_temporal_multirange \
 	    repack_toast \
 	    syscache-update-pruned \
 	    heap_lock_update
diff --git a/src/test/modules/injection_points/expected/repack_temporal.out b/src/test/modules/injection_points/expected/repack_temporal.out
new file mode 100644
index 00000000000..e6b06c00cec
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal.out
@@ -0,0 +1,68 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock check_after_repack
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step check_after_repack: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/expected/repack_temporal_multirange.out b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
new file mode 100644
index 00000000000..3f5ff8cfb0f
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
@@ -0,0 +1,74 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock final_check
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step final_check: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index fb1418e2caa..59dba1cb023 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -46,6 +46,8 @@ tests += {
       'basic',
       'inplace',
       'repack',
+      'repack_temporal',
+      'repack_temporal_multirange',
       'repack_toast',
       'syscache-update-pruned',
       'heap_lock_update',
diff --git a/src/test/modules/injection_points/specs/repack_temporal.spec b/src/test/modules/injection_points/specs/repack_temporal.spec
new file mode 100644
index 00000000000..9629d502ec1
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal.spec
@@ -0,0 +1,90 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index.
+#
+# The table's replica identity is a GiST index created by a temporal primary
+# key.  A concurrent UPDATE changes a non-key column of one row, while another
+# row overlaps it on all indexed columns.  Replay must still find the exact
+# target row.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal (
+		id int4range,
+		valid_at daterange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal REPLICA IDENTITY USING INDEX repack_temporal_pkey;
+
+	INSERT INTO repack_temporal(id, valid_at, label)
+	VALUES
+		('[1,10)', '[2000-01-01,2000-02-01)', 'other'),
+		('[2,3)',  '[2000-01-10,2000-01-20)', 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+}
+step check_after_repack
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	check_after_repack
diff --git a/src/test/modules/injection_points/specs/repack_temporal_multirange.spec b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
new file mode 100644
index 00000000000..dfff1f2234d
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
@@ -0,0 +1,102 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index with lossy
+# multirange equality.
+#
+# The leading identity column is an int4multirange. Two distinct rows have
+# different multirange values but the same union range, so GiST equality can
+# produce both as candidates and requires exact recheck.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal_multirange (
+		id int4multirange,
+		valid_at datemultirange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal_multirange
+		REPLICA IDENTITY USING INDEX repack_temporal_multirange_pkey;
+
+	-- (1,3)+(5+7) is the same uninon range of (1-7), but needs recheck
+	INSERT INTO repack_temporal_multirange(id, valid_at, label)
+	VALUES
+		(int4multirange(int4range(1, 3), int4range(5, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'other'),
+		(int4multirange(int4range(1, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal_multirange;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+}
+step final_check
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	final_check
-- 
2.50.1 (Apple Git-155)



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
@ 2026-05-09 22:38     ` Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Álvaro Herrera @ 2026-05-09 22:38 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Kirill Reshke <[email protected]>; pgsql-hackers

On 2026-May-09, Chao Li wrote:

> > On May 9, 2026, at 01:47, Kirill Reshke <[email protected]> wrote:
> > 
> > On Fri, 8 May 2026 at 09:22, Chao Li <[email protected]> wrote:

> >> While testing UPDATE FOR PORTION OF, I started wondering whether
> >> REPACK supports temporal tables. In theory, it should, because
> >> temporal WITHOUT OVERLAPS indexes can be used as replica identity
> >> indexes. So I created a test script, repack_temporal.spec, which is
> >> included in the attached patch, and it failed.

Nice find, thanks for testing.

> >> I found that REPACK hard-codes BTEqualStrategyNumber when calling
> >> get_opfamily_member(). That seems wrong, because
> >> build_replindex_scan_key() uses IndexAmTranslateCompareType() to
> >> get the equality strategy for COMPARE_EQ.

Makes sense.

I think it would be a good idea to make identity_key_equal() not deform
all attributes, but instead only up to the last one it needs for the key
comparisons.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
@ 2026-05-11 06:54       ` Chao Li <[email protected]>
  2026-05-11 16:21         ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Chao Li @ 2026-05-11 06:54 UTC (permalink / raw)
  To: Álvaro Herrera <[email protected]>; +Cc: Kirill Reshke <[email protected]>; pgsql-hackers



> On May 10, 2026, at 06:38, Álvaro Herrera <[email protected]> wrote:
> 
> On 2026-May-09, Chao Li wrote:
> 
>>> On May 9, 2026, at 01:47, Kirill Reshke <[email protected]> wrote:
>>> 
>>> On Fri, 8 May 2026 at 09:22, Chao Li <[email protected]> wrote:
> 
>>>> While testing UPDATE FOR PORTION OF, I started wondering whether
>>>> REPACK supports temporal tables. In theory, it should, because
>>>> temporal WITHOUT OVERLAPS indexes can be used as replica identity
>>>> indexes. So I created a test script, repack_temporal.spec, which is
>>>> included in the attached patch, and it failed.
> 
> Nice find, thanks for testing.
> 
>>>> I found that REPACK hard-codes BTEqualStrategyNumber when calling
>>>> get_opfamily_member(). That seems wrong, because
>>>> build_replindex_scan_key() uses IndexAmTranslateCompareType() to
>>>> get the equality strategy for COMPARE_EQ.
> 
> Makes sense.
> 
> I think it would be a good idea to make identity_key_equal() not deform
> all attributes, but instead only up to the last one it needs for the key
> comparisons.
> 

That’s true. Please see v3.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Attachments:

  [application/octet-stream] v3-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch (16.8K, 2-v3-0001-Fix-REPACK-with-WITHOUT-OVERLAPS-replica-identity.patch)
  download | inline diff:
From 816c41624a7f6eff03aa7d499e3d84d501fb8352 Mon Sep 17 00:00:00 2001
From: "Chao Li (Evan)" <[email protected]>
Date: Fri, 8 May 2026 11:40:18 +0800
Subject: [PATCH v3] Fix REPACK with WITHOUT OVERLAPS replica identity indexes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

REPACK replay builds scan keys for the replica identity index, but it
hard-coded BTEqualStrategyNumber when looking up the equality operator.
That is not correct for non-btree identity indexes, such as the GiST
indexes created for WITHOUT OVERLAPS primary keys.  In addition,
find_target_tuple() accepted the first tuple returned by the identity
index scan, which is unsafe for lossy index scans because the index AM may
return false positives with xs_recheck set.

Fix this by using IndexAmTranslateCompareType() to translate COMPARE_EQ
to the equality strategy number for the index AM, and by continuing the
scan when recheck is required until a candidate tuple matches the locator
tuple on all replica identity key columns.

The recheck uses the same equality operator functions as the identity
index scan keys, preserving ScanKey argument ordering.

Author: Chao Li <[email protected]>
Reviewed-by: Kirill Reshke <[email protected]>
Reviewed-by: Álvaro Herrera <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
 src/backend/commands/repack.c                 |  75 ++++++++++++-
 src/test/modules/injection_points/Makefile    |   2 +
 .../expected/repack_temporal.out              |  68 ++++++++++++
 .../expected/repack_temporal_multirange.out   |  74 +++++++++++++
 src/test/modules/injection_points/meson.build |   2 +
 .../specs/repack_temporal.spec                |  90 ++++++++++++++++
 .../specs/repack_temporal_multirange.spec     | 102 ++++++++++++++++++
 7 files changed, 409 insertions(+), 4 deletions(-)
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal.out
 create mode 100644 src/test/modules/injection_points/expected/repack_temporal_multirange.out
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal.spec
 create mode 100644 src/test/modules/injection_points/specs/repack_temporal_multirange.spec

diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index 9a199dd9bfb..d56f22069f5 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -182,6 +182,9 @@ static void adjust_toast_pointers(Relation relation, TupleTableSlot *dest,
 static bool find_target_tuple(Relation rel, ChangeContext *chgcxt,
 							  TupleTableSlot *locator,
 							  TupleTableSlot *retrieved);
+static bool identity_key_equal(ChangeContext *chgcxt,
+							   TupleTableSlot *locator,
+							   TupleTableSlot *candidate);
 static void process_concurrent_changes(XLogRecPtr end_of_wal,
 									   ChangeContext *chgcxt,
 									   bool done);
@@ -2807,7 +2810,7 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 {
 	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
 	IndexScanDesc scan;
-	bool		retval;
+	bool		retval = false;
 
 	/*
 	 * Scan key is passed by caller, so it does not have to be constructed
@@ -2829,12 +2832,69 @@ find_target_tuple(Relation rel, ChangeContext *chgcxt, TupleTableSlot *locator,
 	scan = index_beginscan(rel, chgcxt->cc_ident_index, GetActiveSnapshot(),
 						   NULL, chgcxt->cc_ident_key_nentries, 0, 0);
 	index_rescan(scan, chgcxt->cc_ident_key, chgcxt->cc_ident_key_nentries, NULL, 0);
-	retval = index_getnext_slot(scan, ForwardScanDirection, retrieved);
+	while (index_getnext_slot(scan, ForwardScanDirection, retrieved))
+	{
+		if (scan->xs_recheck && !identity_key_equal(chgcxt, locator, retrieved))
+		{
+			CHECK_FOR_INTERRUPTS();
+			continue;
+		}
+
+		retval = true;
+		break;
+	}
 	index_endscan(scan);
 
 	return retval;
 }
 
+/*
+ * Check whether the candidate tuple matches the locator tuple on all replica
+ * identity key columns, using the same equality operators as the identity
+ * index scan.  This is needed to filter lossy index matches, such as GiST
+ * multirange scans.
+ */
+static bool
+identity_key_equal(ChangeContext *chgcxt, TupleTableSlot *locator,
+				   TupleTableSlot *candidate)
+{
+	Form_pg_index idx = chgcxt->cc_ident_index->rd_index;
+	AttrNumber	last_key_attno = 0;
+
+	for (int i = 0; i < chgcxt->cc_ident_key_nentries; i++)
+	{
+		AttrNumber	attno = idx->indkey.values[i];
+
+		Assert(attno > 0);
+		last_key_attno = Max(last_key_attno, attno);
+	}
+
+	slot_getsomeattrs(locator, last_key_attno);
+	slot_getsomeattrs(candidate, last_key_attno);
+
+	for (int i = 0; i < chgcxt->cc_ident_key_nentries; i++)
+	{
+		ScanKey		entry = &chgcxt->cc_ident_key[i];
+		AttrNumber	attno = idx->indkey.values[i];
+
+		Assert(attno > 0);
+
+		if (locator->tts_isnull[attno - 1] != candidate->tts_isnull[attno - 1])
+			return false;
+
+		if (locator->tts_isnull[attno - 1])
+			continue;
+
+		if (!DatumGetBool(FunctionCall2Coll(&entry->sk_func,
+											entry->sk_collation,
+											candidate->tts_values[attno - 1],
+											entry->sk_argument)))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Decode and apply concurrent changes, up to (and including) the record whose
  * LSN is 'end_of_wal'.
@@ -2944,13 +3004,20 @@ initialize_change_context(ChangeContext *chgcxt,
 						opcintype,
 						opno,
 						opcode;
+			StrategyNumber eq_strategy;
 
 			entry = &chgcxt->cc_ident_key[i];
 
 			opfamily = chgcxt->cc_ident_index->rd_opfamily[i];
 			opcintype = chgcxt->cc_ident_index->rd_opcintype[i];
+			eq_strategy = IndexAmTranslateCompareType(COMPARE_EQ,
+													  chgcxt->cc_ident_index->rd_rel->relam,
+													  opfamily, false);
+			if (eq_strategy == InvalidStrategy)
+				elog(ERROR, "failed to find equality strategy for index operator family %u for type %u",
+					 opfamily, opcintype);
 			opno = get_opfamily_member(opfamily, opcintype, opcintype,
-									   BTEqualStrategyNumber);
+									   eq_strategy);
 			if (!OidIsValid(opno))
 				elog(ERROR, "failed to find = operator for type %u", opcintype);
 			opcode = get_opcode(opno);
@@ -2960,7 +3027,7 @@ initialize_change_context(ChangeContext *chgcxt,
 			/* Initialize everything but argument. */
 			ScanKeyInit(entry,
 						i + 1,
-						BTEqualStrategyNumber, opcode,
+						eq_strategy, opcode,
 						(Datum) 0);
 			entry->sk_collation = chgcxt->cc_ident_index->rd_indcollation[i];
 		}
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index f057d143d1a..c01d2fb095c 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -15,6 +15,8 @@ REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
 ISOLATION = basic \
 	    inplace \
 	    repack \
+	    repack_temporal \
+	    repack_temporal_multirange \
 	    repack_toast \
 	    syscache-update-pruned \
 	    heap_lock_update
diff --git a/src/test/modules/injection_points/expected/repack_temporal.out b/src/test/modules/injection_points/expected/repack_temporal.out
new file mode 100644
index 00000000000..e6b06c00cec
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal.out
@@ -0,0 +1,68 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock check_after_repack
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step check_after_repack: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id    |valid_at               |label  
+------+-----------------------+-------
+[1,10)|[01-01-2000,02-01-2000)|other  
+[2,3) |[01-10-2000,01-20-2000)|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/expected/repack_temporal_multirange.out b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
new file mode 100644
index 00000000000..3f5ff8cfb0f
--- /dev/null
+++ b/src/test/modules/injection_points/expected/repack_temporal_multirange.out
@@ -0,0 +1,74 @@
+Parsed test spec with 2 sessions
+
+starting permutation: wait_before_lock update_target check_after_update wakeup_before_lock final_check
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: 
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+ <waiting ...>
+step update_target: 
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+
+step check_after_update: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+step wakeup_before_lock: 
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
+step wait_before_lock: <... completed>
+step final_check: 
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+
+count
+-----
+    2
+(1 row)
+
+id           |valid_at                 |label  
+-------------+-------------------------+-------
+{[1,3),[5,7)}|{[01-01-2000,02-01-2000)}|other  
+{[1,7)}      |{[01-01-2000,02-01-2000)}|updated
+(2 rows)
+
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index fb1418e2caa..59dba1cb023 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -46,6 +46,8 @@ tests += {
       'basic',
       'inplace',
       'repack',
+      'repack_temporal',
+      'repack_temporal_multirange',
       'repack_toast',
       'syscache-update-pruned',
       'heap_lock_update',
diff --git a/src/test/modules/injection_points/specs/repack_temporal.spec b/src/test/modules/injection_points/specs/repack_temporal.spec
new file mode 100644
index 00000000000..9629d502ec1
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal.spec
@@ -0,0 +1,90 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index.
+#
+# The table's replica identity is a GiST index created by a temporal primary
+# key.  A concurrent UPDATE changes a non-key column of one row, while another
+# row overlaps it on all indexed columns.  Replay must still find the exact
+# target row.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal (
+		id int4range,
+		valid_at daterange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal REPLICA IDENTITY USING INDEX repack_temporal_pkey;
+
+	INSERT INTO repack_temporal(id, valid_at, label)
+	VALUES
+		('[1,10)', '[2000-01-01,2000-02-01)', 'other'),
+		('[2,3)',  '[2000-01-10,2000-01-20)', 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal USING INDEX repack_temporal_pkey;
+}
+step check_after_repack
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal
+	SET label = 'updated'
+	WHERE id = '[2,3)' AND valid_at = '[2000-01-10,2000-01-20)';
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode FROM pg_class WHERE relname = 'repack_temporal';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	check_after_repack
diff --git a/src/test/modules/injection_points/specs/repack_temporal_multirange.spec b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
new file mode 100644
index 00000000000..dfff1f2234d
--- /dev/null
+++ b/src/test/modules/injection_points/specs/repack_temporal_multirange.spec
@@ -0,0 +1,102 @@
+# REPACK (CONCURRENTLY) on a temporal replica identity index with lossy
+# multirange equality.
+#
+# The leading identity column is an int4multirange. Two distinct rows have
+# different multirange values but the same union range, so GiST equality can
+# produce both as candidates and requires exact recheck.
+setup
+{
+	CREATE EXTENSION injection_points;
+
+	CREATE TABLE repack_temporal_multirange (
+		id int4multirange,
+		valid_at datemultirange,
+		label text,
+		PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
+	);
+
+	ALTER TABLE repack_temporal_multirange
+		REPLICA IDENTITY USING INDEX repack_temporal_multirange_pkey;
+
+	-- (1,3)+(5+7) is the same uninon range of (1-7), but needs recheck
+	INSERT INTO repack_temporal_multirange(id, valid_at, label)
+	VALUES
+		(int4multirange(int4range(1, 3), int4range(5, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'other'),
+		(int4multirange(int4range(1, 7)),
+		 datemultirange(daterange('2000-01-01', '2000-02-01')),
+		 'target');
+
+	CREATE TABLE relfilenodes(node oid);
+}
+
+teardown
+{
+	DROP TABLE repack_temporal_multirange;
+	DROP EXTENSION injection_points;
+	DROP TABLE relfilenodes;
+}
+
+session s1
+setup
+{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('repack-concurrently-before-lock', 'wait');
+}
+step wait_before_lock
+{
+	REPACK (CONCURRENTLY) repack_temporal_multirange
+		USING INDEX repack_temporal_multirange_pkey;
+}
+step final_check
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2, proving that repack has rewritten the table
+	SELECT count(DISTINCT node) FROM relfilenodes;
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+teardown
+{
+	SELECT injection_points_detach('repack-concurrently-before-lock');
+}
+
+session s2
+step update_target
+{
+	UPDATE repack_temporal_multirange
+	SET label = 'updated'
+	WHERE id = int4multirange(int4range(1, 7))
+	  AND valid_at = datemultirange(daterange('2000-01-01', '2000-02-01'));
+}
+step check_after_update
+{
+	INSERT INTO relfilenodes(node)
+	SELECT relfilenode
+	FROM pg_class
+	WHERE relname = 'repack_temporal_multirange';
+
+	-- Expect 2 rows
+	SELECT id, valid_at, label
+	FROM repack_temporal_multirange
+	ORDER BY id, valid_at, label;
+}
+step wakeup_before_lock
+{
+	SELECT injection_points_wakeup('repack-concurrently-before-lock');
+}
+
+permutation
+	wait_before_lock
+	update_target
+	check_after_update
+	wakeup_before_lock
+	final_check
-- 
2.50.1 (Apple Git-155)



^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
@ 2026-05-11 16:21         ` Álvaro Herrera <[email protected]>
  2026-05-12 02:31           ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-12 16:48           ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Antonin Houska <[email protected]>
  0 siblings, 2 replies; 10+ messages in thread

From: Álvaro Herrera @ 2026-05-11 16:21 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: Kirill Reshke <[email protected]>; pgsql-hackers

On 2026-May-11, Chao Li wrote:

> > On May 10, 2026, at 06:38, Álvaro Herrera <[email protected]> wrote:

> > I think it would be a good idea to make identity_key_equal() not deform
> > all attributes, but instead only up to the last one it needs for the key
> > comparisons.
> 
> That’s true. Please see v3.

Thanks.  I did one further small change, namely to determine these last
attnums just once per run rather than once per tuple.  Pushed now.

Thanks for testing!

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-11 16:21         ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
@ 2026-05-12 02:31           ` Chao Li <[email protected]>
  1 sibling, 0 replies; 10+ messages in thread

From: Chao Li @ 2026-05-12 02:31 UTC (permalink / raw)
  To: Álvaro Herrera <[email protected]>; +Cc: Kirill Reshke <[email protected]>; pgsql-hackers



> On May 12, 2026, at 00:21, Álvaro Herrera <[email protected]> wrote:
> 
> On 2026-May-11, Chao Li wrote:
> 
>>> On May 10, 2026, at 06:38, Álvaro Herrera <[email protected]> wrote:
> 
>>> I think it would be a good idea to make identity_key_equal() not deform
>>> all attributes, but instead only up to the last one it needs for the key
>>> comparisons.
>> 
>> That’s true. Please see v3.
> 
> Thanks.  I did one further small change, namely to determine these last
> attnums just once per run rather than once per tuple.  Pushed now.
> 
> Thanks for testing!
> 
> -- 
> Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/

Hi Álvaro, thank you so much for tuning the code and pushing.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/









^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-11 16:21         ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
@ 2026-05-12 16:48           ` Antonin Houska <[email protected]>
  2026-05-13 08:35             ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  1 sibling, 1 reply; 10+ messages in thread

From: Antonin Houska @ 2026-05-12 16:48 UTC (permalink / raw)
  To: [email protected]; +Cc: Chao Li <[email protected]>; Kirill Reshke <[email protected]>; pgsql-hackers

Álvaro Herrera <[email protected]> wrote:

> On 2026-May-11, Chao Li wrote:
> 
> > > On May 10, 2026, at 06:38, Álvaro Herrera <[email protected]> wrote:
> 
> > > I think it would be a good idea to make identity_key_equal() not deform
> > > all attributes, but instead only up to the last one it needs for the key
> > > comparisons.
> > 
> > That’s true. Please see v3.
> 
> Thanks.  I did one further small change, namely to determine these last
> attnums just once per run rather than once per tuple.  Pushed now.

I appreciate that REPACK can handle more cases now! However, I found a problem
(or at least a question) when rebasing the improvements for the next
release(s). (It's related to splitting the table scan into multiple block
ranges and use one snapshot per range, details are not too important here, )
Assertion failure in the new code made me think if other than B-tree indexes
should be allowed in the USING INDEX clause of REPACK.

AFAICS, only B-tree indexes (and some special ones that don't appear in the
core) provide ordering information - see get_relation_info():

	/*
	 * Fetch the ordering information for the index, if any.
	 */
	if (info->relam == BTREE_AM_OID)
	{
		...
		info->sortopfamily = info->opfamily;
		...
	}
	else if (amroutine->amcanorder)
	{
		/*
		 * Otherwise, identify the corresponding btree opfamilies
		 * by trying to map this index's "<" operators into btree.
		 * Since "<" uniquely defines the behavior of a sort
		 * order, this is a sufficient test.
		 ...
	}
	else
	{
		...
		info->sortopfamily = NULL;
		...
	}


Therefore, index scan shouldn't be possible for GIST index - see
build_index_paths():

	index_is_ordered = (index->sortopfamily != NULL);


So I'm not sure if clustering makes sense here. What makes me confused is that
GIST has IndexAmRoutine.amclusterable=true. As it has amcanorder=false at the
same time, I suspect it might be just a thinko. However, if we simply set
amclusterable to false, it can break upgrade to PG 19 for users who already
"clustered" some table by a GIST index (for mysterious reasons). (BTW, do we
need the amclusterable field at all?)

REPACK currently rejects explicit sort if non-B-tree index is specified (due
to lack of ordering information), but it still scans the index rather than
the heap - see copy_table_data() and heapam_relation_copy_for_cluster().

Does this seem worth fixing now? Or maybe at least worth some comments (unless
I'm completely wrong)?

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com





^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-11 16:21         ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-12 16:48           ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Antonin Houska <[email protected]>
@ 2026-05-13 08:35             ` Chao Li <[email protected]>
  2026-05-13 12:00               ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Antonin Houska <[email protected]>
  0 siblings, 1 reply; 10+ messages in thread

From: Chao Li @ 2026-05-13 08:35 UTC (permalink / raw)
  To: Antonin Houska <[email protected]>; +Cc: [email protected]; Kirill Reshke <[email protected]>; pgsql-hackers



> On May 13, 2026, at 00:48, Antonin Houska <[email protected]> wrote:
> 
> Álvaro Herrera <[email protected]> wrote:
> 
>> On 2026-May-11, Chao Li wrote:
>> 
>>>> On May 10, 2026, at 06:38, Álvaro Herrera <[email protected]> wrote:
>> 
>>>> I think it would be a good idea to make identity_key_equal() not deform
>>>> all attributes, but instead only up to the last one it needs for the key
>>>> comparisons.
>>> 
>>> That’s true. Please see v3.
>> 
>> Thanks.  I did one further small change, namely to determine these last
>> attnums just once per run rather than once per tuple.  Pushed now.
> 
> I appreciate that REPACK can handle more cases now! However, I found a problem
> (or at least a question) when rebasing the improvements for the next
> release(s). (It's related to splitting the table scan into multiple block
> ranges and use one snapshot per range, details are not too important here, )
> Assertion failure in the new code made me think if other than B-tree indexes
> should be allowed in the USING INDEX clause of REPACK.
> 
> AFAICS, only B-tree indexes (and some special ones that don't appear in the
> core) provide ordering information - see get_relation_info():
> 
> /*
>  * Fetch the ordering information for the index, if any.
>  */
> if (info->relam == BTREE_AM_OID)
> {
> ...
> info->sortopfamily = info->opfamily;
> ...
> }
> else if (amroutine->amcanorder)
> {
> /*
>  * Otherwise, identify the corresponding btree opfamilies
>  * by trying to map this index's "<" operators into btree.
>  * Since "<" uniquely defines the behavior of a sort
>  * order, this is a sufficient test.
>  ...
> }
> else
> {
> ...
> info->sortopfamily = NULL;
> ...
> }
> 
> 
> Therefore, index scan shouldn't be possible for GIST index - see
> build_index_paths():
> 
> index_is_ordered = (index->sortopfamily != NULL);
> 
> 
> So I'm not sure if clustering makes sense here. What makes me confused is that
> GIST has IndexAmRoutine.amclusterable=true. As it has amcanorder=false at the
> same time, I suspect it might be just a thinko. However, if we simply set
> amclusterable to false, it can break upgrade to PG 19 for users who already
> "clustered" some table by a GIST index (for mysterious reasons). (BTW, do we
> need the amclusterable field at all?)
> 
> REPACK currently rejects explicit sort if non-B-tree index is specified (due
> to lack of ordering information), but it still scans the index rather than
> the heap - see copy_table_data() and heapam_relation_copy_for_cluster().
> 
> Does this seem worth fixing now? Or maybe at least worth some comments (unless
> I'm completely wrong)?

After some investigation, I think I see the mismatch:

* get_relation_info(): non-ordered GiST cannot provide sort order. That is expected.
* copy_table_data() only uses plan_cluster_use_sort() for btree. For any other clusterable index, it sets use_sort = false and does a raw index scan.
* The docs say REPACK can re-sort using index scan “if the index is a b-tree” or seqscan+sort, which does not describe what the code actually does for GiST.

I am not sure whether we should change the behavior in PG19. Alvaro may have a better idea about that. But I agree that we can at least clarify the code comment and documentation. The attached patch attempts to do that.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/



Attachments:

  [application/octet-stream] repack_comment.diff (4.6K, 2-repack_comment.diff)
  download | inline diff:
diff --git a/doc/src/sgml/ref/repack.sgml b/doc/src/sgml/ref/repack.sgml
index 0cb72b6b289..c9ef358261e 100644
--- a/doc/src/sgml/ref/repack.sgml
+++ b/doc/src/sgml/ref/repack.sgml
@@ -78,11 +78,13 @@ REPACK [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] USING
 
    <para>
     If the <literal>USING INDEX</literal> clause is specified, the rows in
-    the table are stored in the order that the index specifies;
-    <firstterm>clustering</firstterm>, because rows are physically clustered
-    afterwards.
-    If an index name is specified in the command, the order implied by that
-    index is used, and that index is configured as the index to cluster on.
+    the table are physically rearranged according to the specified index;
+    this is known as <firstterm>clustering</firstterm>.  For b-tree indexes,
+    this means the table is stored in the index's sort order.  Other clusterable
+    index access methods use their own index scan order, which does not
+    necessarily correspond to a SQL sort order.
+    If an index name is specified in the command, that index is used for
+    clustering, and is configured as the index to cluster on.
     (This also applies to an index given to the <command>CLUSTER</command>
     command.)
     If no index name is specified, then the index that has
@@ -101,12 +103,12 @@ REPACK [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] USING
    <para>
     Clustering is a one-time operation: when the table is
     subsequently updated, the changes are not clustered.  That is, no attempt
-    is made to store new or updated rows according to their index order.  (If
-    one wishes, one can periodically recluster by issuing the command again.
-    Also, setting the table's <literal>fillfactor</literal> storage parameter
-    to less than 100% can aid in preserving cluster ordering during updates,
-    since updated rows are kept on the same page if enough space is available
-    there.)
+    is made to store new or updated rows according to the clustering order.
+    (If one wishes, one can periodically recluster by issuing the command
+    again.  Also, setting the table's <literal>fillfactor</literal> storage
+    parameter to less than 100% can aid in preserving cluster ordering during
+    updates, since updated rows are kept on the same page if enough space is
+    available there.)
    </para>
 
    <para>
@@ -123,11 +125,12 @@ REPACK [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] USING
    </para>
 
    <para>
-    <command>REPACK</command> can re-sort the table using either an index scan
-    on the specified index (if the index is a b-tree), or a sequential scan
-    followed by sorting.  It will attempt to choose the method that will be
-    faster, based on planner cost parameters and available statistical
-    information.
+    When clustering on a b-tree index, <command>REPACK</command> can rewrite
+    the table using either an index scan on the specified index, or a
+    sequential scan followed by sorting.  It will attempt to choose the method
+    that will be faster, based on planner cost parameters and available
+    statistical information.  When clustering on a non-b-tree index,
+    <command>REPACK</command> uses an index scan.
    </para>
 
    <para>
diff --git a/src/backend/commands/repack.c b/src/backend/commands/repack.c
index 860e2aecbe9..6eb685b2240 100644
--- a/src/backend/commands/repack.c
+++ b/src/backend/commands/repack.c
@@ -1364,10 +1364,16 @@ copy_table_data(Relation NewHeap, Relation OldHeap, Relation OldIndex,
 
 	/*
 	 * Decide whether to use an indexscan or seqscan-and-optional-sort to scan
-	 * the OldHeap.  We know how to use a sort to duplicate the ordering of a
-	 * btree index, and will use seqscan-and-sort for that case if the planner
-	 * tells us it's cheaper.  Otherwise, always indexscan if an index is
-	 * provided, else plain seqscan.
+	 * the OldHeap.  For btree indexes, the scan order is a well-defined sort
+	 * order that can also be reproduced by an explicit sort, so use the
+	 * planner to choose between indexscan and seqscan-and-sort.
+	 *
+	 * Other index AMs can be marked clusterable even though they do not
+	 * provide btree-style ordering information to the planner.  For those,
+	 * clustering means rewriting the heap in the AM's index scan order, which
+	 * may improve locality but cannot be duplicated by sorting here, so leave
+	 * use_sort false.  If no index is provided, use a plain seqscan.
+
 	 */
 	if (OldIndex != NULL && OldIndex->rd_rel->relam == BTREE_AM_OID)
 		use_sort = plan_cluster_use_sort(RelationGetRelid(OldHeap),


^ permalink  raw  reply  [nested|flat] 10+ messages in thread

* Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes
  2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-08 17:47 ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Kirill Reshke <[email protected]>
  2026-05-09 08:36   ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-09 22:38     ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-11 06:54       ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
  2026-05-11 16:21         ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Álvaro Herrera <[email protected]>
  2026-05-12 16:48           ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Antonin Houska <[email protected]>
  2026-05-13 08:35             ` Re: Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
@ 2026-05-13 12:00               ` Antonin Houska <[email protected]>
  0 siblings, 0 replies; 10+ messages in thread

From: Antonin Houska @ 2026-05-13 12:00 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: [email protected]; Kirill Reshke <[email protected]>; pgsql-hackers

Chao Li <[email protected]> wrote:

> I am not sure whether we should change the behavior in PG19. Alvaro may have
> a better idea about that. But I agree that we can at least clarify the code
> comment and documentation. The attached patch attempts to do that.

I meant just a comment that reminds developers that something needs to be
fixed, rather than defending the inconsistent behavior.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com






^ permalink  raw  reply  [nested|flat] 10+ messages in thread


end of thread, other threads:[~2026-05-13 12:00 UTC | newest]

Thread overview: 10+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-05-08 04:21 Fix REPACK with WITHOUT OVERLAPS replica identity indexes Chao Li <[email protected]>
2026-05-08 17:47 ` Kirill Reshke <[email protected]>
2026-05-09 08:36   ` Chao Li <[email protected]>
2026-05-09 22:38     ` Álvaro Herrera <[email protected]>
2026-05-11 06:54       ` Chao Li <[email protected]>
2026-05-11 16:21         ` Álvaro Herrera <[email protected]>
2026-05-12 02:31           ` Chao Li <[email protected]>
2026-05-12 16:48           ` Antonin Houska <[email protected]>
2026-05-13 08:35             ` Chao Li <[email protected]>
2026-05-13 12:00               ` Antonin Houska <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox