public inbox for [email protected]  
help / color / mirror / Atom feed
From: Mihail Nikalayeu <[email protected]>
To: Zhijie Hou (Fujitsu) <[email protected]>
To: Peter Geoghegan <[email protected]>
Cc: Amit Kapila <[email protected]>
Cc: Hayato Kuroda (Fujitsu) <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Andres Freund <[email protected]>
Subject: Re: [BUG?] check_exclusion_or_unique_constraint false negative
Date: Thu, 21 Aug 2025 10:43:08 +0200
Message-ID: <CADzfLwVJtBKX32daO20hJ_6_BWZn9cFPtAwzyzNrT2u7YCd6CA@mail.gmail.com> (raw)
In-Reply-To: <CADzfLwWC49oanFSGPTf=6FJoTw-kAnpPZV8nVqAyR5KL68LrHQ@mail.gmail.com>
References: <CANtu0oiktqQ2pwExoXqDpByXNCJa-KE5vQRodTRnmFHN_+qwHg@mail.gmail.com>
	<CANtu0ohU2XRV9shtu14CffLPDS1x10q7ebOGf-vX0p+45_L8jw@mail.gmail.com>
	<CANtu0oh0tspW-xWzDGWP9ehz96KPt9aUP1c9JYhdBYxKsB0jpA@mail.gmail.com>
	<CANtu0ohUB9ky45iiMAYN1fGyt82+cg=+UYBom=P7drb+=97G9w@mail.gmail.com>
	<TYAPR01MB56921C9C3D21B0D62FF76330F5B22@TYAPR01MB5692.jpnprd01.prod.outlook.com>
	<CANtu0og=5v4j8onS4nyJ4zMPdh-EPFxmiEi5PLoyZrmqHA6RKw@mail.gmail.com>
	<CAA4eK1Jfb0xviXYon-_TvHNKeAY7ngAeo++Knu-0RPR6EkSBjA@mail.gmail.com>
	<CANtu0ohHmYXsK5bxU9Thcq1FbELLAk0S2Zap0r8AnU3OTmcCOA@mail.gmail.com>
	<CAA4eK1+_V1PWXrrgAM01p+CByP6JwYRxejZrcxOu83a-v_+zZg@mail.gmail.com>
	<CANtu0ogDDQnXbrv6p7Xtc2dT_MZ1fjdPgB9-0B5Lw1b4pQGd2A@mail.gmail.com>
	<OS0PR01MB5716FFD8DBBADB55E8E6935994852@OS0PR01MB5716.jpnprd01.prod.outlook.com>
	<CANtu0oiziTBM8+WDtkktMZv0rhGBroYGWwqSQW+MzOWpmk-XEw@mail.gmail.com>
	<OS0PR01MB5716E30952F542E256DD72E294802@OS0PR01MB5716.jpnprd01.prod.outlook.com>
	<CANtu0oh69b+VCiASX86dF_eY=9=A2RmMQ_+0+uxZ_Zir+oNhhw@mail.gmail.com>
	<CANtu0og_4FVsRMxXue8SXUn03MwBAT0SiZSib_wcPMLDkpn-RA@mail.gmail.com>
	<CANtu0oj8LzpjCvF1zSwdLJxPddhbdY0=uh=7-wT6vwhmju_-PQ@mail.gmail.com>
	<CANtu0ogHMahRJvLKPofE9T7Z19H3UyWeb22fZ9cKfFRG_BEV0w@mail.gmail.com>
	<CANtu0ojos4kvrrQ9YJOei2=c5vB1wJBHpR3q_X+BG1i99ut+Hw@mail.gmail.com>
	<CADzfLwWuXh8KO=OZvB71pZnQ8nH0NYXfuGbFU6FBiVZUbmuFGg@mail.gmail.com>
	<CADzfLwW61K09Q8HnQ6zO+QHD2oHBmm5VmZbHa19=ZkMMX_Vcow@mail.gmail.com>
	<CADzfLwU_wuHeApGSGm+jO=9DzE-ZugONPRk1m6qohY0T-UNB7w@mail.gmail.com>
	<CADzfLwX4rRZ4OtDzpHSsTu4qGSjw1W5zORb7HAM5G9uzPyLOvA@mail.gmail.com>
	<CADzfLwWC49oanFSGPTf=6FJoTw-kAnpPZV8nVqAyR5KL68LrHQ@mail.gmail.com>

Hello,

Added one more test - for invalid "update_deleted" conflict detection.

Best regards,
Mikhail.


Attachments:

  [application/octet-stream] v10-0002-Fix-btree-index-scan-concurrency-issues-with-dir.patch (10.1K, 2-v10-0002-Fix-btree-index-scan-concurrency-issues-with-dir.patch)
  download | inline diff:
From 660d39f2a31882427522fe48387922dcd4091101 Mon Sep 17 00:00:00 2001
From: Mikhail Nikalayeu <[email protected]>
Date: Mon, 16 Jun 2025 22:20:38 +0200
Subject: [PATCH v10 2/2] Fix btree index scan concurrency issues with dirty
 snapshots

This patch addresses an issue where non-MVCC index scans using SnapshotDirty or SnapshotSelf could miss tuples due to concurrent modifications. The fix retains read locks on pages for these special snapshot types until the scan is done with the page's tuples, preventing concurrent modifications from causing inconsistent results.

Updated README to document this special case in the btree locking mechanism.
---
 src/backend/access/nbtree/README       | 13 ++++++++++++-
 src/backend/access/nbtree/nbtree.c     | 19 ++++++++++++++++++-
 src/backend/access/nbtree/nbtsearch.c  | 16 ++++++++++++----
 src/backend/access/nbtree/nbtutils.c   |  4 +++-
 src/backend/executor/execReplication.c |  8 ++++++--
 src/include/access/nbtree.h            |  1 +
 6 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 53d4a61dc3f..a9280415633 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -85,7 +85,8 @@ move right until we find a page whose right-link matches the page we
 came from.  (Actually, it's even harder than that; see page deletion
 discussion below.)
 
-Page read locks are held only for as long as a scan is examining a page.
+Page read locks are held only for as long as a scan is examining a page
+(with exception for SnapshotDirty and SnapshotSelf scans - see below).
 To minimize lock/unlock traffic, an index scan always searches a leaf page
 to identify all the matching items at once, copying their heap tuple IDs
 into backend-local storage.  The heap tuple IDs are then processed while
@@ -103,6 +104,16 @@ We also remember the left-link, and follow it when the scan moves backwards
 (though this requires extra handling to account for concurrent splits of
 the left sibling; see detailed move-left algorithm below).
 
+Despite the described mechanics in place, inconsistent results may still occur
+during non-MVCC scans (SnapshotDirty and SnapshotSelf). This issue can occur if a 
+concurrent transaction deletes a tuple and inserts a new tuple with a new TID in the 
+same page. If the scan has already visited the page and cached its content in the
+backend-local storage, it might skip the old tuple due to deletion and miss the new 
+tuple because the scan does not re-read the page. To address this issue, for 
+SnapshotDirty and SnapshotSelf scans, we retain the read lock on the page until 
+we're completely done processing all the tuples from that page, preventing 
+concurrent modifications that could lead to inconsistent results.
+
 In most cases we release our lock and pin on a page before attempting
 to acquire pin and lock on the page we are moving to.  In a few places
 it is necessary to lock the next page before releasing the current one.
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index fdff960c130..bda2b821a51 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -393,10 +393,22 @@ btrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
 		/* Before leaving current page, deal with any killed items */
 		if (so->numKilled > 0)
 			_bt_killitems(scan);
+		else if (!so->dropLock) /* _bt_killitems always releases lock */
+			_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
 		BTScanPosUnpinIfPinned(so->currPos);
 		BTScanPosInvalidate(so->currPos);
 	}
 
+	/*
+	 * For SnapshotDirty and SnapshotSelf scans, we don't unlock the buffer
+	 * and keep the lock should be until we're completely done with this page.
+	 * This prevents concurrent modifications from causing inconsistent
+	 * results during non-MVCC scans.
+	 *
+	 * See nbtree/README for information about SnapshotDirty and SnapshotSelf.
+	 */
+	so->dropLock = scan->xs_snapshot->snapshot_type != SNAPSHOT_DIRTY
+					&& scan->xs_snapshot->snapshot_type != SNAPSHOT_SELF;
 	/*
 	 * We prefer to eagerly drop leaf page pins before btgettuple returns.
 	 * This avoids making VACUUM wait to acquire a cleanup lock on the page.
@@ -420,7 +432,8 @@ btrescan(IndexScanDesc scan, ScanKey scankey, int nscankeys,
 	 *
 	 * Note: so->dropPin should never change across rescans.
 	 */
-	so->dropPin = (!scan->xs_want_itup &&
+	so->dropPin = (so->dropLock &&
+				   !scan->xs_want_itup &&
 				   IsMVCCSnapshot(scan->xs_snapshot) &&
 				   RelationNeedsWAL(scan->indexRelation) &&
 				   scan->heapRelation != NULL);
@@ -477,6 +490,8 @@ btendscan(IndexScanDesc scan)
 		/* Before leaving current page, deal with any killed items */
 		if (so->numKilled > 0)
 			_bt_killitems(scan);
+		else if (!so->dropLock) /* _bt_killitems always releases lock */
+			_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
 		BTScanPosUnpinIfPinned(so->currPos);
 	}
 
@@ -557,6 +572,8 @@ btrestrpos(IndexScanDesc scan)
 			/* Before leaving current page, deal with any killed items */
 			if (so->numKilled > 0)
 				_bt_killitems(scan);
+			else if (!so->dropLock) /* _bt_killitems always releases lock */
+				_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
 			BTScanPosUnpinIfPinned(so->currPos);
 		}
 
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index d69798795b4..f92dba17fa4 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -57,12 +57,14 @@ static bool _bt_endpoint(IndexScanDesc scan, ScanDirection dir);
 /*
  *	_bt_drop_lock_and_maybe_pin()
  *
- * Unlock so->currPos.buf.  If scan is so->dropPin, drop the pin, too.
+ * Unlock so->currPos.buf if so->dropLock. If scan is so->dropPin, drop the pin, too.
  * Dropping the pin prevents VACUUM from blocking on acquiring a cleanup lock.
  */
 static inline void
 _bt_drop_lock_and_maybe_pin(Relation rel, BTScanOpaque so)
 {
+	if (!so->dropLock)
+		return;
 	if (!so->dropPin)
 	{
 		/* Just drop the lock (not the pin) */
@@ -1579,7 +1581,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
  *	_bt_next() -- Get the next item in a scan.
  *
  *		On entry, so->currPos describes the current page, which may be pinned
- *		but is not locked, and so->currPos.itemIndex identifies which item was
+ *		but is not locked (except for SnapshotDirty and SnapshotSelf scans, where
+ *		the page remains locked), and so->currPos.itemIndex identifies which item was
  *		previously returned.
  *
  *		On success exit, so->currPos is updated as needed, and _bt_returnitem
@@ -2158,7 +2161,9 @@ _bt_returnitem(IndexScanDesc scan, BTScanOpaque so)
  * Wrapper on _bt_readnextpage that performs final steps for the current page.
  *
  * On entry, so->currPos must be valid.  Its buffer will be pinned, though
- * never locked. (Actually, when so->dropPin there won't even be a pin held,
+ * never locked, except for SnapshotDirty and SnapshotSelf scans where the buffer
+ * remains locked until we're done with all tuples from the page
+ * (Actually, when so->dropPin there won't even be a pin held,
  * though so->currPos.currPage must still be set to a valid block number.)
  */
 static bool
@@ -2173,6 +2178,8 @@ _bt_steppage(IndexScanDesc scan, ScanDirection dir)
 	/* Before leaving current page, deal with any killed items */
 	if (so->numKilled > 0)
 		_bt_killitems(scan);
+	else if (!so->dropLock) /* _bt_killitems always releases lock */
+		_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
 
 	/*
 	 * Before we modify currPos, make a copy of the page data if there was a
@@ -2312,7 +2319,8 @@ _bt_readfirstpage(IndexScanDesc scan, OffsetNumber offnum, ScanDirection dir)
 	}
 
 	/* There's no actually-matching data on the page in so->currPos.buf */
-	_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
+	if (so->dropLock)
+		_bt_unlockbuf(scan->indexRelation, so->currPos.buf);
 
 	/* Call _bt_readnextpage using its _bt_steppage wrapper function */
 	if (!_bt_steppage(scan, dir))
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index edfea2acaff..56d5bf44785 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -3283,8 +3283,10 @@ _bt_killitems(IndexScanDesc scan)
 		 * concurrent VACUUMs from recycling any of the TIDs on the page.
 		 */
 		Assert(BTScanPosIsPinned(so->currPos));
+		/* Lock only if the lock is dropped. */
 		buf = so->currPos.buf;
-		_bt_lockbuf(rel, buf, BT_READ);
+		if (so->dropLock)
+			_bt_lockbuf(rel, buf, BT_READ);
 	}
 	else
 	{
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index da0cbf41d6f..c2f5aa2ba5c 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -205,12 +205,11 @@ RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
 
 	/* Start an index scan. */
 	scan = index_beginscan(rel, idxrel, &snap, NULL, skey_attoff, 0);
+	index_rescan(scan, skey, skey_attoff, NULL, 0);
 
 retry:
 	found = false;
 
-	index_rescan(scan, skey, skey_attoff, NULL, 0);
-
 	/* Try to find the tuple */
 	while (index_getnext_slot(scan, ForwardScanDirection, outslot))
 	{
@@ -238,6 +237,8 @@ retry:
 		 */
 		if (TransactionIdIsValid(xwait))
 		{
+			/* We need to call rescan before wait to ensure we release all the index page locks. */
+			index_rescan(scan, skey, skey_attoff, NULL, 0);
 			XactLockTableWait(xwait, NULL, NULL, XLTW_None);
 			goto retry;
 		}
@@ -266,7 +267,10 @@ retry:
 		PopActiveSnapshot();
 
 		if (should_refetch_tuple(res, &tmfd))
+		{
+			index_rescan(scan, skey, skey_attoff, NULL, 0);
 			goto retry;
+		}
 	}
 
 	index_endscan(scan);
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index 9ab467cb8fd..9c10931c8e2 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -1069,6 +1069,7 @@ typedef struct BTScanOpaqueData
 	/* info about killed items if any (killedItems is NULL if never used) */
 	int		   *killedItems;	/* currPos.items indexes of killed items */
 	int			numKilled;		/* number of currently stored items */
+	bool		dropLock;		/* drop lock on before btgettuple returns? */
 	bool		dropPin;		/* drop leaf pin before btgettuple returns? */
 
 	/*
-- 
2.43.0



  [application/octet-stream] v10-0001-This-patch-introduces-new-injection-points-and-T.patch (22.1K, 3-v10-0001-This-patch-introduces-new-injection-points-and-T.patch)
  download | inline diff:
From dacda92357f397354a63aa5418f9bae802af06d3 Mon Sep 17 00:00:00 2001
From: nkey <[email protected]>
Date: Sat, 23 Nov 2024 13:25:11 +0100
Subject: [PATCH v10 1/2] This patch introduces new injection points and TAP
 tests to reproduce and verify conflict detection issues that arise during
 SNAPSHOT_DIRTY index scans in logical replication and
 check_exclusion_or_unique_constraint.

---
 src/backend/access/index/indexam.c            |   8 +
 src/backend/executor/execIndexing.c           |   3 +
 src/test/modules/injection_points/Makefile    |   2 +-
 .../expected/dirty_index_scan.out             |  27 ++++
 src/test/modules/injection_points/meson.build |   1 +
 .../specs/dirty_index_scan.spec               |  37 +++++
 src/test/subscription/Makefile                |   1 +
 src/test/subscription/meson.build             |   8 +-
 .../subscription/t/036_delete_missing_race.pl | 137 +++++++++++++++++
 .../subscription/t/037_update_missing_race.pl | 139 +++++++++++++++++
 .../t/038_update_missing_with_retain.pl       | 141 ++++++++++++++++++
 11 files changed, 502 insertions(+), 2 deletions(-)
 create mode 100644 src/test/modules/injection_points/expected/dirty_index_scan.out
 create mode 100644 src/test/modules/injection_points/specs/dirty_index_scan.spec
 create mode 100644 src/test/subscription/t/036_delete_missing_race.pl
 create mode 100644 src/test/subscription/t/037_update_missing_race.pl
 create mode 100644 src/test/subscription/t/038_update_missing_with_retain.pl

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 1a4f36fe0a9..2e65750979e 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -57,6 +57,7 @@
 #include "utils/ruleutils.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
+#include "utils/injection_point.h"
 
 
 /* ----------------------------------------------------------------
@@ -741,6 +742,13 @@ index_getnext_slot(IndexScanDesc scan, ScanDirection direction, TupleTableSlot *
 		 * the index.
 		 */
 		Assert(ItemPointerIsValid(&scan->xs_heaptid));
+#ifdef USE_INJECTION_POINTS
+		if (scan->xs_snapshot->snapshot_type == SNAPSHOT_DIRTY)
+		{
+			INJECTION_POINT("index_getnext_slot_before_fetch_apply_dirty", NULL);
+		}
+#endif
+
 		if (index_fetch_heap(scan, slot))
 			return true;
 	}
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index ca33a854278..c07ba230946 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -117,6 +117,7 @@
 #include "utils/multirangetypes.h"
 #include "utils/rangetypes.h"
 #include "utils/snapmgr.h"
+#include "utils/injection_point.h"
 
 /* waitMode argument to check_exclusion_or_unique_constraint() */
 typedef enum
@@ -943,6 +944,8 @@ retry:
 
 	ExecDropSingleTupleTableSlot(existing_slot);
 
+	if (!conflict)
+		INJECTION_POINT("check_exclusion_or_unique_constraint_no_conflict", NULL);
 	return !conflict;
 }
 
diff --git a/src/test/modules/injection_points/Makefile b/src/test/modules/injection_points/Makefile
index fc82cd67f6c..15f5e6d23d0 100644
--- a/src/test/modules/injection_points/Makefile
+++ b/src/test/modules/injection_points/Makefile
@@ -14,7 +14,7 @@ PGFILEDESC = "injection_points - facility for injection points"
 REGRESS = injection_points hashagg reindex_conc vacuum
 REGRESS_OPTS = --dlpath=$(top_builddir)/src/test/regress
 
-ISOLATION = basic inplace syscache-update-pruned
+ISOLATION = basic inplace syscache-update-pruned dirty_index_scan
 
 TAP_TESTS = 1
 
diff --git a/src/test/modules/injection_points/expected/dirty_index_scan.out b/src/test/modules/injection_points/expected/dirty_index_scan.out
new file mode 100644
index 00000000000..82d46397d61
--- /dev/null
+++ b/src/test/modules/injection_points/expected/dirty_index_scan.out
@@ -0,0 +1,27 @@
+Parsed test spec with 3 sessions
+
+starting permutation: s1_s1 s2_s1 s3_s1
+injection_points_attach
+-----------------------
+                       
+(1 row)
+
+step s1_s1: INSERT INTO test.tbl VALUES(42, 1) on conflict(i) do update set n = EXCLUDED.n + 1; <waiting ...>
+step s2_s1: UPDATE test.tbl SET n = n + 1 WHERE i = 42; <waiting ...>
+step s3_s1: 
+	SELECT injection_points_detach('index_getnext_slot_before_fetch_apply_dirty');
+	SELECT injection_points_wakeup('index_getnext_slot_before_fetch_apply_dirty');
+ <waiting ...>
+step s1_s1: <... completed>
+step s2_s1: <... completed>
+step s3_s1: <... completed>
+injection_points_detach
+-----------------------
+                       
+(1 row)
+
+injection_points_wakeup
+-----------------------
+                       
+(1 row)
+
diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build
index 20390d6b4bf..a126fe20c2d 100644
--- a/src/test/modules/injection_points/meson.build
+++ b/src/test/modules/injection_points/meson.build
@@ -48,6 +48,7 @@ tests += {
       'basic',
       'inplace',
       'syscache-update-pruned',
+      'dirty_index_scan',
     ],
     'runningcheck': false, # see syscache-update-pruned
   },
diff --git a/src/test/modules/injection_points/specs/dirty_index_scan.spec b/src/test/modules/injection_points/specs/dirty_index_scan.spec
new file mode 100644
index 00000000000..91d20ab4612
--- /dev/null
+++ b/src/test/modules/injection_points/specs/dirty_index_scan.spec
@@ -0,0 +1,37 @@
+setup
+{
+	CREATE EXTENSION injection_points;
+	CREATE SCHEMA test;
+	CREATE UNLOGGED TABLE test.tbl(i int primary key, n int);
+	CREATE INDEX tbl_n_idx ON test.tbl(n);
+	INSERT INTO test.tbl VALUES(42,1);
+}
+
+teardown
+{
+	DROP SCHEMA test CASCADE;
+	DROP EXTENSION injection_points;
+}
+
+session s1
+setup	{
+	SELECT injection_points_set_local();
+	SELECT injection_points_attach('check_exclusion_or_unique_constraint_no_conflict', 'error');
+	SELECT injection_points_attach('index_getnext_slot_before_fetch_apply_dirty', 'wait');
+}
+
+step s1_s1	{ INSERT INTO test.tbl VALUES(42, 1) on conflict(i) do update set n = EXCLUDED.n + 1; }
+
+session s2
+step s2_s1	{ UPDATE test.tbl SET n = n + 1 WHERE i = 42; }
+
+session s3
+step s3_s1		{
+	SELECT injection_points_detach('index_getnext_slot_before_fetch_apply_dirty');
+	SELECT injection_points_wakeup('index_getnext_slot_before_fetch_apply_dirty');
+}
+
+permutation
+	s1_s1
+	s2_s1(*)
+	s3_s1(s1_s1)
\ No newline at end of file
diff --git a/src/test/subscription/Makefile b/src/test/subscription/Makefile
index 50b65d8f6ea..51d28eca091 100644
--- a/src/test/subscription/Makefile
+++ b/src/test/subscription/Makefile
@@ -16,6 +16,7 @@ include $(top_builddir)/src/Makefile.global
 EXTRA_INSTALL = contrib/hstore
 
 export with_icu
+export enable_injection_points
 
 check:
 	$(prove_check)
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index 586ffba434e..49f52db4dd1 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -5,7 +5,10 @@ tests += {
   'sd': meson.current_source_dir(),
   'bd': meson.current_build_dir(),
   'tap': {
-    'env': {'with_icu': icu.found() ? 'yes' : 'no'},
+    'env': {
+      'with_icu': icu.found() ? 'yes' : 'no',
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no'
+    },
     'tests': [
       't/001_rep_changes.pl',
       't/002_types.pl',
@@ -42,6 +45,9 @@ tests += {
       't/033_run_as_table_owner.pl',
       't/034_temporal.pl',
       't/035_conflicts.pl',
+      't/036_delete_missing_race.pl',
+      't/037_update_missing_race.pl',
+      't/038_update_missing_with_retain.pl',
       't/100_bugs.pl',
     ],
   },
diff --git a/src/test/subscription/t/036_delete_missing_race.pl b/src/test/subscription/t/036_delete_missing_race.pl
new file mode 100644
index 00000000000..82e16af9be3
--- /dev/null
+++ b/src/test/subscription/t/036_delete_missing_race.pl
@@ -0,0 +1,137 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test the conflict detection and resolution in logical replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+############################## Set it to 0 to make set success; TODO: delete that for commit
+my $simulate_race_condition = 1;
+##############################
+
+###############################
+# Setup
+###############################
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_publisher->start;
+
+
+# Create subscriber node with track_commit_timestamp enabled
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_subscriber->start;
+
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node_subscriber->check_extension('injection_points'))
+{
+	plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create table on publisher
+$node_publisher->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text);");
+
+# Create similar table on subscriber with additional index to disable HOT updates
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text);
+	 CREATE INDEX data_index ON conf_tab(data);");
+
+# Set up extension to simulate race condition
+$node_subscriber->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub FOR TABLE conf_tab");
+
+# Insert row to be updated later
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO conf_tab(a, data) VALUES (1,'frompub')");
+
+# Create the subscription
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub
+	 CONNECTION '$publisher_connstr application_name=$appname'
+	 PUBLICATION tap_pub");
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+############################################
+# Race condition because of DirtySnapshot
+############################################
+
+my $psql_session_subscriber = $node_subscriber->background_psql('postgres');
+if ($simulate_race_condition)
+{
+	$node_subscriber->safe_psql('postgres',
+		"SELECT injection_points_attach('index_getnext_slot_before_fetch_apply_dirty', 'wait')");
+}
+
+my $log_offset = -s $node_subscriber->logfile;
+
+# Delete tuple on publisher
+$node_publisher->safe_psql('postgres', "DELETE FROM conf_tab WHERE a=1;");
+
+if ($simulate_race_condition)
+{
+	# Wait apply worker to start the search for the tuple using index
+	$node_subscriber->wait_for_event('logical replication apply worker',
+		'index_getnext_slot_before_fetch_apply_dirty');
+}
+
+# Updater tuple on subscriber
+$psql_session_subscriber->query_until(
+	qr/start/, qq[
+	\\echo start
+	UPDATE conf_tab SET data = 'fromsubnew' WHERE (a=1);
+]);
+
+
+if ($simulate_race_condition)
+{
+	# Wake up apply worker
+	$node_subscriber->safe_psql('postgres',"
+		SELECT injection_points_detach('index_getnext_slot_before_fetch_apply_dirty');
+		SELECT injection_points_wakeup('index_getnext_slot_before_fetch_apply_dirty');
+		");
+}
+
+# Tuple was updated - so, we have conflict
+$node_subscriber->wait_for_log(
+	qr/conflict detected on relation \"public.conf_tab\"/,
+	$log_offset);
+
+# But tuple should be deleted on subscriber any way
+is($node_subscriber->safe_psql('postgres', 'SELECT count(*) from conf_tab'), 0, 'record deleted on subscriber');
+
+ok(!$node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation \"public.conf_tab\": conflict=delete_missing/,
+		$log_offset), 'invalid conflict detected');
+
+ok($node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation "public.conf_tab": conflict=delete_origin_differs/,
+		$log_offset), 'correct conflict detected');
+
+done_testing();
diff --git a/src/test/subscription/t/037_update_missing_race.pl b/src/test/subscription/t/037_update_missing_race.pl
new file mode 100644
index 00000000000..e29ad771d8e
--- /dev/null
+++ b/src/test/subscription/t/037_update_missing_race.pl
@@ -0,0 +1,139 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test the conflict detection and resolution in logical replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+############################## Set it to 0 to make set success; TODO: delete that for commit
+my $simulate_race_condition = 1;
+##############################
+
+###############################
+# Setup
+###############################
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_publisher->start;
+
+
+# Create subscriber node with track_commit_timestamp enabled
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_subscriber->start;
+
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node_subscriber->check_extension('injection_points'))
+{
+	plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create table on publisher
+$node_publisher->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text);");
+
+# Create similar table on subscriber with additional index to disable HOT updates and additional column
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text, i int DEFAULT 0);
+	 CREATE INDEX i_index ON conf_tab(i);");
+
+# Set up extension to simulate race condition
+$node_subscriber->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub FOR TABLE conf_tab");
+
+# Insert row to be updated later
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO conf_tab(a, data) VALUES (1,'frompub')");
+
+# Create the subscription
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub
+	 CONNECTION '$publisher_connstr application_name=$appname'
+	 PUBLICATION tap_pub");
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+############################################
+# Race condition because of DirtySnapshot
+############################################
+
+my $psql_session_subscriber = $node_subscriber->background_psql('postgres');
+if ($simulate_race_condition)
+{
+	$node_subscriber->safe_psql('postgres', "SELECT injection_points_attach('index_getnext_slot_before_fetch_apply_dirty', 'wait')");
+}
+
+my $log_offset = -s $node_subscriber->logfile;
+
+# Update tuple on publisher
+$node_publisher->safe_psql('postgres',
+	"UPDATE conf_tab SET data = 'frompubnew' WHERE (a=1);");
+
+
+if ($simulate_race_condition)
+{
+	# Wait apply worker to start the search for the tuple using index
+	$node_subscriber->wait_for_event('logical replication apply worker', 'index_getnext_slot_before_fetch_apply_dirty');
+}
+
+# Update additional(!) column on the subscriber
+$psql_session_subscriber->query_until(
+	qr/start/, qq[
+	\\echo start
+	UPDATE conf_tab SET i = 1 WHERE (a=1);
+]);
+
+
+if ($simulate_race_condition)
+{
+	# Wake up apply worker
+	$node_subscriber->safe_psql('postgres',"
+		SELECT injection_points_detach('index_getnext_slot_before_fetch_apply_dirty');
+		SELECT injection_points_wakeup('index_getnext_slot_before_fetch_apply_dirty');
+		");
+}
+
+# Tuple was updated - so, we have conflict
+$node_subscriber->wait_for_log(
+	qr/conflict detected on relation \"public.conf_tab\"/,
+	$log_offset);
+
+# We need new column value be synced with subscriber
+is($node_subscriber->safe_psql('postgres', 'SELECT data from conf_tab WHERE a = 1'), 'frompubnew', 'record updated on subscriber');
+# And additional column maintain updated value
+is($node_subscriber->safe_psql('postgres', 'SELECT i from conf_tab WHERE a = 1'), 1, 'column record updated on subscriber');
+
+ok(!$node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation \"public.conf_tab\": conflict=update_missing/,
+		$log_offset), 'invalid conflict detected');
+
+ok($node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation "public.conf_tab": conflict=update_origin_differs/,
+		$log_offset), 'correct conflict detected');
+
+done_testing();
diff --git a/src/test/subscription/t/038_update_missing_with_retain.pl b/src/test/subscription/t/038_update_missing_with_retain.pl
new file mode 100644
index 00000000000..13769aa1c11
--- /dev/null
+++ b/src/test/subscription/t/038_update_missing_with_retain.pl
@@ -0,0 +1,141 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test the conflict detection and resolution in logical replication
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
+
+############################## Set it to 0 to make set success; TODO: delete that for commit
+my $simulate_race_condition = 1;
+##############################
+
+###############################
+# Setup
+###############################
+
+# Initialize publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_publisher->start;
+
+
+# Create subscriber node with track_commit_timestamp enabled
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init;
+$node_subscriber->append_conf('postgresql.conf',
+	qq(track_commit_timestamp = on));
+$node_subscriber->append_conf('postgresql.conf',
+	qq(wal_level = 'replica'));
+$node_subscriber->start;
+
+
+# Check if the extension injection_points is available, as it may be
+# possible that this script is run with installcheck, where the module
+# would not be installed by default.
+if (!$node_subscriber->check_extension('injection_points'))
+{
+	plan skip_all => 'Extension injection_points not installed';
+}
+
+# Create table on publisher
+$node_publisher->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text);");
+
+# Create similar table on subscriber with additional index to disable HOT updates and additional column
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE TABLE conf_tab(a int PRIMARY key, data text, i int DEFAULT 0);
+	 CREATE INDEX i_index ON conf_tab(i);");
+
+# Set up extension to simulate race condition
+$node_subscriber->safe_psql('postgres', 'CREATE EXTENSION injection_points;');
+
+# Setup logical replication
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+$node_publisher->safe_psql('postgres',
+	"CREATE PUBLICATION tap_pub FOR TABLE conf_tab");
+
+# Insert row to be updated later
+$node_publisher->safe_psql('postgres',
+	"INSERT INTO conf_tab(a, data) VALUES (1,'frompub')");
+
+# Create the subscription
+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+	'postgres',
+	"CREATE SUBSCRIPTION tap_sub
+	 CONNECTION '$publisher_connstr application_name=$appname'
+	 PUBLICATION tap_pub WITH (retain_dead_tuples = true)");
+
+# Wait for initial table sync to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+############################################
+# Race condition because of DirtySnapshot
+############################################
+
+my $psql_session_subscriber = $node_subscriber->background_psql('postgres');
+if ($simulate_race_condition)
+{
+	$node_subscriber->safe_psql('postgres', "SELECT injection_points_attach('index_getnext_slot_before_fetch_apply_dirty', 'wait')");
+}
+
+my $log_offset = -s $node_subscriber->logfile;
+
+# Update tuple on publisher
+$node_publisher->safe_psql('postgres',
+	"UPDATE conf_tab SET data = 'frompubnew' WHERE (a=1);");
+
+
+if ($simulate_race_condition)
+{
+	# Wait apply worker to start the search for the tuple using index
+	$node_subscriber->wait_for_event('logical replication apply worker', 'index_getnext_slot_before_fetch_apply_dirty');
+}
+
+# Update additional(!) column on the subscriber
+$psql_session_subscriber->query_until(
+	qr/start/, qq[
+	\\echo start
+	UPDATE conf_tab SET i = 1 WHERE (a=1);
+]);
+
+
+if ($simulate_race_condition)
+{
+	# Wake up apply worker
+	$node_subscriber->safe_psql('postgres',"
+		SELECT injection_points_detach('index_getnext_slot_before_fetch_apply_dirty');
+		SELECT injection_points_wakeup('index_getnext_slot_before_fetch_apply_dirty');
+		");
+}
+
+# Tuple was updated - so, we have conflict
+$node_subscriber->wait_for_log(
+	qr/conflict detected on relation \"public.conf_tab\"/,
+	$log_offset);
+
+# We need new column value be synced with subscriber
+is($node_subscriber->safe_psql('postgres', 'SELECT data from conf_tab WHERE a = 1'), 'frompubnew', 'record updated on subscriber');
+# And additional column maintain updated value
+is($node_subscriber->safe_psql('postgres', 'SELECT i from conf_tab WHERE a = 1'), 1, 'column record updated on subscriber');
+
+ok(!$node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation \"public.conf_tab\": conflict=update_deleted/,
+		$log_offset), 'invalid conflict detected');
+
+ok($node_subscriber->log_contains(
+		qr/LOG:  conflict detected on relation "public.conf_tab": conflict=update_origin_differs/,
+		$log_offset), 'correct conflict detected');
+
+done_testing();
-- 
2.43.0



view thread (37+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: [BUG?] check_exclusion_or_unique_constraint false negative
  In-Reply-To: <CADzfLwVJtBKX32daO20hJ_6_BWZn9cFPtAwzyzNrT2u7YCd6CA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox