public inbox for [email protected]  
help / color / mirror / Atom feed
pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib
4+ messages / 2 participants
[nested] [flat]

* pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib
@ 2024-02-20 04:44  Michael Paquier <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Michael Paquier @ 2024-02-20 04:44 UTC (permalink / raw)
  To: [email protected]

Fix race leading to incorrect conflict cause in InvalidatePossiblyObsoleteSlot()

The invalidation of an active slot is done in two steps:
- Termination of the backend holding it, if any.
- Report that the slot is obsolete, with a conflict cause depending on
the slot's data.

This can be racy because between these two steps the slot mutex would be
released while doing system calls, which means that the effective_xmin
and effective_catalog_xmin could advance during that time, detecting a
conflict cause different than the one originally wanted before the
process owning a slot is terminated.

Holding the mutex longer is not an option, so this commit changes the
code to record the LSNs stored in the slot during the termination of the
process owning the slot.

Bonus thanks to Alexander Lakhin for the various tests and the analysis.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/818fefd8fd4412d45eb542155cb2833a2b864acc

Modified Files
--------------
src/backend/replication/slot.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib
@ 2024-02-20 04:44  Michael Paquier <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Michael Paquier @ 2024-02-20 04:44 UTC (permalink / raw)
  To: [email protected]

Fix race leading to incorrect conflict cause in InvalidatePossiblyObsoleteSlot()

The invalidation of an active slot is done in two steps:
- Termination of the backend holding it, if any.
- Report that the slot is obsolete, with a conflict cause depending on
the slot's data.

This can be racy because between these two steps the slot mutex would be
released while doing system calls, which means that the effective_xmin
and effective_catalog_xmin could advance during that time, detecting a
conflict cause different than the one originally wanted before the
process owning a slot is terminated.

Holding the mutex longer is not an option, so this commit changes the
code to record the LSNs stored in the slot during the termination of the
process owning the slot.

Bonus thanks to Alexander Lakhin for the various tests and the analysis.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16

Branch
------
REL_16_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/59cea09f03a56a40bce70a7461226c4d45740d02

Modified Files
--------------
src/backend/replication/slot.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib
@ 2024-04-11 07:08  Bharath Rupireddy <[email protected]>
  parent: Michael Paquier <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Bharath Rupireddy @ 2024-04-11 07:08 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: [email protected]

On Tue, Feb 20, 2024 at 10:14 AM Michael Paquier <[email protected]> wrote:
>
> Fix race leading to incorrect conflict cause in InvalidatePossiblyObsoleteSlot()

I found a typo with the code added by this commit - we've used
XLogRecPtr/InvalidXLogRecPtr for xmins in place of
TransactionId/InvalidTransactionId. Attached a patch to fix this.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


Attachments:

  [application/octet-stream] v1-0001-Use-correct-datatype-for-xmin-variables-added-by-.patch (1.1K, 2-v1-0001-Use-correct-datatype-for-xmin-variables-added-by-.patch)
  download | inline diff:
From 73f85cc634d85e82dc6cd38945c492fb0389fe94 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Thu, 11 Apr 2024 07:04:28 +0000
Subject: [PATCH v1] Use correct datatype for xmin variables added by 818fefd8

---
 src/backend/replication/slot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 3bddaae022..cebf44bb0f 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -1545,8 +1545,8 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause,
 	int			last_signaled_pid = 0;
 	bool		released_lock = false;
 	bool		terminated = false;
-	XLogRecPtr	initial_effective_xmin = InvalidXLogRecPtr;
-	XLogRecPtr	initial_catalog_effective_xmin = InvalidXLogRecPtr;
+	TransactionId initial_effective_xmin = InvalidTransactionId;
+	TransactionId initial_catalog_effective_xmin = InvalidTransactionId;
 	XLogRecPtr	initial_restart_lsn = InvalidXLogRecPtr;
 	ReplicationSlotInvalidationCause invalidation_cause_prev PG_USED_FOR_ASSERTS_ONLY = RS_INVAL_NONE;
 
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 4+ messages in thread

* Re: pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib
@ 2024-04-11 07:39  Michael Paquier <[email protected]>
  parent: Bharath Rupireddy <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Michael Paquier @ 2024-04-11 07:39 UTC (permalink / raw)
  To: Bharath Rupireddy <[email protected]>; +Cc: [email protected]

On Thu, Apr 11, 2024 at 12:38:06PM +0530, Bharath Rupireddy wrote:
> I found a typo with the code added by this commit - we've used
> XLogRecPtr/InvalidXLogRecPtr for xmins in place of
> TransactionId/InvalidTransactionId. Attached a patch to fix this.

Thanks for the report.  Will fix.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 4+ messages in thread


end of thread, other threads:[~2024-04-11 07:39 UTC | newest]

Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-02-20 04:44 pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib Michael Paquier <[email protected]>
2024-04-11 07:08 ` Bharath Rupireddy <[email protected]>
2024-04-11 07:39   ` Michael Paquier <[email protected]>
2024-02-20 04:44 pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib Michael Paquier <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox