public inbox for [email protected]  
From: Michael Paquier <[email protected]>
To: Andy Fan <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: raise ERROR between EndPrepare and PostPrepare_Locks causes ROLLBACK 2pc PAINC
Date: Wed, 25 Mar 2026 10:50:53 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

On Wed, Mar 25, 2026 at 08:39:07AM +0800, Andy Fan wrote:
> I found a similar but not exactly the same case from 2014 [1], which
> might be helpful to recall a broader understanding of this area.
> 
> [1] https://www.postgresql.org/message-id/534AF601.1030007%40vmware.com

Incorrect shared state when an ERROR happens at an arbitrary location
is usually bad, yes.

For this one, your suggestion of delaying the end of the critical
section started at StartPrepare() and ending in EndPrepare() is not an
acceptable solution as far as I can see, unfortunately: it would mean
doing a SyncRepWaitForLSN() while in a critical section, and I doubt
we'd want to do that.  Anyway, I doubt that this one is worth caring
about.  The current 2PC locking scheme means, as far as I remember,
that it is not really possible for an external command to interact
with a specific session between the EndPrepare() and the
PostPrepare_Locks() calls.

To put it in other words, let's imagine that we use a breakpoint
between these two calls (or a wait injection point if you automate
that).  Is it possible for a second backend to mess with the state of
the first backend while it waits for its locks to be transferred to
the dummy PGPROC entry?  That's what the 2014 thread is about: there
was a race condition reachable between two sessions.  If the answer to
this question is yes, I'd agree that this is something that deserves a
closer look.  And before you ask: attempting to interact with a 2PC
state from a second session while a first session waits between these
two points would not work: the 2PC entry is locked, and that lock is
only cleaned up after EndPrepare() and PostPrepare_Locks(), in
PostPrepare_Twophase().  Trying to request access to this entry fails,
as the first backend is marked as locking it.  A second backend
attempting to lock it would fail, complaining that the 2PC entry with
that GID is "busy".
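
As a rough illustration of the "busy" check described above, here is a
toy model in C.  It is not the twophase.c code: ToyGXact,
toy_lock_gxact() and toy_unlock_gxact() are made-up names, and the
real entry carries much more state.  The only point is that an entry
already marked as locked by one backend rejects a lock attempt coming
from another one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a backend identifier; 0 means "nobody". */
typedef int ToyBackendId;
#define TOY_NO_BACKEND 0

/* Toy 2PC entry: only the fields needed to show the "busy" check. */
typedef struct ToyGXact
{
    const char  *gid;             /* identifier given at PREPARE time */
    ToyBackendId locking_backend; /* backend currently working on it */
} ToyGXact;

typedef enum
{
    TOY_LOCK_OK,
    TOY_LOCK_BUSY,
    TOY_LOCK_NOT_FOUND
} ToyLockResult;

/* Try to take ownership of the entry matching "gid" on behalf of "me". */
static ToyLockResult
toy_lock_gxact(ToyGXact *entries, size_t n, const char *gid, ToyBackendId me)
{
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(entries[i].gid, gid) != 0)
            continue;

        /* Entry already claimed: this is the "busy" failure. */
        if (entries[i].locking_backend != TOY_NO_BACKEND)
            return TOY_LOCK_BUSY;

        entries[i].locking_backend = me;
        return TOY_LOCK_OK;
    }
    return TOY_LOCK_NOT_FOUND;
}

/* Release the entry once the owning backend is done with it. */
static void
toy_unlock_gxact(ToyGXact *entry)
{
    assert(entry != NULL);
    entry->locking_backend = TOY_NO_BACKEND;
}
```

With one entry { "tx1", 1 } held by backend 1, a call such as
toy_lock_gxact(entries, 1, "tx1", 2) returns TOY_LOCK_BUSY until
backend 1 calls toy_unlock_gxact() on it.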

SyncRepWaitForLSN() would be a problematic pattern between the
EndPrepare() and the PostPrepare_Locks(), but we never ERROR there on 
purpose: even if we cancel while waiting for a transaction commit we'd
just get a WARNING, meaning that we'd be able to transfer our locks
anyway.
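
The control flow behind that argument can be sketched with another toy
model in C.  Nothing here is PostgreSQL code: toy_sync_rep_wait() and
toy_prepare_tail() are hypothetical names, and the warning text is
invented.  The sketch assumes only what is stated above, namely that a
cancel during the wait is reported as a WARNING and the function
returns normally, so the step standing in for the lock transfer is
still reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef enum
{
    TOY_WAIT_ACKED,     /* the wait completed normally */
    TOY_WAIT_CANCELED   /* the wait was interrupted by a cancel */
} ToyWaitResult;

/*
 * Toy stand-in for the synchronous replication wait.  A cancel only
 * produces a warning and a normal return, never a non-local exit.
 */
static ToyWaitResult
toy_sync_rep_wait(bool cancel_requested)
{
    if (cancel_requested)
    {
        fprintf(stderr, "WARNING: toy wait canceled\n");
        return TOY_WAIT_CANCELED;
    }
    return TOY_WAIT_ACKED;
}

/*
 * Toy stand-in for the tail of a PREPARE: wait first, then transfer
 * the locks.  Because the wait always returns, the transfer always
 * happens, whatever the wait result is.
 */
static ToyWaitResult
toy_prepare_tail(bool cancel_requested, bool *locks_transferred)
{
    assert(locks_transferred != NULL);

    ToyWaitResult result = toy_sync_rep_wait(cancel_requested);

    *locks_transferred = true;  /* reached in both cases */
    return result;
}
```

In this model, toy_prepare_tail(true, &done) returns
TOY_WAIT_CANCELED and still sets done to true.  An ERROR at the same
place would instead skip the assignment, which is the case being
debated here.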

Or perhaps you have a realistic scenario where it is possible to mess
up the shared state, other than an elog(ERROR) forced between these
two points?
--
Michael

