public inbox for [email protected]  
From: Amit Kapila <[email protected]>
To: Michail Nikolaev <[email protected]>
Cc: Hayato Kuroda (Fujitsu) <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Andres Freund <[email protected]>
Subject: Re: [BUG?] check_exclusion_or_unique_constraint false negative
Date: Fri, 2 Aug 2024 10:26:40 +0530
Message-ID: <CAA4eK1Jfb0xviXYon-_TvHNKeAY7ngAeo++Knu-0RPR6EkSBjA@mail.gmail.com> (raw)
In-Reply-To: <CANtu0og=5v4j8onS4nyJ4zMPdh-EPFxmiEi5PLoyZrmqHA6RKw@mail.gmail.com>
References: <CANtu0oiktqQ2pwExoXqDpByXNCJa-KE5vQRodTRnmFHN_+qwHg@mail.gmail.com>
	<CANtu0ohU2XRV9shtu14CffLPDS1x10q7ebOGf-vX0p+45_L8jw@mail.gmail.com>
	<CANtu0oh0tspW-xWzDGWP9ehz96KPt9aUP1c9JYhdBYxKsB0jpA@mail.gmail.com>
	<CANtu0ohUB9ky45iiMAYN1fGyt82+cg=+UYBom=P7drb+=97G9w@mail.gmail.com>
	<TYAPR01MB56921C9C3D21B0D62FF76330F5B22@TYAPR01MB5692.jpnprd01.prod.outlook.com>
	<CANtu0og=5v4j8onS4nyJ4zMPdh-EPFxmiEi5PLoyZrmqHA6RKw@mail.gmail.com>

On Thu, Aug 1, 2024 at 2:55 PM Michail Nikolaev
<[email protected]> wrote:
>
> > Thanks for pointing out the issue!
>
> Thanks for your attention!
>
> > IIUC, the issue can happen when two concurrent transactions using DirtySnapshot access
> > the same tuples, which is not specific to the parallel apply
>
> Not exactly, it happens for any DirtySnapshot scan over a B-tree index with some other transaction updating the same index page (even using the MVCC snapshot).
>
> So, the logical replication scenario looks like this:
>
> * the subscriber worker receives a tuple update/delete from the publisher
> * it calls RelationFindReplTupleByIndex to find the tuple in the local table
> * some other transaction updates the tuple in the local table (on the subscriber side) in parallel
> * RelationFindReplTupleByIndex may not find the tuple because it uses DirtySnapshot
> * the update/delete is lost
>
> Parallel apply mode looks more dangerous because it uses multiple workers on the subscriber side, so the probability of hitting the issue is higher.
> In that case, "some other transaction" is just another worker applying the changes of a different transaction in parallel.
>

I think this is less likely, or even not possible, in the parallel
apply case, because such conflicting updates (updates on the same
tuple) should be serialized at the publisher itself. So one of the
updates will come after the commit of the transaction that contains
the other update.

I haven't tried a test based on your description of the general
problem with DirtySnapshot scans, so I could be wrong. In the case of
logical replication, we will LOG an update_missing conflict, and the
user may need to take some manual action based on that. I am not sure
we can do anything specific to logical replication for this, but feel
free to suggest ideas for solving this problem, either in general or
specifically for logical replication.
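
To make the general scenario quoted above concrete, here is a minimal
toy model in Python. It is not PostgreSQL code and not a test run
against a server. The names ToyTable, read_index_entries and
fetch_first_visible are hypothetical, and the two-step scan (collect
matching index entries first, check heap visibility afterwards) is a
simplifying assumption about how the race could arise; the real
interleaving inside the B-tree code may differ. Aborted transactions
are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

IN_PROGRESS, COMMITTED = "in_progress", "committed"


@dataclass
class HeapTuple:
    key: int
    value: str
    xmin: int                   # id of the inserting transaction
    xmax: Optional[int] = None  # id of the updating/deleting transaction


class ToyTable:
    """Toy heap plus index; an assumption-laden model, not PostgreSQL."""

    def __init__(self) -> None:
        self.heap: List[HeapTuple] = []  # list position plays the role of a TID
        self.index: List[tuple] = []     # (key, tid) entries
        self.xact_status = {}            # xid -> IN_PROGRESS / COMMITTED

    def begin(self, xid: int) -> None:
        self.xact_status[xid] = IN_PROGRESS

    def commit(self, xid: int) -> None:
        self.xact_status[xid] = COMMITTED

    def insert(self, xid: int, key: int, value: str) -> int:
        self.heap.append(HeapTuple(key, value, xmin=xid))
        tid = len(self.heap) - 1
        self.index.append((key, tid))
        return tid

    def update(self, xid: int, tid: int, value: str) -> int:
        # Non-HOT style update: mark the old version, add a new index entry.
        old = self.heap[tid]
        old.xmax = xid
        return self.insert(xid, old.key, value)

    def dirty_visible(self, tup: HeapTuple) -> bool:
        # Dirty-snapshot-like rule in this model: uncommitted inserts are
        # visible; a version is invisible once its updater has committed.
        if tup.xmax is not None and self.xact_status[tup.xmax] == COMMITTED:
            return False
        return True

    def read_index_entries(self, key: int) -> List[int]:
        # Step 1 of the scan: remember the matching TIDs seen right now.
        return [tid for k, tid in self.index if k == key]

    def fetch_first_visible(self, tids: List[int]) -> Optional[HeapTuple]:
        # Step 2 of the scan: visit the heap for the remembered TIDs only.
        for tid in tids:
            if self.dirty_visible(self.heap[tid]):
                return self.heap[tid]
        return None


def run_with_race() -> Optional[HeapTuple]:
    t = ToyTable()
    t.begin(1); tid = t.insert(1, 42, "v1"); t.commit(1)
    candidates = t.read_index_entries(42)             # scan sees only the old entry
    t.begin(2); t.update(2, tid, "v2"); t.commit(2)   # concurrent update commits
    return t.fetch_first_visible(candidates)          # old version is now dead


def run_without_race() -> Optional[HeapTuple]:
    t = ToyTable()
    t.begin(1); tid = t.insert(1, 42, "v1"); t.commit(1)
    t.begin(2); t.update(2, tid, "v2"); t.commit(2)
    return t.fetch_first_visible(t.read_index_entries(42))
```

In this model run_with_race() returns None although a live row with
key 42 exists the whole time, while run_without_race() returns the
"v2" version. That mirrors the "may not find the tuple" step in the
list above, under the stated assumptions only.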

-- 
With Regards,
Amit Kapila.





