Re: Improve conflict detection when replication origins are reused

public inbox for [email protected]  

4+ messages / 2 participants

* Re: Improve conflict detection when replication origins are reused
@ 2026-05-19 03:19 ` shveta malik <[email protected]>
  1 sibling, 0 replies; 4+ messages in thread

From: shveta malik @ 2026-05-19 03:19 UTC (permalink / raw)
  To: Nisha Moond <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>

On Fri, May 15, 2026 at 4:45 PM Nisha Moond <[email protected]> wrote:
>
> On Fri, May 15, 2026 at 3:27 PM shveta malik <[email protected]> wrote:
> >
> > Nisha, I think we will get the same problem in another scenario too:
> >
> > create pub1-server1
> > create pub1-server2
> > create sub1-server3; subscribing to pub1-server1
> >
> > --On both server1 and server2, insert the same set of rows:
> > insert into tab1 values (10), (20), (30);
> >
> > Sub1 (server3) will get the rows from server1.
> > Now alter sub1 to connect to server2 (you will have to create slot
> > manually on server2)
> > SELECT pg_create_logical_replication_slot('sub1', 'pgoutput', false,
> > false, false);
> >
> >
> > --Now perform the update on server2:
> > update tab1 set i=11 where i=10;
> >
> > The subscriber on server3 will receive the update from server2 and will
> > update the row originally inserted by server1 without raising
> > update_origin_differs.
> >
> > Can you please confirm if my understanding of the problem statement is
> > correct and if the scenario above will also result in a similar
> > situation? IIUC, in such a case, the proposed solutions may not work
> > directly and will need to be further evolved. I will think more once
> > you confirm my understanding.
> >
>
> I agree that the above scenario will not raise a conflict, and I think
> that is expected with the current replication model, which tracks
> which subscription stream applied a row, not which publisher server it
> originally came from.
>
> With the existing replication model, we can also see the opposite
> scenario of what you mentioned: if two subscriptions replicate the
> same table from the same publisher, update_origin_differs conflicts
> can still be raised even though both changes come from the same
> source. This again shows that origin identity today is effectively
> tied to the subscription stream, not the publisher server.

Yes, I agree. Thanks for the details.
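
To make the two scenarios above concrete, here is a minimal sketch of subscription-level origin tracking. It is a toy model, not PostgreSQL code: the `Subscription` and `LocalRow` structs and `origin_differs()` are hypothetical names introduced only for illustration, and the `publisher_server` field exists purely to show that the check never looks at it.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t OriginId;          /* stand-in for the 2-byte origin ID */

/* Hypothetical model of a subscription: its own origin plus the server it pulls from. */
typedef struct Subscription
{
    OriginId    origin;             /* assigned when the subscription is created */
    int         publisher_server;   /* illustrative only; never consulted by the check */
} Subscription;

/* Hypothetical model of a local row: only the origin that last wrote it is remembered. */
typedef struct LocalRow
{
    OriginId    written_by;
} LocalRow;

/* The check compares origins only; the publisher never enters the picture. */
static bool
origin_differs(const LocalRow *row, const Subscription *applying)
{
    return row->written_by != applying->origin;
}
```

In this model, repointing sub1 from server1 to server2 leaves its origin unchanged, so a row it wrote earlier never reports a differing origin. Conversely, two subscriptions to the same publisher carry different origins, so each sees the other's rows as foreign.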

> If we want conflict detection based on publisher identity, that would
> require a different model altogether, closer to systems like
> BDR/pglogical, which track global node identities across the
> replication chain.
>
> So for now, I think the above scenario is outside the scope of the
> current subscription-level origin tracking design.
>

Yes, looks like so.

thanks
Shveta







* Re: Improve conflict detection when replication origins are reused
@ 2026-05-19 09:22 ` shveta malik <[email protected]>
  2026-05-19 13:38   ` Re: Improve conflict detection when replication origins are reused Nisha Moond <[email protected]>
  1 sibling, 1 reply; 4+ messages in thread

From: shveta malik @ 2026-05-19 09:22 UTC (permalink / raw)
  To: Nisha Moond <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; shveta malik <[email protected]>

On Thu, May 14, 2026 at 8:35 AM Nisha Moond <[email protected]> wrote:
>
> Hi hackers,
>
> While reviewing the issue reported at [1] and the proposed solutions
> at [2], I noticed a related problem: false negative conflict detection
> when a 'ReplOriginId' gets reused.
>
> In logical replication, conflict detection relies on the tuple’s
> replication origin ('roident'). The problem is that if a subscription
> is dropped and a new subscription later reuses the same origin ID, the
> apply worker may incorrectly treat incoming changes as “its own”
> changes and skip conflict detection.
>
> A simple example:
>   1. Create subscription 'sub1' with 'roident = 1'
>   2. Replicate some rows into table 't1'
>   3. Drop 'sub1'
>   4. Create another subscription 'sub2'
>   5. 'sub2' reuses 'roident = 1'
>   6. New updates arrive for rows previously written by 'sub1'
>   At this point, conflict detection sees:
>       tuple_origin == current_origin
>
> and incorrectly assumes the row was written by the current
> subscription instance, so no 'update_origin_differs' conflict is
> raised.
>
> This may look harmless in this simple setup, but it becomes
> problematic if the new subscription is connected to a different
> publisher, because real conflicts can then be silently missed.
>
> I explored two possible approaches to solve this:
>
> Approach 1. Zero out old origin IDs in commit_ts data when dropping a
> subscription
> ----------------------
>  - When a subscription is dropped and its replication origin becomes
> free, scan all 'commit_ts' SLRU entries and replace that old origin ID
> with 'InvalidRepOriginId (0)'.
>  - So rows previously written by the old subscription would no longer
> appear to belong to any active replication origin.
>  - A new subscription reusing the same 'roident' will always conflict
> with origin '0'.
>
> Pros:
>  - Fixes the stale-origin problem completely and may also help solve
> the tablesync-origin issue discussed in [1]
>  - No additional checks needed during conflict detection
>
> Cons:
>  - Requires scanning the entire 'commit_ts' SLRU during DROP
> SUBSCRIPTION, so it can become very expensive on large systems
>  - Not crash-safe in the current patch:
>     - if the server crashes midway, some entries may still contain the
> old origin ID
>     - after restart, reused origins can again lead to missed conflicts
>  - Making this fully crash-safe would likely require WAL logging or
> recovery-time reprocessing.
>
> Approach 2. Store replication origin creation time
> ----------------------
>  - Add a creation timestamp for each replication origin
>  - During conflict check:
>     if tuple_origin != current_origin
>         -> existing behavior
>     if tuple_origin == current_origin
>         -> compare tuple commit timestamp with origin creation time
>         if tuple_commit_ts <= origin_creation_time
>             -> treat as an origin reuse case and raise conflict
>
> Pros:
> -------
>  - No additional processing during DROP SUBSCRIPTION
>  - Lightweight runtime check (just one timestamp comparison)
>  - Naturally crash-safe since origin creation is WAL-logged already
>
> Cons:
>  - Requires a catalog schema change
>  - The <= comparison can produce false-positive conflicts for rows
> committed at the exact same microsecond as origin creation
>  - May require additional handling for upgraded origins
>
> IMO, the second approach currently looks more practical because it
> avoids the heavy SLRU scan and crash-recovery complexity.
>
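
The reuse problem and the Approach 2 check described above can be sketched as follows. This is a toy model under stated assumptions, not the proposed patch: `OriginSlot`, `origin_create()`, `origin_drop()` and `origin_conflict()` are hypothetical names, the table is tiny, and the allocator is assumed to hand out the lowest free ID, which is what makes reuse after a drop so easy to hit.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t OriginId;
typedef int64_t  Timestamp;         /* microseconds, in the spirit of TimestampTz */

#define MAX_ORIGINS 8               /* tiny table, for illustration only */

typedef struct OriginSlot
{
    bool        in_use;
    Timestamp   created_at;         /* the proposed new field (assumed name) */
} OriginSlot;

/* IDs start at 1; 0 is the invalid origin. Returns the lowest free ID, or 0 if full. */
static OriginId
origin_create(OriginSlot *slots, Timestamp now)
{
    for (OriginId id = 1; id < MAX_ORIGINS; id++)
    {
        if (!slots[id].in_use)
        {
            slots[id].in_use = true;
            slots[id].created_at = now;
            return id;
        }
    }
    return 0;
}

static void
origin_drop(OriginSlot *slots, OriginId id)
{
    slots[id].in_use = false;
}

/* Returns true when an origin-differs conflict should be raised. */
static bool
origin_conflict(const OriginSlot *slots, OriginId tuple_origin,
                Timestamp tuple_commit_ts, OriginId current_origin)
{
    if (tuple_origin != current_origin)
        return true;                /* existing behavior: origins differ */

    /* Same ID: a tuple committed at or before the origin's creation predates it. */
    return tuple_commit_ts <= slots[current_origin].created_at;
}
```

With sub1 created at time 100 and dropped, and sub2 created at time 500, both end up with ID 1 in this model. A row committed at time 200 is then flagged as a reuse case, while a row committed at time 600 is treated as sub2's own. A row committed at exactly 500 is also flagged, which is the false-positive edge listed under Cons.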

I find Approach 2 the most practical. I explored other ideas but none
seem completely reliable or worth the effort to justify this use-case.
A few ideas I considered are:

1) We could modify replorigin_create to exhaust the full range of IDs
sequentially before reusing them. But this is not a reliable solution.
It would make the bug much harder to hit, but a busy system could
still eventually exhaust the 2-byte limit of 65K IDs, after which the
problem may reappear.

2) Using LSN matching instead of a timestamp. To completely eliminate
the edge case where a timestamp comparison yields a false positive, we
could track the origin_creation_lsn and compare it against the tuple's
commit LSN. IIUC, it would require extending commit_ts to include
8 bytes of commit LSN, which might not be a good idea. So this idea may
also not be desirable unless there is an existing way to extract the
commit LSN (which I am not aware of) without extending the commit_ts
structure?
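
As a rough illustration of idea 2 and its storage cost, the sketch below models a commit_ts entry with and without a commit LSN. The struct and macro names are hypothetical, the layout (an 8-byte timestamp followed by a 2-byte origin) is an assumption about the current entry rather than a copy of the real definition, and `origin_conflict_lsn()` is an illustrative helper, not an existing function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t OriginId;
typedef int64_t  Timestamp;
typedef uint64_t Lsn;               /* stand-in for XLogRecPtr */

/* Assumed shape of an entry today: timestamp plus origin. */
typedef struct CommitTsEntry
{
    Timestamp   time;
    OriginId    origin;
} CommitTsEntry;

/* Hypothetical extended entry that also remembers the commit LSN. */
typedef struct CommitTsEntryWithLsn
{
    Timestamp   time;
    Lsn         commit_lsn;
    OriginId    origin;
} CommitTsEntryWithLsn;

/* Packed per-transaction sizes, ignoring trailing struct padding. */
#define ENTRY_SIZE          (offsetof(CommitTsEntry, origin) + sizeof(OriginId))
#define ENTRY_SIZE_WITH_LSN (offsetof(CommitTsEntryWithLsn, origin) + sizeof(OriginId))

/* Returns true when an origin-differs conflict should be raised. */
static bool
origin_conflict_lsn(OriginId tuple_origin, Lsn tuple_commit_lsn,
                    OriginId current_origin, Lsn origin_creation_lsn)
{
    if (tuple_origin != current_origin)
        return true;                /* existing behavior: origins differ */

    /* Two WAL records cannot share an LSN, so a strict comparison is enough. */
    return tuple_commit_lsn < origin_creation_lsn;
}
```

In this model the packed entry grows from 10 to 18 bytes per transaction, which is the cost that makes the idea unattractive. In exchange, the strict comparison removes the same-microsecond ambiguity of the timestamp check.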

thanks
Shveta







* Re: Improve conflict detection when replication origins are reused
  2026-05-19 09:22 ` Re: Improve conflict detection when replication origins are reused shveta malik <[email protected]>
@ 2026-05-19 13:38   ` Nisha Moond <[email protected]>
  2026-05-20 04:07     ` Re: Improve conflict detection when replication origins are reused shveta malik <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Nisha Moond @ 2026-05-19 13:38 UTC (permalink / raw)
  To: shveta malik <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; shveta malik <[email protected]>

On Tue, May 19, 2026 at 2:52 PM shveta malik <[email protected]> wrote:
>
> I find Approach 2 the most practical. I explored other ideas but none
> seem completely reliable or worth the effort to justify this use-case.
> A few ideas I considered are:
>
> 1) We could modify replorigin_create to exhaust the full range of IDs
> sequentially before reusing them. But this is not a reliable solution.
> It would make the bug much harder to hit, but a busy system could
> still eventually exhaust the 2-byte limit of 65K IDs, after which the
> problem may reappear.
>
> 2) Using LSN matching instead of a timestamp. To completely eliminate
> the edge case where a timestamp comparison yields a false positive, we
> could track the origin_creation_lsn and compare it against the tuple's
> commit LSN. IIUC, it would require extending commit_ts to include
> 8 bytes of commit LSN, which might not be a good idea. So this idea may
> also not be desirable unless there is an existing way to extract the
> commit LSN (which I am not aware of) without extending the commit_ts
> structure?
>

Using LSN is a good idea. I looked through the code a bit, and
extending `commit_ts` seems like the only option. I also could not
find anything existing from which we can extract the commit LSN of a
tuple while applying a change.
Every heap page has pd_lsn (accessible via PageGetLSN(page)), which
stores the LSN of the most recent WAL record that modified the page.
But this doesn't help, as there is no correlation to a specific
tuple's xmin.
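
A small sketch of why pd_lsn does not help. This is a toy model, not the real page layout: `ModelPage` and `page_modify()` are hypothetical, and the per-tuple commit LSN array is exactly the information a real heap page does not store; it is kept here only so the two values can be compared.

```c
#include <stdint.h>

typedef uint64_t Lsn;

#define TUPLES_PER_PAGE 4

typedef struct ModelPage
{
    Lsn         pd_lsn;             /* LSN of the last WAL record that touched the page */
    /* NOT present on real pages; tracked here only for comparison. */
    Lsn         tuple_commit_lsn[TUPLES_PER_PAGE];
} ModelPage;

/*
 * Apply a change to one tuple. record_lsn is the LSN of the WAL record that
 * modified the page; commit_lsn is the LSN of the transaction's later commit
 * record, which does not touch the page at all.
 */
static void
page_modify(ModelPage *page, int slot, Lsn record_lsn, Lsn commit_lsn)
{
    page->tuple_commit_lsn[slot] = commit_lsn;
    if (record_lsn > page->pd_lsn)
        page->pd_lsn = record_lsn;  /* any change to any tuple advances pd_lsn */
}
```

After two unrelated transactions modify different tuples on the same page, pd_lsn reflects only the most recent page-level record, so it says nothing about when the older tuple's transaction committed.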

--
Thanks,
Nisha







* Re: Improve conflict detection when replication origins are reused
  2026-05-19 09:22 ` Re: Improve conflict detection when replication origins are reused shveta malik <[email protected]>
  2026-05-19 13:38   ` Re: Improve conflict detection when replication origins are reused Nisha Moond <[email protected]>
@ 2026-05-20 04:07     ` shveta malik <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: shveta malik @ 2026-05-20 04:07 UTC (permalink / raw)
  To: Nisha Moond <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>; shveta malik <[email protected]>

On Tue, May 19, 2026 at 7:08 PM Nisha Moond <[email protected]> wrote:
>
> On Tue, May 19, 2026 at 2:52 PM shveta malik <[email protected]> wrote:
> >
> > I find Approach 2 the most practical. I explored other ideas but none
> > seem completely reliable or worth the effort to justify this use-case.
> > A few ideas I considered are:
> >
> > 1) We could modify replorigin_create to exhaust the full range of IDs
> > sequentially before reusing them. But this is not a reliable solution.
> > It would make the bug much harder to hit, but a busy system could
> > still eventually exhaust the 2-byte limit of 65K IDs, after which the
> > problem may reappear.
> >
> > 2) Using LSN matching instead of a timestamp. To completely eliminate
> > the edge case where a timestamp comparison yields a false positive, we
> > could track the origin_creation_lsn and compare it against the tuple's
> > commit LSN. IIUC, it would require extending commit_ts to include
> > 8 bytes of commit LSN, which might not be a good idea. So this idea may
> > also not be desirable unless there is an existing way to extract the
> > commit LSN (which I am not aware of) without extending the commit_ts
> > structure?
> >
>
> Using LSN is a good idea. I looked through the code a bit, and
> extending `commit_ts` seems like the only option. I also could not
> find anything existing from which we can extract the commit LSN of a
> tuple while applying a change.
> Every heap page has pd_lsn (accessible via PageGetLSN(page)), which
> stores the LSN of the most recent WAL record that modified the page.
> But this doesn't help, as there is no correlation to a specific
> tuple's xmin.
>

I could not find any existing way to get the commit LSN either. We have
TransactionIdGetCommitLSN(), but it does not return the exact commit LSN.
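
One way to picture why such a lookup is inexact is the sketch below. It assumes, without confirmation from this thread, that clog remembers a single LSN per group of neighbouring transaction IDs rather than one per transaction; this should be verified against clog.c before relying on it. The group size, array size and function names are all hypothetical.

```c
#include <stdint.h>

typedef uint32_t Xid;
typedef uint64_t Lsn;

#define XACTS_PER_LSN_GROUP 32      /* assumed group size for this model */
#define NUM_GROUPS 4                /* tiny array, for illustration only */

/* Remember the highest commit LSN seen for the group that xid falls into. */
static void
record_commit(Lsn *group_lsn, Xid xid, Lsn commit_lsn)
{
    Lsn *slot = &group_lsn[(xid / XACTS_PER_LSN_GROUP) % NUM_GROUPS];

    if (*slot < commit_lsn)
        *slot = commit_lsn;
}

/* Returns the group's LSN: an upper bound for xid's commit LSN, not the exact value. */
static Lsn
get_commit_lsn(const Lsn *group_lsn, Xid xid)
{
    return group_lsn[(xid / XACTS_PER_LSN_GROUP) % NUM_GROUPS];
}
```

In this model, if xid 40 commits at LSN 1000 and xid 41 commits at LSN 2000, a lookup for xid 40 returns 2000. That is safe as a flush target but useless for ordering a tuple against an origin's creation LSN.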

thanks
Shveta






