From: SATYANARAYANA NARLAPURAM <[email protected]>
To: PostgreSQL Hackers <[email protected]>
To: Álvaro Herrera <[email protected]>
To: Antonin Houska <[email protected]>
Subject: Possible premature SNAPBUILD_CONSISTENT with DB-specific running_xacts
Date: Sun, 19 Apr 2026 11:58:46 -0700
Message-ID: <CAHg+QDcQak4jx_6X2_Ws98rzG=xBARLjqm_=56wTRUtNsY4DZQ@mail.gmail.com>

Hi hackers,
A cluster-wide decoder must never let a database-specific running_xacts
record change its snapshot-builder state. I am adding a check that makes
SnapBuildProcessRunningXacts() return early for such records.

Without the check, I think a cluster-wide decoder can move to
SNAPBUILD_CONSISTENT immediately even though transactions older than
nextXid are still in progress in a different database (and are therefore
not tracked by the running_xacts record). This race is now possible when
cluster-wide decoders and Repack run concurrently.
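
To make the scenario concrete, here is a minimal standalone sketch of the
race. It is not PostgreSQL code: Decoder, RunningXacts, record_is_usable()
and process_running_xacts() are hypothetical, simplified stand-ins, and the
"no running transactions means consistent" rule is reduced to a single
comparison. The with_guard flag only exists so the behaviour with and
without the proposed check can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;    /* stand-in for TransactionId */
typedef uint32_t DbOid;  /* stand-in for Oid */
#define INVALID_DB 0     /* stand-in for InvalidOid */

typedef enum { BUILD_START, BUILD_CONSISTENT } BuildState;

/* Simplified stand-in for xl_running_xacts. */
typedef struct
{
    DbOid dbid;               /* INVALID_DB means a cluster-wide record */
    Xid   oldest_running_xid;
    Xid   next_xid;
} RunningXacts;

/* Simplified stand-in for the snapshot builder of one decoder. */
typedef struct
{
    bool       db_specific;   /* decoder opted out of cluster-wide tracking */
    BuildState state;
} Decoder;

/* The proposed rule: a cluster-wide decoder ignores DB-specific records. */
bool
record_is_usable(const Decoder *d, const RunningXacts *r)
{
    return d->db_specific || r->dbid == INVALID_DB;
}

/*
 * Simplified model of record processing: if the record lists no running
 * transactions (oldest_running_xid == next_xid), jump straight to
 * consistent. with_guard toggles the proposed early return.
 */
void
process_running_xacts(Decoder *d, const RunningXacts *r, bool with_guard)
{
    assert(d != NULL && r != NULL);

    if (with_guard && !record_is_usable(d, r))
        return;

    if (d->state != BUILD_CONSISTENT &&
        r->oldest_running_xid == r->next_xid)
        d->state = BUILD_CONSISTENT;
}
```

In this model, a record for one database that happens to list no running
transactions flips a cluster-wide decoder to consistent when with_guard is
false, even though another database may still have older transactions in
progress. With with_guard set, the decoder's state is left untouched.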
Attached a patch to fix this. Thoughts?
Thanks
Satya
Attachments:
[application/octet-stream] v1-snapbuild-only.patch (1.3K, 3-v1-snapbuild-only.patch)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c8309b9..8953647 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1157,6 +1157,21 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xact
 	ReorderBufferTXN *txn;
 	TransactionId xmin;
 
+	/*
+	 * A database-specific xl_running_xacts record only describes XIDs of
+	 * transactions running in running->dbid; XIDs of transactions in other
+	 * databases (including possibly our own) are missing from xids[] and not
+	 * accounted for in oldestRunningXid/nextXid. Such a record may only be
+	 * consumed by a decoder that itself opted out of cluster-wide tracking
+	 * (db_specific == true). Otherwise we could mark the snapshot
+	 * SNAPBUILD_CONSISTENT while transactions older than running->nextXid are
+	 * still in progress in another database, causing their later commits to
+	 * be silently dropped from the decoded change stream (data loss in
+	 * downstream subscribers).
+	 */
+	if (!db_specific && OidIsValid(running->dbid))
+		return;
+
 	/*
 	 * If we're not consistent yet, inspect the record to see whether it
 	 * allows to get closer to being consistent. If we are consistent, dump