Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sjLNi-00HTnZ-P3 for pgsql-general@arkaria.postgresql.org; Wed, 28 Aug 2024 16:18:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sjLNg-007UOX-SR for pgsql-general@arkaria.postgresql.org; Wed, 28 Aug 2024 16:18:49 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sjLNg-007UMy-Dq for pgsql-general@lists.postgresql.org; Wed, 28 Aug 2024 16:18:49 +0000 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sjLNd-001wc0-AX for pgsql-general@lists.postgresql.org; Wed, 28 Aug 2024 16:18:48 +0000 Received: by mail-wm1-x335.google.com with SMTP id 5b1f17b1804b1-428243f928cso60915715e9.3 for ; Wed, 28 Aug 2024 09:18:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724861926; x=1725466726; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=1MYIJPv6DbnelWY6L6mt9GSOoY9i9Nc4S5p84+gqwqc=; b=QWc5bi5q6YhYv8TDU/wc6vPTONuJm7KBEk7xLCgQ335NI53+Lm9LNuCDEaMD25SMgQ Zt/y0+0mOct0AS0yF5jvtRih882UzmvG4WmlFSpqygOgRNJTkLmRYALtlRY1tZoIeeM8 zxipi8SS07YiujEA3A0Z1C1bmq3blifjlh67TeewRxV+rc0NHPtEhOrjZ5oV7B5JQU8U YLATehmDnleBCGkiGZ9LOa85Band8768RP6nv8vjIGU4FPgQgZabwYK8Fdw0iCmACRnR +t0CqRHsEPk/+R2HhhKw72TBlmBIkfIWkyx34J28SegoZwK8cIqhseCkaiQw4u1APqpn 7gHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724861926; x=1725466726; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=1MYIJPv6DbnelWY6L6mt9GSOoY9i9Nc4S5p84+gqwqc=; b=suWWYECF1bAP5TnML4j92qWzRM+xvWg5qANcqmpoHUB4UPc1Nv+eCNoliiHmwxBPCO BoLf0+e850sjILGEhe0rBppLOHUT6LoIeI6Sg7TJ9bfZElQVGTremV0ipB47pDt6j7a/ FfrgZ0YKzBo9Etdzd0Lj+XZw16vjwPLYrSXwU10nTS48wso4uajP+Uv1rjATilFEos0B eY1QozfHP9lvLfnbuGkJCKwmWs/zQE5ceu7cMIDVbx5AnO5+tUkjeqUrESKBXPs6kY6b ENb/7Fab4hpQn5Fn3rTDyIvlf2gM5ttHy20AWurMg4lfh5ky2rOesIwXSLWQBt0ReuJW kpXA== X-Gm-Message-State: AOJu0YxK8uhLKf6flW6jD4RFlF05PAHISh06RePfcoOiQGIHDoLr5vaF d+f0m1Pgd6pp76FWUOJnKKUkUmVjqU3PvLzi3tm2+pkANl9SHj9xD0eIyKiOUpCOlOz3I1s8FbM /5sHhbYQi2qaOWHvyf1iXYED7EP2BoqXc X-Google-Smtp-Source: AGHT+IGKCM++chm4Nr/4x4wmwT0AhDl6SuO2pH8btDet6zsHWW/PBHYtM1MniqCJ+EKr1ED/FfhWW6PSgxiD8gZNPWw= X-Received: by 2002:a05:6000:18a6:b0:368:2f01:307a with SMTP id ffacd0b85a97d-3749b57c76emr71772f8f.46.1724861924934; Wed, 28 Aug 2024 09:18:44 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Jacob Biesinger Date: Wed, 28 Aug 2024 09:18:33 -0700 Message-ID: Subject: Re: Ghost data from failed FDW transactions? To: Greg Sabino Mullane Cc: pgsql-general@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000528dc40620c0b6c6" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000528dc40620c0b6c6 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Aug 28, 2024 at 5:39=E2=80=AFAM Greg Sabino Mullane wrote: > On Tue, Aug 27, 2024 at 9:03=E2=80=AFPM Jacob Biesinger > wrote: > >> I'm scratching my head at a few rows in the root DB, where it seems the >> corresponding tenant transaction rolled back, but the root DB transactio= n >> committed >> > ... > >> Before I jump into particulars, does this sound like expected behavior? >> > > No, it sounds like something is going wrong. Your setup as described > should work to keep both sides in sync. > Well, that's a plus. At least we're (probably) not using it wrong :) There aren't many details in the docs around failure modes... is there anything there that could cause this issue? For example, if there were some temporary network outage that prevented communication between the two DBs, would the FDW fail gracefully? Or if there were some disk failure or something after the FDW signals to the remote (root) DB to commit but then the local (tenant) DB failed to commit? We've had a few outages over the years where we hit the `max_connections` setting on the cluster (which would affect both the root + tenant DBs), but connections are held for the duration of both local + remote txns, so doesn't seem like that would affect this. We don't use pgBouncer, either on the client -> DB or as an in-between on the DB -> DB FDW side. > > Through the magic of postgres_fdw, row triggers, and distributed >> transactions, >> > > Can you expand on "distributed transactions" here? > I just mean "using the FDW as described". It is magic that we can get proper transactional + ACID semantics in a distributed system. The FDW really helps our use-case. But to go deeper, we use the javascript knex adapter and some application-level transaction management that automatically retries a transaction N times when it encounters serialization errors. On this particular endpoint, the emitted SQL for the full transaction looks something like: BEGIN; INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data") VALUES ( 'org1', 'patient1', 'device1', '{"id": "device1", "patientId": "patient1", "serialNumber": "12345", "status": "active" }' ); INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data") VALUES ( 'org1', 'patient1', 'device2', '{"id": "device2", "patientId": "patient1", "serialNumber": "67890", "status": "active" }' ); SELECT * FROM "rootDb"."assets"; -- execute some logic client-side, nothing touching the DB UPDATE "rootDb"."assets" WHERE ...; COMMIT; With I guess the maybe-relevant bits here being that we do some additional reading + writing to the remote / root DB (other tables) subsequently as part of the same txn. The JS driving this also has the unfortunate shortcoming that the two `INSERT` statements run "in parallel", meaning there's a race to execute them (serially) through their shared txn/connection. The ordering shouldn't matter, but this also means that error handling (e.g., when there is a conflict with the `EXCLUDE "serialNumber"` constraint) may not stop the second `INSERT` statement from being attempted (and rejected by postgres). But I think that's all client-side details that shouldn't affect the FDW txn semantics, right? -- Jake Biesinger On Wed, Aug 28, 2024 at 5:39=E2=80=AFAM Greg Sabino Mullane wrote: > On Tue, Aug 27, 2024 at 9:03=E2=80=AFPM Jacob Biesinger > wrote: > >> I'm scratching my head at a few rows in the root DB, where it seems the >> corresponding tenant transaction rolled back, but the root DB transactio= n >> committed >> > ... > >> Before I jump into particulars, does this sound like expected behavior? >> > > No, it sounds like something is going wrong. Your setup as described > should work to keep both sides in sync. > > Through the magic of postgres_fdw, row triggers, and distributed >> transactions, >> > > Can you expand on "distributed transactions" here? > > Cheers, > Greg > > --000000000000528dc40620c0b6c6 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Wed, Aug 28, 2024 at 5:39=E2=80= =AFAM Greg Sabino Mullane <htamfid= s@gmail.com> wrote:




But to go deeper, we use the javascript knex adapter and some applicatio= n-level transaction management that automatically retries a transaction N t= imes when it encounters serialization errors. On this particular endpoint, = the emitted SQL for the full transaction looks something like:
=

With I guess the maybe-relevant bits here being that we= do some additional reading=C2=A0+ writing to the remote / root DB (other t= ables) subsequently as part of the same txn. The JS driving this also has t= he unfortunate shortcoming that the two `INSERT` statements run "in pa= rallel", meaning there's a race to execute them (serially) through= their shared txn/connection. The ordering shouldn't matter, but this a= lso means that=C2=A0error handling (e.g., when there is a conflict with the= `EXCLUDE "serialNumber"` constraint) may not stop the second `IN= SERT` statement from being attempted (and rejected by postgres). But I thin= k that's all client-side details that shouldn't affect the FDW txn = semantics, right?


--
Jake Biesinger
=C2=A0
On T= ue, Aug 27, 2024 at 9:03=E2=80=AFPM Jacob Biesinger <jake.biesinger@gmail.com>= wrote:
I'm scratching my head at a fe= w rows in the root DB, where it seems the corresponding tenant transaction = rolled back, but the root DB transaction committed
...=C2=A0
<= div dir=3D"ltr">
Before I jump into particulars, does this sound like e= xpected behavior?

No, it sounds= like something is going wrong. Your setup as described should work to keep= both sides in sync.

Through the magic of postgres_fdw, ro= w triggers, and distributed transactions,

=
Can you expand on "distributed transactions" here?

Cheers,
Greg

--000000000000528dc40620c0b6c6--