public inbox for [email protected]
help / color / mirror / Atom feedGhost data from failed FDW transactions?
6+ messages / 3 participants
[nested] [flat]
* Ghost data from failed FDW transactions?
@ 2024-08-28 01:03 Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Jacob Biesinger @ 2024-08-28 01:03 UTC (permalink / raw)
To: [email protected]
Hi there!
We have a setup where, for compliance reasons, we hoist a portion of data
from several "tenant" databases into a "root" / common / untenanted DB.
Through the magic of postgres_fdw, row triggers, and distributed
transactions, we automatically hoist the needed columns into the untenanted
DB whenever there are changes in any of the tenant DBs. The root DB is
never written to outside of these triggers, and act as a sort of
cross-tenant index.
I'm scratching my head at a few rows in the root DB, where it seems the
corresponding tenant transaction rolled back, but the root DB transaction
committed -- there is no row in the tenant but the root DOES have one. I
don't have a smoking gun just yet, but this is not the first time we've
seen this issue.
Before I jump into particulars, does this sound like expected behavior? We
run SERIALIZABLE txn level everywhere (set at the cluster level).
Thanks so much in advance for any insights here!
Here's my setup:
## info + cluster-level flags
GCP cloudsql
postgres version: 15.7
default_transaction_isolation: serializable
## The tenant DB:
CREATE TABLE
"devices" (
"orgId" TEXT NOT NULL,
"patientId" TEXT NOT NULL,
"deviceId" TEXT NOT NULL,
"data" JSONB NOT NULL,
PRIMARY KEY ("orgId", "patientId", "deviceId")
);
## The root DB:
CREATE TABLE IF NOT EXISTS "devices" (
"orgId" TEXT NOT NULL,
"patientId" TEXT NOT NULL,
"deviceId" TEXT NOT NULL,
"data" JSONB NOT NULL,
"serialNumber" TEXT NOT NULL GENERATED ALWAYS AS ("data"->>'serialNumber')
STORED,
"status" TEXT NOT NULL GENERATED ALWAYS AS ("data"->>'status') STORED,
PRIMARY KEY ("orgId", "patientId", "deviceId"),
CONSTRAINT "deviceIdMatches" CHECK ("data"->>'id' = "deviceId"),
CONSTRAINT "patientIdMatches" CHECK ("data"->>'patientId' = "patientId"),
-- Prevent duplicate serial numbers that are simultaneously "active"
EXCLUDE ( "serialNumber" WITH = ) WHERE ("status" = 'active')
);
## FDW connection from tenant DB to the root:
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
CREATE SERVER IF NOT EXISTS "fdw_server__root" FOREIGN DATA WRAPPER
postgres_fdw OPTIONS (
host '${instance-ip-address}',
dbname '${root-db}',
updatable 'true',
truncatable 'false',
keep_connections 'off'
);
CREATE USER MAPPING FOR "${remote-user}" SERVER "fdw_server__root" OPTIONS(
user '${remote-user}', password '$${PGPASSWORD}'
);
CREATE SCHEMA IF NOT EXISTS "rootDb";
GRANT USAGE ON SCHEMA "rootDb" TO "${user-name}";
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "rootDb" TO
"${user-name}";
ALTER DEFAULT PRIVILEGES IN SCHEMA "rootDb" GRANT SELECT, INSERT, UPDATE,
DELETE ON TABLES TO "${user-name}";
IMPORT FOREIGN SCHEMA "public" LIMIT TO (devices)
FROM SERVER "fdw_server__root"
INTO "rootDb";
## Trigger setup on the tenant DB, hoisting rows when modified into the
root table:
-- Set up a trigger which hoists tenant devices into the rootDb
CREATE OR REPLACE FUNCTION hoist_devices() RETURNS TRIGGER
AS $hoist_devices$
BEGIN
IF (TG_OP IN ('UPDATE', 'DELETE')) THEN
DELETE FROM "rootDb"."devices"
WHERE "orgId" = OLD."orgId"
AND "patientId" = OLD."patientId"
AND "deviceId" = OLD."deviceId";
END IF;
IF (TG_OP IN ('INSERT', 'UPDATE')) THEN
INSERT INTO "rootDb"."devices" ("orgId", "patientId", "deviceId", "data")
SELECT
NEW."orgId",
NEW."patientId",
NEW."deviceId",
NEW."data";
END IF;
RETURN NEW;
END;
$hoist_devices$ LANGUAGE plpgsql;
CREATE OR REPLACE TRIGGER hoist_devices_insert_update_delete
AFTER INSERT OR UPDATE OR DELETE ON "devices"
FOR EACH ROW
EXECUTE FUNCTION hoist_devices();
## A particular endpoint attempts insertions like:
BEGIN;
INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
VALUES (
'org1',
'patient1',
'device1',
'{"id": "device1", "patientId": "patient1", "serialNumber": "12345",
"status": "active" }'
);
COMMIT;
--
Jake Biesinger
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Ghost data from failed FDW transactions?
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
@ 2024-08-28 12:38 ` Greg Sabino Mullane <[email protected]>
2024-08-28 16:18 ` Re: Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Greg Sabino Mullane @ 2024-08-28 12:38 UTC (permalink / raw)
To: Jacob Biesinger <[email protected]>; +Cc: [email protected]
On Tue, Aug 27, 2024 at 9:03 PM Jacob Biesinger <[email protected]>
wrote:
> I'm scratching my head at a few rows in the root DB, where it seems the
> corresponding tenant transaction rolled back, but the root DB transaction
> committed
>
...
> Before I jump into particulars, does this sound like expected behavior?
>
No, it sounds like something is going wrong. Your setup as described should
work to keep both sides in sync.
Through the magic of postgres_fdw, row triggers, and distributed
> transactions,
>
Can you expand on "distributed transactions" here?
Cheers,
Greg
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Ghost data from failed FDW transactions?
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
@ 2024-08-28 16:18 ` Jacob Biesinger <[email protected]>
2024-08-28 22:16 ` Re: Ghost data from failed FDW transactions? Rob Sargent <[email protected]>
2024-09-11 12:57 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
0 siblings, 2 replies; 6+ messages in thread
From: Jacob Biesinger @ 2024-08-28 16:18 UTC (permalink / raw)
To: Greg Sabino Mullane <[email protected]>; +Cc: [email protected]
On Wed, Aug 28, 2024 at 5:39 AM Greg Sabino Mullane <[email protected]>
wrote:
> On Tue, Aug 27, 2024 at 9:03 PM Jacob Biesinger <[email protected]>
> wrote:
>
>> I'm scratching my head at a few rows in the root DB, where it seems the
>> corresponding tenant transaction rolled back, but the root DB transaction
>> committed
>>
> ...
>
>> Before I jump into particulars, does this sound like expected behavior?
>>
>
> No, it sounds like something is going wrong. Your setup as described
> should work to keep both sides in sync.
>
Well, that's a plus. At least we're (probably) not using it wrong :)
There aren't many details in the docs around failure modes... is there
anything there that could cause this issue?
For example, if there were some temporary network outage that prevented
communication between the two DBs, would the FDW fail gracefully? Or if
there were some disk failure or something after the FDW signals to the
remote (root) DB to commit but then the local (tenant) DB failed to commit?
We've had a few outages over the years where we hit the `max_connections`
setting on the cluster (which would affect both the root + tenant DBs), but
connections are held for the duration of both local + remote txns, so
doesn't seem like that would affect this. We don't use pgBouncer, either on
the client -> DB or as an in-between on the DB -> DB FDW side.
>
> Through the magic of postgres_fdw, row triggers, and distributed
>> transactions,
>>
>
> Can you expand on "distributed transactions" here?
>
I just mean "using the FDW as described". It is magic that we can get
proper transactional + ACID semantics in a distributed system. The FDW
really helps our use-case.
But to go deeper, we use the javascript knex adapter and some
application-level transaction management that automatically retries a
transaction N times when it encounters serialization errors. On this
particular endpoint, the emitted SQL for the full transaction looks
something like:
BEGIN;
INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
VALUES (
'org1',
'patient1',
'device1',
'{"id": "device1", "patientId": "patient1", "serialNumber": "12345",
"status": "active" }'
);
INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
VALUES (
'org1',
'patient1',
'device2',
'{"id": "device2", "patientId": "patient1", "serialNumber": "67890",
"status": "active" }'
);
SELECT * FROM "rootDb"."assets";
-- execute some logic client-side, nothing touching the DB
UPDATE "rootDb"."assets" WHERE ...;
COMMIT;
With I guess the maybe-relevant bits here being that we do some additional
reading + writing to the remote / root DB (other tables) subsequently as
part of the same txn. The JS driving this also has the unfortunate
shortcoming that the two `INSERT` statements run "in parallel", meaning
there's a race to execute them (serially) through their shared
txn/connection. The ordering shouldn't matter, but this also means
that error handling (e.g., when there is a conflict with the `EXCLUDE
"serialNumber"` constraint) may not stop the second `INSERT` statement from
being attempted (and rejected by postgres). But I think that's all
client-side details that shouldn't affect the FDW txn semantics, right?
--
Jake Biesinger
On Wed, Aug 28, 2024 at 5:39 AM Greg Sabino Mullane <[email protected]>
wrote:
> On Tue, Aug 27, 2024 at 9:03 PM Jacob Biesinger <[email protected]>
> wrote:
>
>> I'm scratching my head at a few rows in the root DB, where it seems the
>> corresponding tenant transaction rolled back, but the root DB transaction
>> committed
>>
> ...
>
>> Before I jump into particulars, does this sound like expected behavior?
>>
>
> No, it sounds like something is going wrong. Your setup as described
> should work to keep both sides in sync.
>
> Through the magic of postgres_fdw, row triggers, and distributed
>> transactions,
>>
>
> Can you expand on "distributed transactions" here?
>
> Cheers,
> Greg
>
>
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Ghost data from failed FDW transactions?
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
2024-08-28 16:18 ` Re: Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
@ 2024-08-28 22:16 ` Rob Sargent <[email protected]>
2024-08-28 22:19 ` Re: Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
1 sibling, 1 reply; 6+ messages in thread
From: Rob Sargent @ 2024-08-28 22:16 UTC (permalink / raw)
To: Jacob Biesinger <[email protected]>; +Cc: Greg Sabino Mullane <[email protected]>; [email protected]
> On Aug 28, 2024, at 10:18 AM, Jacob Biesinger <[email protected]> wrote:
>
> But to go deeper, we use the javascript knex adapter and some application-level transaction management that automatically retries a transaction N times when it encounters serialization errors. On this particular endpoint, the emitted SQL for the full transaction looks something like:
>
> BEGIN;
> INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
> VALUES (
> 'org1',
> 'patient1',
> 'device1',
> '{"id": "device1", "patientId": "patient1", "serialNumber": "12345", "status": "active" }'
> );
> INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
> VALUES (
> 'org1',
> 'patient1',
> 'device2',
> '{"id": "device2", "patientId": "patient1", "serialNumber": "67890", "status": "active" }'
> );
> SELECT * FROM "rootDb"."assets";
>
> -- execute some logic client-side, nothing touching the DB
>
> UPDATE "rootDb"."assets" WHERE ...;
> COMMIT;
>
Any value in supplying a single insert statement a la (less back and forth perhaps?):
BEGIN;
INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
VALUES (
'org1',
'patient1',
'device1',
'{"id": "device1", "patientId": "patient1", "serialNumber": "12345", "status": "active" }’),
(
'org1',
'patient1',
'device2',
'{"id": "device2", "patientId": "patient1", "serialNumber": "67890", "status": "active" }'
)
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Ghost data from failed FDW transactions?
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
2024-08-28 16:18 ` Re: Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 22:16 ` Re: Ghost data from failed FDW transactions? Rob Sargent <[email protected]>
@ 2024-08-28 22:19 ` Jacob Biesinger <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Jacob Biesinger @ 2024-08-28 22:19 UTC (permalink / raw)
To: Rob Sargent <[email protected]>; +Cc: Greg Sabino Mullane <[email protected]>; [email protected]
>
> Any value in supplying a single insert statement a la (less back and forth
> perhaps?):
>
Yes, absolutely that would be better. This particular endpoint has some
ancient + crufty code backing it (migrated from a NoSQL DB with a
db-agnostic shim that we're slowly replacing). The old code likes doing
things client-side instead of being sane about SQL semantics.
But I don't think that would affect the issue here, right?
--
Jake Biesinger
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Ghost data from failed FDW transactions?
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Re: Ghost data from failed FDW transactions? Greg Sabino Mullane <[email protected]>
2024-08-28 16:18 ` Re: Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
@ 2024-09-11 12:57 ` Greg Sabino Mullane <[email protected]>
1 sibling, 0 replies; 6+ messages in thread
From: Greg Sabino Mullane @ 2024-09-11 12:57 UTC (permalink / raw)
To: Jacob Biesinger <[email protected]>; +Cc: [email protected]
Any updates on this?
A few replies from me inline:
On Wed, Aug 28, 2024 at 12:18 PM Jacob Biesinger <[email protected]>
wrote:
> There aren't many details in the docs around failure modes... is there
> anything there that could cause this issue?
>
Nothing that I know of, but it's possible there is some sort of weird race
condition/bug. But frankly, it's far more likely to be some sort of
application bug / missed behavior. If you could trigger it at will by
developing a self-contained test, that would be ideal.
> The ordering shouldn't matter, but this also means that error handling
> (e.g., when there is a conflict with the `EXCLUDE "serialNumber"`
> constraint) may not stop the second `INSERT` statement from being attempted
> (and rejected by postgres). But I think that's all client-side details that
> shouldn't affect the FDW txn semantics, right?
>
Correct, but it's hard to say for sure without seeing the code or knowing
exactly how the serialization errors are being handled. I'm not clear on
how the inserts are run "in parallel" if your flow is:
BEGIN;
> INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data") ...
> INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data") ...
SELECT * FROM "rootDb"."assets";
-- execute some logic client-side, nothing touching the DB
UPDATE "rootDb"."assets" WHERE ...;
COMMIT;
Perhaps expand on that, because it almost sounds like you have two
processes trying to talk to the same backend at once?
Cheers,
Greg
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2024-09-11 12:57 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-08-28 01:03 Ghost data from failed FDW transactions? Jacob Biesinger <[email protected]>
2024-08-28 12:38 ` Greg Sabino Mullane <[email protected]>
2024-08-28 16:18 ` Jacob Biesinger <[email protected]>
2024-08-28 22:16 ` Rob Sargent <[email protected]>
2024-08-28 22:19 ` Jacob Biesinger <[email protected]>
2024-09-11 12:57 ` Greg Sabino Mullane <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox