From: Alvaro Herrera <[email protected]>
To: Tom Lane <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: [email protected]
Cc: [email protected]
Subject: Re: Fwd: Fwd: Postgres attach partition: AccessExclusive lock set on different tables depending on how attaching is performed
Date: Wed, 13 Nov 2024 12:49:49 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
On 2024-Nov-10, Tom Lane wrote:
> This surprised me a bit too, because I thought we took a
> slightly-less-than-exclusive lock for FK additions or deletions.
> Tracing through it, I find that CloneFkReferencing opens the
> referenced relation with ShareRowExclusiveLock as I expected.
> But then we conclude that we can drop the existing FK enforcement
> triggers for the table being attached. That causes us to take
> AccessExclusiveLock on the trigger itself, which is fine because
> nobody's really paying attention to that. But then RemoveTriggerById
> takes AccessExclusiveLock on the trigger's table. We already had
> that on the table being attached, but not on the other table.
Oooh.
> I wonder whether it'd be all right for RemoveTriggerById to take
> only ShareRowExclusiveLock on the trigger's table. This seems
> OK in terms of basic semantics: that's enough to lock out
> anything that might want to fire triggers on the table. However,
> this comment for AlterTableGetLockLevel gives me pause:
>
> * Also note that pg_dump uses only an AccessShareLock, meaning that anything
> * that takes a lock less than AccessExclusiveLock can change object definitions
> * while pg_dump is running. Be careful to check that the appropriate data is
> * derived by pg_dump using an MVCC snapshot, rather than syscache lookups,
> * otherwise we might end up with an inconsistent dump that can't restore.
>
> I think pg_dump uses pg_get_triggerdef, which is probably not
> safe in these terms.
Looking at pg_get_triggerdef_worker, it is not using syscache but a
systable scan, which uses the catalog snapshot. A catalog snapshot is
indeed implemented as an MVCC snapshot (so strictly speaking it _is_ an
MVCC snapshot), but the invalidation rules are different from a normal
MVCC snapshot, so AFAIU it's still unsafe.
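To make the lock-level question concrete, here is a small sketch of the table-level lock conflict matrix as given in the "Explicit Locking" chapter of the documentation. The dictionary and the conflicts() helper are illustrative names only, not anything in the backend. It shows why ShareRowExclusiveLock would keep out writers (which take RowExclusiveLock and could fire triggers) while still letting pg_dump's AccessShareLock through, which is exactly the case the AlterTableGetLockLevel comment warns about. It does not model snapshot or invalidation behaviour.

```python
# Table-level lock modes and the modes each one conflicts with,
# transcribed from the PostgreSQL "Explicit Locking" documentation.
# The short names below are expanded with a "Lock" suffix for readability.
_ALL = [
    "AccessShare", "RowShare", "RowExclusive", "ShareUpdateExclusive",
    "Share", "ShareRowExclusive", "Exclusive", "AccessExclusive",
]

_RAW = {
    "AccessShare": ["AccessExclusive"],
    "RowShare": ["Exclusive", "AccessExclusive"],
    "RowExclusive": ["Share", "ShareRowExclusive", "Exclusive",
                     "AccessExclusive"],
    "ShareUpdateExclusive": ["ShareUpdateExclusive", "Share",
                             "ShareRowExclusive", "Exclusive",
                             "AccessExclusive"],
    "Share": ["RowExclusive", "ShareUpdateExclusive", "ShareRowExclusive",
              "Exclusive", "AccessExclusive"],
    "ShareRowExclusive": ["RowExclusive", "ShareUpdateExclusive", "Share",
                          "ShareRowExclusive", "Exclusive",
                          "AccessExclusive"],
    "Exclusive": ["RowShare", "RowExclusive", "ShareUpdateExclusive",
                  "Share", "ShareRowExclusive", "Exclusive",
                  "AccessExclusive"],
    "AccessExclusive": list(_ALL),
}

CONFLICTS = {
    mode + "Lock": {other + "Lock" for other in others}
    for mode, others in _RAW.items()
}


def conflicts(held: str, requested: str) -> bool:
    """Return True if a request for `requested` must wait on `held`."""
    return requested in CONFLICTS[held]


if __name__ == "__main__":
    # A writer (RowExclusiveLock) is blocked by ShareRowExclusiveLock ...
    print(conflicts("ShareRowExclusiveLock", "RowExclusiveLock"))
    # ... but pg_dump's AccessShareLock is not.
    print(conflicts("ShareRowExclusiveLock", "AccessShareLock"))
    # Only AccessExclusiveLock keeps pg_dump out.
    print(conflicts("AccessExclusiveLock", "AccessShareLock"))
```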
> An alternative answer might be what Alvaro was muttering about
> the other day: redesign FKs for partitioned tables so that we
> do not have to change the set of triggers when attaching/detaching.
Hmm, I hadn't thought about this idea in those terms, but perhaps we
could reimplement this by not having one trigger for each RI check, but
instead a single trigger which internally determines which FK
constraints exist on the table and does the necessary work in a single
pass. Then we don't need to add/drop triggers all the time, but we just
add it with the first FK in the table, and remove it when dropping the
last FK.

For tables with many FKs, this could be a win, because we'd only go
through the trigger machinery once. If a table has both outgoing and
incoming FKs, maybe we could have _one_ single trigger.

(I think this would be orthogonal to the project to stop using SPI for
RI triggers.)
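
A toy model of the single-trigger idea, purely to illustrate the bookkeeping. Nothing here corresponds to actual backend code: Table, ForeignKey, add_fk, drop_fk, ri_check_all and the trigger name are all made up for this sketch, and the referenced table is reduced to a set of key values. The point is only that the trigger list changes when the first FK is added or the last one is dropped, and that one pass checks every FK.

```python
from dataclasses import dataclass, field

RI_TRIGGER = "ri_check_all"  # hypothetical name for the single RI trigger


@dataclass
class ForeignKey:
    name: str
    column: str
    referenced_keys: set  # stand-in for the referenced table's key values


@dataclass
class Table:
    name: str
    fks: list = field(default_factory=list)
    triggers: list = field(default_factory=list)


def add_fk(table: Table, fk: ForeignKey) -> None:
    """Register an FK; create the trigger only for the first FK."""
    table.fks.append(fk)
    if RI_TRIGGER not in table.triggers:
        table.triggers.append(RI_TRIGGER)


def drop_fk(table: Table, fk_name: str) -> None:
    """Remove an FK; drop the trigger only when no FKs remain."""
    table.fks = [fk for fk in table.fks if fk.name != fk_name]
    if not table.fks and RI_TRIGGER in table.triggers:
        table.triggers.remove(RI_TRIGGER)


def ri_check_all(table: Table, row: dict) -> list:
    """Check every FK on the table in one pass.

    Returns the names of violated constraints. A NULL (None) value in
    the FK column is treated as satisfying the constraint.
    """
    violated = []
    for fk in table.fks:
        value = row.get(fk.column)
        if value is not None and value not in fk.referenced_keys:
            violated.append(fk.name)
    return violated


if __name__ == "__main__":
    orders = Table("orders")
    add_fk(orders, ForeignKey("orders_customer_fk", "customer_id", {1, 2}))
    add_fk(orders, ForeignKey("orders_product_fk", "product_id", {10}))
    print(orders.triggers)
    print(ri_check_all(orders, {"customer_id": 3, "product_id": 10}))
    drop_fk(orders, "orders_customer_fk")
    drop_fk(orders, "orders_product_fk")
    print(orders.triggers)
```

In this model, adding or dropping an FK in the middle of the list never touches the trigger list, which is the property that would matter for attach/detach.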
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/