public inbox for [email protected]
help / color / mirror / Atom feedFrom: Richard Guo <[email protected]>
To: Alexandra Wang <[email protected]>
Cc: Tender Wang <[email protected]>
Cc: David Rowley <[email protected]>
Cc: Pg Hackers <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Paul A Jungwirth <[email protected]>
Cc: Ashutosh Bapat <[email protected]>
Subject: Re: Remove inner joins based on foreign keys
Date: Tue, 26 May 2026 06:01:08 +0900
Message-ID: <CAMbWs4_UJd5QxYq7KaqO=TOWmS=2opUrhC1Oc=Dbwkcn2mXk6g@mail.gmail.com> (raw)
In-Reply-To: <CAK98qZ3ECQtS-QSa2cSHUKGkYjgOPWpfEGg7mJPVnqQ_5OdE0g@mail.gmail.com>
References: <CAMbWs49-v9FGX+B=tsCxUq_FPX09ZNvXjj+TaTx0m-MAVmY5TA@mail.gmail.com>
<CAApHDvoOuJ==mxHBpX9F6V87Xg6Hzj+DuoA_dpzxZaFBkWyt9A@mail.gmail.com>
<CAMbWs496Mn-JSWnKa0vStQSVdUTGaidEDFTZsjYyC82AsxW9sw@mail.gmail.com>
<CAMbWs4_hdpfVqWe+hMvQZp_9x4SFaAy9jDzhgVBQKKJrN47wgA@mail.gmail.com>
<CAMbWs48LDf=9pjk3rwrrNbtW5Uoa8sK4fgfmJDNbeeGwY73=gg@mail.gmail.com>
<CAHewXNk=iH75y43SWGVdmnfJAbUdzs_BZKyCQX7pN2gx+zKx5A@mail.gmail.com>
<CAMbWs4-JenbkmsY8cJX-dB0L-RdgZKy5MhOK3KQmD3xXGRAiLQ@mail.gmail.com>
<CAK98qZ3ECQtS-QSa2cSHUKGkYjgOPWpfEGg7mJPVnqQ_5OdE0g@mail.gmail.com>
(CC'ing folks who discussed this with me at PGConf.dev last week;
thanks again for the conversations.)
After several discussions at PGConf.dev last week, there are concerns
with the lock-based predicate this thread has been converging on. I'd
like to lay out the underlying issue and the available approaches in
detail on-list, so the wider community can weigh in before we go
further. I'm not advocating any particular option here. My goal in
this email is to make sure everyone evaluating the patch is starting
from the same picture.
1. The trigger gap, demonstrated
--------------------------------
On master, no patch applied:
CREATE TABLE users (id int PRIMARY KEY, name text);
CREATE TABLE orders (id int PRIMARY KEY,
user_id int REFERENCES users(id) ON DELETE CASCADE);
INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO orders VALUES (10, 1), (11, 1), (20, 2);
CREATE FUNCTION show_gap() RETURNS void LANGUAGE plpgsql AS $$
DECLARE u_rows text; o_rows text;
BEGIN
SELECT string_agg(format('%s (id=%s)', name, id),
', ' ORDER BY id)
INTO u_rows FROM users;
SELECT string_agg(format('(id=%s, user_id=%s)', id, user_id),
', ' ORDER BY id)
INTO o_rows FROM orders;
RAISE NOTICE 'users: %', u_rows;
RAISE NOTICE 'orders: %', o_rows;
END $$;
DELETE FROM users WHERE id = 1 RETURNING show_gap();
NOTICE: users: Bob (id=2)
NOTICE: orders: (id=10, user_id=1), (id=11, user_id=1), (id=20, user_id=2)
show_gap() runs inside the DELETE's RETURNING clause, after Alice has
been removed from users but before the cascade trigger has deleted her
orders. In show_gap()'s snapshot, Alice is gone, but her two orders
(10 and 11) are still there, pointing at a user_id that no longer
exists. The FK invariant is locally false.
Why this happens: FK constraints are enforced by AFTER ROW triggers
that fire at end of statement, after RETURNING evaluation. Between
when the DELETE modifies the heap and when the cascade trigger fires,
the heap is transiently inconsistent. Any user code that runs in that
window with a fresh snapshot observes the inconsistency. In the demo
above, plpgsql's SPI call takes such a snapshot.
For the FK-based inner-join-removal optimization, this matters because
the rewrite from 'child JOIN parent ON c.fk = p.id' to 'child WHERE
c.fk IS NOT NULL' assumes the FK invariant holds for the executing
snapshot. In show_gap()'s snapshot it doesn't, and the rewrite would
return Alice's orders (10 and 11) as if Alice were still present, when
the joined form correctly excludes them.
2. Approaches
-------------
What PG promises about FK constraints today is what the SQL standard
requires: INITIALLY IMMEDIATE constraints are satisfied at end-of-
statement, INITIALLY DEFERRED at commit. Neither the standard nor the
PG docs explicitly promise that the FK invariant holds for every
snapshot a query inside the database can use. The optimization
proposed in this thread needs the stronger property: it rewrites a
join on the assumption that every visible child row with non-null FK
has a corresponding visible parent in the same snapshot. The trigger
gap is where that stronger property fails.
The approaches I have considered are listed below, with (B) being
the predicate currently in the v4 patch.
(A) Drop the optimization.
Don't pursue FK-based inner-join removal, or any other optimization
whose correctness depends on the FK invariant.
(B) Lock-based predicate.
inner_join_is_removable() and choose_custom_plan() consult
CheckRelationOidLockedByMe(RowExclusiveLock) for the FK rels; skip the
rewrite if held. RowExclusiveLock is xact-scoped and exceeds the
lifetime of any in-xact snapshot, so the lock being present is
sufficient to block the optimization in any context where a
gap-bearing snapshot might exist.
Pros: No new infrastructure. Predicate is local to two functions.
CheckRelationOidLockedByMe() is a cheap local hash lookup. Correct
against every gap-bearing snapshot path.
Cons: Xact-scoped, therefore pessimistic. Once a DML on either FK rel
runs in a transaction, the rewrite is suppressed for the rest of that
transaction, even when subsequent statements take fresh snapshots that
the FK invariant *does* hold for. For cached plans that previously
used FK removal, this also produces a planner-replan storm:
choose_custom_plan trips on every consultation in the writing xact and
replans each time.
(C) Snapshot-anchored predicate.
Add a "captured during a gap window" bit to SnapshotData, populated in
CopySnapshot when AfterTriggerPendingOnAnyRel() returns true at
capture time. inner_join_is_removable() and choose_custom_plan()
consult the bit on the active snapshot.
Pros: Precise. Only snapshots actually captured inside a gap suppress
the optimization. Sequential DML-then-SELECT in the same xact keeps
the rewrite. Eliminates the replan storm and most of the plan-shape
cliff that (B) introduces.
Cons: SnapshotData, one of PG's most foundational abstractions, gains
a field that encodes trigger-subsystem state. Every future reader of
snapshot.h has to learn what the gap is.
(D) Close the gap.
Make the FK invariant hold for every in-xact snapshot. Either by
enforcing RI synchronously inside the DML rather than via AFTER ROW
triggers, or by arranging that the deleted parent tuple and the
trigger's modifications to children become visible atomically.
Pros: Strengthens what PG promises about constraints. Benefits any
future optimization that needs the FK invariant. No gating predicate
is needed in the planner or plan cache.
Cons: Invasive and difficult.
(E) Apply the optimization without gap handling, and document the
corner cases.
Skip the predicate entirely. The optimization fires whenever the FK
constraint structurally permits it. Queries that execute against a
gap-bearing snapshot can return wrong results, documented as a known
limitation.
Pros: Smallest patch. Writing a join query that runs during a trigger
gap is unusual in practice for most user code.
Cons: Can have wrong results.
(F) Something else.
3. On splitting the patch
-------------------------
Independently of which option above wins, I'd like to separate the
patch into two parts so the optimization mechanics and the
gap-handling can be reviewed independently (see v5 patches):
Part 1. The structural inner-join-removal logic, assuming the FK
invariant holds.
Part 2. Whatever gap-handling we converge on from section 2 above. If
(A) wins, Part 2 is replaced by dropping Part 1 entirely.
Part 1 alone would be unsafe to commit by itself; the value of the
split is reviewing-order, not commit-order.
(Attached v5 is the split version of v4, plus addressing Alex's two
comments. 0002 is still the lock-based predicate from v4, posted as
the concrete reference for option (B); it can be swapped for whichever
approach the gap-handling discussion settles on.)
I'd like input on which of the approaches in section 2 people would be
willing to live with. Pointers to prior discussion I haven't found,
or design considerations I've missed, are also welcome.
- Richard
Attachments:
[application/octet-stream] v5-0001-Remove-inner-joins-based-on-foreign-keys.patch (68.2K, 2-v5-0001-Remove-inner-joins-based-on-foreign-keys.patch)
download | inline diff:
From 7ee3be87784263c57394365747254b8b580e9d75 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Wed, 11 Mar 2026 16:26:12 +0900
Subject: [PATCH v5 1/2] Remove inner joins based on foreign keys
This patch allows the planner to safely remove an inner join to the
referenced relation of a foreign key when that relation acts solely as
a structural anchor. To be removable, the inner join must be
mathematically guaranteed to produce exactly one matching row for each
row in the referencing relation. The foreign key constraint
guarantees existence (at least one match), while the referenced
relation's unique constraint guarantees non-duplication (at most one
match).
To preserve the inner join's elimination semantics, the referencing
relation's foreign key columns must be guaranteed to be not null. If
they are not defined as not null in the schema, this patch injects an
IS NOT NULL filter to prevent null-key rows from incorrectly surviving
the removal.
The referenced relation's non-key columns cannot be used anywhere in
the query. Because the referenced relation will be removed, we would
have no way to read their values. Even if used strictly within the
join conditions, they would act as local filters, violating the
exact-match guarantee.
Theoretically, the referenced relation's key columns represent the
exact same logical values as the referencing relation's columns due to
the foreign key equality. However, they are not strictly
interchangeable in the query tree. Substituting them requires
preserving exact data types, collations, and operator family
semantics. Because the planner currently lacks a mechanism to safely
perform this variable substitution across differing schemas, their
usage is strictly limited. They cannot be used in contexts that
require rewriting Var references, such as targetlist, restriction
clauses, lateral_vars, or righthand-side expressions of semijoins.
They are, however, permitted to bridge multi-hop joins via
EquivalenceClasses. When the relation is removed, its members are
dropped from the EquivalenceClasses, allowing the planner's native
machinery to deduce the remaining transitive equalities.
Deferrable constraints are excluded because their enforcement can be
postponed to end-of-transaction, so the FK invariant may not hold at
query time. NOT VALID constraints are excluded because existing data
may violate the FK even though new modifications are enforced.
This commit is the structural part of the optimization. It assumes
the FK invariant holds for every snapshot a query can be executed
against -- an assumption that is not actually true in PG today due to
the AFTER-row-trigger model of RI enforcement (the "trigger gap").
The follow-up commit handles that.
---
src/backend/optimizer/path/equivclass.c | 12 +-
src/backend/optimizer/plan/analyzejoins.c | 665 +++++++++++++++++-
src/backend/optimizer/plan/initsplan.c | 5 +-
src/backend/optimizer/plan/planmain.c | 8 +
src/backend/optimizer/util/plancat.c | 2 +
src/backend/utils/cache/relcache.c | 2 +
src/backend/utils/misc/guc_parameters.dat | 7 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/nodes/pathnodes.h | 4 +
src/include/optimizer/paths.h | 3 +-
src/include/optimizer/planmain.h | 2 +
src/include/utils/rel.h | 6 +
src/test/regress/expected/join.out | 543 ++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/join.sql | 278 ++++++++
15 files changed, 1494 insertions(+), 47 deletions(-)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e3697df51a2..34d3db9d9a4 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2701,14 +2701,14 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2, Oid opfamily)
* we ignore that fine point here.) This is much like exprs_known_equal,
* except for the format of the input.
*
- * On success, we also set fkinfo->eclass[colno] to the matching eclass,
- * and set fkinfo->fk_eclass_member[colno] to the eclass member for the
- * referencing Var.
+ * If fk_eclass_member is not NULL, *fk_eclass_member is set to the eclass
+ * member for the referencing Var.
*/
EquivalenceClass *
match_eclasses_to_foreign_key_col(PlannerInfo *root,
ForeignKeyOptInfo *fkinfo,
- int colno)
+ int colno,
+ EquivalenceMember **fk_eclass_member)
{
Index var1varno = fkinfo->con_relid;
AttrNumber var1attno = fkinfo->conkey[colno];
@@ -2778,8 +2778,8 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root,
opfamilies = get_mergejoin_opfamilies(eqop);
if (equal(opfamilies, ec->ec_opfamilies))
{
- fkinfo->eclass[colno] = ec;
- fkinfo->fk_eclass_member[colno] = item2_em;
+ if (fk_eclass_member)
+ *fk_eclass_member = item2_em;
return ec;
}
/* Otherwise, done with this EC, move on to the next */
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b07cb731401..4c5c78dcbfc 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -23,12 +23,14 @@
#include "postgres.h"
#include "catalog/pg_class.h"
+#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/joininfo.h"
#include "optimizer/optimizer.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
+#include "optimizer/plancat.h"
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"
#include "parser/parse_agg.h"
@@ -52,6 +54,7 @@ typedef struct
} SelfJoinCandidate;
bool enable_self_join_elimination;
+bool enable_fk_inner_join_removal;
/* local functions */
static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
@@ -79,6 +82,10 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
static int self_join_candidates_cmp(const void *a, const void *b);
static bool replace_relid_callback(Node *node,
ChangeVarNodes_context *context);
+static bool inner_join_is_removable(PlannerInfo *root, ForeignKeyOptInfo *fkinfo);
+static void inject_fk_not_null_quals(PlannerInfo *root, ForeignKeyOptInfo *fkinfo);
+static void remove_referenced_rel_from_query(PlannerInfo *root,
+ ForeignKeyOptInfo *fkinfo);
/*
@@ -434,16 +441,18 @@ remove_leftjoinrel_from_query(PlannerInfo *root, int relid,
* to include them in the query. Optionally replace references to the
* removed relid with subst if this is a self-join removal.
*
- * This function serves as the common infrastructure for left-join removal
- * and self-join elimination. It is intentionally scoped to update only the
- * shared planner data structures that are universally affected by relation
- * removal. Each specific caller remains responsible for updating any
- * remaining data structures required by its unique removal logic.
+ * This function serves as the common infrastructure for three distinct
+ * optimization passes: left-join removal, inner-join removal, and self-join
+ * elimination. It is intentionally scoped to update only the shared planner
+ * data structures that are universally affected by relation removal. Each
+ * specific caller remains responsible for updating any remaining data
+ * structures required by its unique removal logic.
*
* The specific type of removal being performed is dictated by the combination
* of the sjinfo and subst parameters. A non-NULL sjinfo indicates left-join
* removal. When sjinfo is NULL, a positive subst value indicates self-join
- * elimination (where references are replaced with subst).
+ * elimination (where references are replaced with subst), while a negative
+ * subst value indicates inner-join removal.
*/
static void
remove_rel_from_query(PlannerInfo *root, int relid,
@@ -455,10 +464,11 @@ remove_rel_from_query(PlannerInfo *root, int relid,
ListCell *l;
bool is_outer_join = (sjinfo != NULL);
bool is_self_join = (!is_outer_join && subst > 0);
+ bool is_inner_join = (!is_outer_join && subst < 0);
- Assert(is_outer_join || is_self_join);
+ Assert(is_outer_join || is_self_join || is_inner_join);
Assert(!is_outer_join || ojrelid > 0);
- Assert(!is_outer_join || joinrelids != NULL);
+ Assert(is_self_join || joinrelids != NULL);
/*
* Update all_baserels and related relid sets.
@@ -515,7 +525,7 @@ remove_rel_from_query(PlannerInfo *root, int relid,
sjinf->commute_below_l = bms_del_member(sjinf->commute_below_l, ojrelid);
sjinf->commute_below_r = bms_del_member(sjinf->commute_below_r, ojrelid);
}
- else
+ else if (is_self_join)
{
/*
* For self-join removal, replace relid references in
@@ -524,16 +534,24 @@ remove_rel_from_query(PlannerInfo *root, int relid,
ChangeVarNodesExtended((Node *) sjinf->semi_rhs_exprs, relid, subst,
0, replace_relid_callback);
}
+ else if (is_inner_join)
+ {
+ /*
+ * For inner-join removal, no additional modifications are needed.
+ * inner_join_is_removable() has already guaranteed that the
+ * target relation's columns do not leak into semi_rhs_exprs.
+ */
+ }
}
/*
* Likewise remove references from PlaceHolderVar data structures,
- * removing any no-longer-needed placeholders entirely. We only remove
- * PHVs for left-join removal. With self-join elimination, PHVs already
- * get moved to the remaining relation, where they might still be needed.
- * It might also happen that we skip the removal of some PHVs that could
- * be removed. However, the overhead of extra PHVs is small compared to
- * the complexity of analysis needed to remove them.
+ * removing any no-longer-needed placeholders entirely. We remove PHVs
+ * for left-join and inner-join removal. With self-join elimination, PHVs
+ * already get moved to the remaining relation, where they might still be
+ * needed. It might also happen that we skip the removal of some PHVs
+ * that could be removed. However, the overhead of extra PHVs is small
+ * compared to the complexity of analysis needed to remove them.
*
* Removal is a bit trickier than it might seem: we can remove PHVs that
* are used at the target rel and/or in the join qual, but not those that
@@ -542,19 +560,20 @@ remove_rel_from_query(PlannerInfo *root, int relid,
* since they will both have ph_needed sets that are subsets of
* joinrelids. However, a PHV used at a partner rel could not have the
* target rel in ph_eval_at, so we check that while deciding whether to
- * remove or just update the PHV. There is no corresponding test in
- * join_is_removable because it doesn't need to distinguish those cases.
+ * remove or just update the PHV. There are no corresponding tests in the
+ * callers (like join_is_removable or inner_join_is_removable) because
+ * they don't need to distinguish those cases.
*/
foreach(l, root->placeholder_list)
{
PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
- Assert(!is_outer_join || !bms_is_member(relid, phinfo->ph_lateral));
+ Assert(is_self_join || !bms_is_member(relid, phinfo->ph_lateral));
- if (is_outer_join &&
+ if (!is_self_join &&
bms_is_subset(phinfo->ph_needed, joinrelids) &&
bms_is_member(relid, phinfo->ph_eval_at) &&
- !bms_is_member(ojrelid, phinfo->ph_eval_at))
+ (is_inner_join || !bms_is_member(ojrelid, phinfo->ph_eval_at)))
{
root->placeholder_list = foreach_delete_current(root->placeholder_list,
l);
@@ -583,10 +602,11 @@ remove_rel_from_query(PlannerInfo *root, int relid,
/*
* For self-join removal, update Var nodes within the PHV's
* expression to reference the replacement relid, and adjust
- * ph_lateral for the relid substitution. (For left-join removal,
- * we're removing rather than replacing, and any surviving PHV
- * shouldn't reference the removed rel in its expression. Also,
- * relid can't appear in ph_lateral for outer joins.)
+ * ph_lateral for the relid substitution. (For left-join and
+ * inner-join removal, we're removing rather than replacing, and
+ * any surviving PHV shouldn't reference the removed rel in its
+ * expression. Also, relid can't appear in ph_lateral for outer
+ * joins and inner joins.)
*/
if (is_self_join)
{
@@ -612,14 +632,14 @@ remove_rel_from_query(PlannerInfo *root, int relid,
* For self-join removal, the caller has already updated the
* EquivalenceClasses, so we can skip this step.
*/
- if (is_outer_join)
+ if (is_outer_join || is_inner_join)
{
foreach(l, root->eq_classes)
{
EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
if (bms_is_member(relid, ec->ec_relids) ||
- bms_is_member(ojrelid, ec->ec_relids))
+ (is_outer_join && bms_is_member(ojrelid, ec->ec_relids)))
remove_rel_from_eclass(ec, relid, ojrelid);
}
}
@@ -671,7 +691,8 @@ remove_rel_from_query(PlannerInfo *root, int relid,
}
/*
- * Remove any references to relid or ojrelid from the RestrictInfo.
+ * Remove any references to relid or ojrelid from the RestrictInfo. If
+ * ojrelid is <= 0, it is ignored (used for inner-join removal).
*
* We only bother to clean out bits in the RestrictInfo's various relid sets,
* not nullingrel bits in contained Vars and PHVs. (This might have to be
@@ -690,27 +711,33 @@ remove_rel_from_restrictinfo(RestrictInfo *rinfo, int relid, int ojrelid)
*/
rinfo->clause_relids = bms_copy(rinfo->clause_relids);
rinfo->clause_relids = bms_del_member(rinfo->clause_relids, relid);
- rinfo->clause_relids = bms_del_member(rinfo->clause_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->clause_relids = bms_del_member(rinfo->clause_relids, ojrelid);
/* Likewise for required_relids */
rinfo->required_relids = bms_copy(rinfo->required_relids);
rinfo->required_relids = bms_del_member(rinfo->required_relids, relid);
- rinfo->required_relids = bms_del_member(rinfo->required_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->required_relids = bms_del_member(rinfo->required_relids, ojrelid);
/* Likewise for incompatible_relids */
rinfo->incompatible_relids = bms_copy(rinfo->incompatible_relids);
rinfo->incompatible_relids = bms_del_member(rinfo->incompatible_relids, relid);
- rinfo->incompatible_relids = bms_del_member(rinfo->incompatible_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->incompatible_relids = bms_del_member(rinfo->incompatible_relids, ojrelid);
/* Likewise for outer_relids */
rinfo->outer_relids = bms_copy(rinfo->outer_relids);
rinfo->outer_relids = bms_del_member(rinfo->outer_relids, relid);
- rinfo->outer_relids = bms_del_member(rinfo->outer_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->outer_relids = bms_del_member(rinfo->outer_relids, ojrelid);
/* Likewise for left_relids */
rinfo->left_relids = bms_copy(rinfo->left_relids);
rinfo->left_relids = bms_del_member(rinfo->left_relids, relid);
- rinfo->left_relids = bms_del_member(rinfo->left_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->left_relids = bms_del_member(rinfo->left_relids, ojrelid);
/* Likewise for right_relids */
rinfo->right_relids = bms_copy(rinfo->right_relids);
rinfo->right_relids = bms_del_member(rinfo->right_relids, relid);
- rinfo->right_relids = bms_del_member(rinfo->right_relids, ojrelid);
+ if (ojrelid > 0)
+ rinfo->right_relids = bms_del_member(rinfo->right_relids, ojrelid);
/* If it's an OR, recurse to clean up sub-clauses */
if (restriction_is_or_clause(rinfo))
@@ -746,7 +773,8 @@ remove_rel_from_restrictinfo(RestrictInfo *rinfo, int relid, int ojrelid)
}
/*
- * Remove any references to relid or ojrelid from the EquivalenceClass.
+ * Remove any references to relid or ojrelid from the EquivalenceClass. If
+ * ojrelid is <= 0, it is ignored (used for inner-join removal).
*
* Like remove_rel_from_restrictinfo, we don't worry about cleaning out
* any nullingrel bits in contained Vars and PHVs. (This might have to be
@@ -761,7 +789,8 @@ remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid)
/* Fix up the EC's overall relids */
ec->ec_relids = bms_del_member(ec->ec_relids, relid);
- ec->ec_relids = bms_del_member(ec->ec_relids, ojrelid);
+ if (ojrelid > 0)
+ ec->ec_relids = bms_del_member(ec->ec_relids, ojrelid);
/*
* We don't expect any EC child members to exist at this point. Ensure
@@ -780,13 +809,14 @@ remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid)
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);
if (bms_is_member(relid, cur_em->em_relids) ||
- bms_is_member(ojrelid, cur_em->em_relids))
+ (ojrelid > 0 && bms_is_member(ojrelid, cur_em->em_relids)))
{
Assert(!cur_em->em_is_const);
/* em_relids is likely to be shared with some RestrictInfo */
cur_em->em_relids = bms_copy(cur_em->em_relids);
cur_em->em_relids = bms_del_member(cur_em->em_relids, relid);
- cur_em->em_relids = bms_del_member(cur_em->em_relids, ojrelid);
+ if (ojrelid > 0)
+ cur_em->em_relids = bms_del_member(cur_em->em_relids, ojrelid);
if (bms_is_empty(cur_em->em_relids))
ec->ec_members = foreach_delete_current(ec->ec_members, lc);
}
@@ -2038,7 +2068,6 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
/* And nuke the RelOptInfo, just in case there's another access path. */
pfree(toRemove);
-
/*
* Now repeat construction of attr_needed bits coming from all other
* sources.
@@ -2567,3 +2596,563 @@ remove_useless_self_joins(PlannerInfo *root, List *joinlist)
return joinlist;
}
+
+/*
+ * remove_useless_inner_joins
+ * Scans all foreign keys in the query to find and remove referenced
+ * relations that act only to duplicate referential integrity guarantees.
+ *
+ * We are passed the current joinlist and return the updated list. Other
+ * data structures that have to be updated are accessible via "root".
+ */
+List *
+remove_useless_inner_joins(PlannerInfo *root, List *joinlist)
+{
+ Relids removed_relids = NULL;
+
+restart:
+ foreach_node(ForeignKeyOptInfo, fkinfo, root->fkey_list)
+ {
+ int nremoved;
+
+ /*
+ * Skip if either the referencing or the referenced relation has
+ * already been removed by a prior FK removal in this loop.
+ */
+ if (bms_is_member(fkinfo->con_relid, removed_relids) ||
+ bms_is_member(fkinfo->ref_relid, removed_relids))
+ continue;
+
+ /* Skip if not removable */
+ if (!inner_join_is_removable(root, fkinfo))
+ continue;
+
+ /* Inject IS NOT NULL clauses for nullable foreign key columns */
+ inject_fk_not_null_quals(root, fkinfo);
+
+ /* Remove the referenced relation */
+ remove_referenced_rel_from_query(root, fkinfo);
+
+ /* We verify that exactly one reference gets removed from joinlist */
+ nremoved = 0;
+ joinlist = remove_rel_from_joinlist(joinlist, fkinfo->ref_relid, &nremoved);
+ if (nremoved != 1)
+ elog(ERROR, "failed to find relation %d in joinlist", fkinfo->ref_relid);
+
+ removed_relids = bms_add_member(removed_relids, fkinfo->ref_relid);
+
+ /*
+ * Restart the scan. This is necessary to ensure we find all
+ * removable joins independently of ordering of the fkey_list: a just-
+ * completed removal can collapse another EC to a single rel, which
+ * makes a previously-rejected FK pass inner_join_is_removable's
+ * multi-rel-EC count check.
+ */
+ goto restart;
+ }
+
+ bms_free(removed_relids);
+
+ return joinlist;
+}
+
+/*
+ * inner_join_is_removable
+ * Check whether an inner join to the referenced relation of a foreign key
+ * can be safely removed from the query tree.
+ *
+ * To be removable, the referenced relation must act only as a structural
+ * anchor. The inner join must be mathematically guaranteed to produce exactly
+ * one matching row for each referencing relation's row. The foreign key
+ * constraint guarantees existence (at least one match), and the referenced
+ * relation's unique constraint guarantees non-duplication (at most one match).
+ *
+ * The referenced relation's non-key columns cannot be used anywhere in the
+ * query. Because the referenced relation will be removed, we would have no
+ * way to read their values. Even if used strictly within the join condition,
+ * they would act as local filters, violating the exact-match guarantee.
+ *
+ * Theoretically, the referenced relation's key columns represent the exact
+ * same logical values as the referencing relation's columns due to the foreign
+ * key equality. However, they are not strictly interchangeable in the query
+ * tree. Substituting them requires preserving exact data types, collations,
+ * and operator family semantics. Because the planner currently lacks a
+ * mechanism to safely perform this variable substitution across differing
+ * schemas, their usage is strictly limited. They cannot be used in contexts
+ * that require rewriting Var references, such as targetlist, restriction
+ * clauses, lateral_vars, or righthand-side expressions of semijoins. They
+ * are, however, permitted to bridge multi-hop joins via EquivalenceClasses.
+ * Removing the relation safely drops its members from the ECs, and the planner
+ * natively deduces the remaining transitive equalities.
+ *
+ * NOTE: This function assumes the caller will inject IS NOT NULL filters for
+ * the referencing relation's FK columns if they are not strictly enforced by
+ * the schema to prevent partial-match ghost rows.
+ */
+static bool
+inner_join_is_removable(PlannerInfo *root, ForeignKeyOptInfo *fkinfo)
+{
+ RelOptInfo *con_rel;
+ RelOptInfo *ref_rel;
+ int attroff;
+ Relids inputrelids;
+ Bitmapset *fk_attnums = NULL;
+ int colno;
+ int multirel_ec_count = 0;
+ int i;
+
+ /* User has disabled this optimization. */
+ if (!enable_fk_inner_join_removal)
+ return false;
+
+ /*
+ * If the constraint is deferrable, the physical tables may temporarily
+ * violate the FK during an active transaction. The inner join must be
+ * executed to filter out uncommitted orphaned rows.
+ */
+ if (fkinfo->con_deferrable)
+ return false;
+
+ /*
+ * If the constraint is not validated (NOT VALID), existing data may
+ * violate the FK even though new modifications are enforced. We cannot
+ * rely on such a constraint for join removal.
+ */
+ if (!fkinfo->con_validated)
+ return false;
+
+ /*
+ * Never try to eliminate the query's result relation. UPDATE, DELETE,
+ * and MERGE can all build a join tree where the target relation appears
+ * as the FK referenced rel.
+ */
+ if (fkinfo->ref_relid == root->parse->resultRelation)
+ return false;
+
+ /*
+ * Either relid might identify a rel that is in the query's rtable but
+ * isn't referenced by the jointree, or has been removed by join removal,
+ * so that it won't have a RelOptInfo. Hence we use find_base_rel_noerr()
+ * here. We can ignore such FKs.
+ */
+ con_rel = find_base_rel_noerr(root, fkinfo->con_relid);
+ if (con_rel == NULL)
+ return false;
+ ref_rel = find_base_rel_noerr(root, fkinfo->ref_relid);
+ if (ref_rel == NULL)
+ return false;
+
+ /*
+ * Ignore FK unless both rels are baserels. This gets rid of FKs that
+ * link to inheritance child rels (otherrels).
+ */
+ if (con_rel->reloptkind != RELOPT_BASEREL ||
+ ref_rel->reloptkind != RELOPT_BASEREL)
+ return false;
+
+ /*
+ * If the referenced relation has any restriction clauses, they act as
+ * explicit filters. Since we cannot perform variable substitution to
+ * rewrite these clauses, we must abort.
+ */
+ if (ref_rel->baserestrictinfo)
+ return false;
+
+ /*
+ * If the referenced relation is being sampled via TABLESAMPLE, the join
+ * would normally restrict the result to child rows whose parent landed in
+ * the sample. Removing the join would lose that restriction and return
+ * all child rows with a non-null FK, which is more than the joined form
+ * would return. Since we have no way to preserve the sampling effect
+ * when the referenced rel is removed, we must skip the optimization.
+ */
+ if (root->simple_rte_array[fkinfo->ref_relid]->tablesample != NULL)
+ return false;
+
+ /*
+ * For joininfo, we look at clause_relids rather than at the joininfo list
+ * itself. required_relids may exceed clause_relids when an outer join ON
+ * clause needs to be forced to evaluate exactly at the level of the outer
+ * join; that can place a clause on ref_rel->joininfo even though ref_rel
+ * doesn't appear in the clause's expression at all. Reject only when a
+ * join clause actually references ref_rel.
+ */
+ foreach_node(RestrictInfo, rinfo, ref_rel->joininfo)
+ {
+ if (bms_is_member(ref_rel->relid, rinfo->clause_relids))
+ return false;
+ }
+
+ /*
+ * Build a fast lookup bitmap for the referenced relation's foreign key
+ * attributes to optimize the subsequent attribute usage checks.
+ */
+ for (colno = 0; colno < fkinfo->nkeys; colno++)
+ {
+ fk_attnums = bms_add_member(fk_attnums,
+ fkinfo->confkey[colno] - ref_rel->min_attr);
+ }
+
+ /*
+ * Verify attribute usage against the substitution constraints.
+ *
+ * As a micro-optimization, it seems better to start with max_attr and
+ * count down rather than starting with min_attr and counting up, on the
+ * theory that the system attributes are somewhat less likely to be wanted
+ * and should be tested last.
+ */
+ for (attroff = ref_rel->max_attr - ref_rel->min_attr;
+ attroff >= 0;
+ attroff--)
+ {
+ if (bms_is_member(attroff, fk_attnums))
+ {
+ /*
+ * The specific columns involved in the foreign key are allowed to
+ * bridge multi-hop joins via EquivalenceClasses. However, they
+ * must not be needed in the final targetlist, which would require
+ * variable substitution.
+ */
+ if (bms_is_member(0, ref_rel->attr_needed[attroff]))
+ return false;
+
+ continue;
+ }
+
+ /*
+ * The non-key columns must not be used anywhere. See comments above.
+ */
+ if (!bms_is_empty(ref_rel->attr_needed[attroff]))
+ return false;
+ }
+
+ /* Compute the relid set for the join we are considering */
+ inputrelids = bms_make_singleton(fkinfo->con_relid);
+ inputrelids = bms_add_member(inputrelids, fkinfo->ref_relid);
+
+ /*
+ * Similarly check that the referenced rel isn't needed by any
+ * PlaceHolderVars that will be used above the join. The PHV case is a
+ * little bit more complicated, because PHVs may have been assigned a
+ * ph_eval_at location that includes the ref_rel, yet their contained
+ * expression might not actually reference the ref_rel (it could be just a
+ * constant, for instance). If such a PHV is due to be evaluated above
+ * the join then it needn't prevent inner join removal.
+ */
+ foreach_node(PlaceHolderInfo, phinfo, root->placeholder_list)
+ {
+ if (bms_overlap(phinfo->ph_lateral, ref_rel->relids))
+ return false; /* it references ref_rel laterally */
+ if (!bms_overlap(phinfo->ph_eval_at, ref_rel->relids))
+ continue; /* it definitely doesn't reference ref_rel */
+ if (bms_is_subset(phinfo->ph_needed, inputrelids))
+ continue; /* PHV is not used above the join */
+
+ /*
+ * We need to be sure there will still be a place to evaluate the PHV
+ * if we remove the join, ie that ph_eval_at wouldn't become empty.
+ * We've established above that ph_eval_at overlaps ref_rel->relids,
+ * so the only way removal empties it is if ref_rel is the sole
+ * member.
+ */
+ if (bms_membership(phinfo->ph_eval_at) == BMS_SINGLETON)
+ return false; /* there isn't any other place to eval PHV */
+ /* Check contained expression last, since this is a bit expensive */
+ if (bms_overlap(pull_varnos(root, (Node *) phinfo->ph_var->phexpr),
+ ref_rel->relids))
+ return false; /* contained expression references ref_rel */
+ }
+
+ /*
+ * If the referencing and referenced relations are separated by an outer
+ * join boundary, removing the inner join alters the null-padding
+ * semantics of the query.
+ *
+ * Additionally, if the referenced relation's columns are referenced in
+ * the RHS expressions of any semi-join, we must punt. We do not have a
+ * way to rewrite those references.
+ */
+ foreach_node(SpecialJoinInfo, sjinfo, root->join_info_list)
+ {
+ if ((bms_is_member(fkinfo->con_relid, sjinfo->syn_lefthand) ^
+ bms_is_member(fkinfo->ref_relid, sjinfo->syn_lefthand)) ||
+ (bms_is_member(fkinfo->con_relid, sjinfo->syn_righthand) ^
+ bms_is_member(fkinfo->ref_relid, sjinfo->syn_righthand)))
+ return false;
+
+ if (sjinfo->semi_rhs_exprs != NIL)
+ {
+ if (bms_is_member(fkinfo->ref_relid,
+ pull_varnos(root, (Node *) sjinfo->semi_rhs_exprs)))
+ return false;
+ }
+ }
+
+ /*
+ * If the referenced relation is referenced laterally by any other
+ * relation, punt. We do not have a way to rewrite those references.
+ */
+ if (root->hasLateralRTEs)
+ {
+ Index rti;
+
+ for (rti = 1; rti < root->simple_rel_array_size; rti++)
+ {
+ RelOptInfo *otherrel = root->simple_rel_array[rti];
+
+ if (otherrel == NULL || otherrel->relid == fkinfo->ref_relid)
+ continue;
+
+ if (otherrel->lateral_vars != NIL)
+ {
+ if (bms_is_member(fkinfo->ref_relid,
+ pull_varnos(root, (Node *) otherrel->lateral_vars)))
+ return false;
+ }
+ }
+ }
+
+ /*
+ * We must mathematically prove that every column in the referenced
+ * relation's foreign key is bound to the referencing relation's foreign
+ * key via a valid equality operator within an equivalence class.
+ */
+ for (colno = 0; colno < fkinfo->nkeys; colno++)
+ {
+ EquivalenceClass *ec;
+
+ ec = match_eclasses_to_foreign_key_col(root, fkinfo, colno, NULL);
+
+ if (ec == NULL)
+ return false;
+ }
+
+ /*
+ * We must also prove the inverse: the referenced relation must not
+ * participate in any EquivalenceClasses other than the ones that bridge
+ * it to the referencing relation's foreign key, and within those ECs no
+ * EquivalenceMember may mix ref_rel with other relations.
+ *
+ * We have already verified that the relation participates in 'nkeys'
+ * valid ECs bridging to the referencing relation. If it appears in more
+ * multi-rel ECs than that, it means a referenced key column is involved
+ * in an equality that the planner refused to merge with the foreign key's
+ * EC. This split occurs during collation mismatches, incompatible
+ * operator families, or if the key is wrapped in an expression more
+ * complex than a simple Var (e.g., ref_rel.pk + 1 = c.val). In those
+ * scenarios, removing the referenced relation would silently destroy the
+ * join condition of that separate EC, because the referencing relation's
+ * column is not present in it to take over the transitive equality.
+ *
+ * Single-rel ECs (sort/group expressions, or ECs left degenerate by a
+ * prior FK removal stripping their other side) are skipped: their
+ * surviving members all reference ref_rel exclusively and will be dropped
+ * during remove_rel_from_eclass. Skipping them is what lets a chain of
+ * FK removals succeed once the caller re-runs the scan.
+ *
+ * Within the bridging ECs, we also need to reject any EM that mentions
+ * ref_rel together with other relations (e.g., ref_rel.pk + other_rel.x).
+ * Such an EM would survive remove_rel_from_eclass with a non-empty
+ * em_relids, yet its expression still embeds a Var on the now-removed
+ * ref_rel.
+ */
+ i = -1;
+ while ((i = bms_next_member(ref_rel->eclass_indexes, i)) >= 0)
+ {
+ EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i);
+
+ if (bms_membership(ec->ec_relids) == BMS_MULTIPLE)
+ multirel_ec_count++;
+
+ foreach_node(EquivalenceMember, em, ec->ec_members)
+ {
+ if (bms_is_member(fkinfo->ref_relid, em->em_relids) &&
+ bms_membership(em->em_relids) == BMS_MULTIPLE)
+ return false;
+ }
+ }
+ if (multirel_ec_count != fkinfo->nkeys)
+ return false;
+
+ /* All checks passed */
+ return true;
+}
+
+/*
+ * inject_fk_not_null_quals
+ * Injects IS NOT NULL clauses into the referencing relation's restriction
+ * list for any foreign key columns that are not enforced as NOT NULL by the
+ * schema.
+ *
+ * When we remove a referenced relation (the inner join target) based on a
+ * foreign key, we lose the implicit filtering of NULLs that a standard inner
+ * join performs. To preserve mathematical equivalence, we must ensure the
+ * referencing relation does not emit rows with NULL foreign keys.
+ */
+static void
+inject_fk_not_null_quals(PlannerInfo *root, ForeignKeyOptInfo *fkinfo)
+{
+ RelOptInfo *con_rel;
+ RangeTblEntry *con_rte;
+ Bitmapset *notnullattnums;
+ Bitmapset *already_injected = NULL;
+ int i;
+
+ con_rel = find_base_rel(root, fkinfo->con_relid);
+ con_rte = root->simple_rte_array[fkinfo->con_relid];
+ Assert(con_rte->rtekind == RTE_RELATION);
+ Assert(!con_rte->inh);
+
+ /*
+ * Get the column not-null constraint information for the referencing
+ * relation.
+ */
+ notnullattnums = find_relation_notnullatts(root, con_rte->relid);
+
+ /*
+ * Collect attnos already covered by an IS NOT NULL clause on con_rel in
+ * baserestrictinfo, so we don't append a redundant qual. Such clauses
+ * may come from a prior FK removal for another FK on the same con_rel, a
+ * user-written IS NOT NULL clause, or other planner transformations that
+ * inject NullTest restrictions.
+ */
+ foreach_node(RestrictInfo, r, con_rel->baserestrictinfo)
+ {
+ NullTest *ntest;
+ Var *var;
+
+ if (!IsA(r->clause, NullTest))
+ continue;
+ ntest = (NullTest *) r->clause;
+ if (ntest->nulltesttype != IS_NOT_NULL ||
+ ntest->argisrow ||
+ !IsA(ntest->arg, Var))
+ continue;
+ var = (Var *) ntest->arg;
+ Assert(var->varno == fkinfo->con_relid);
+ if (var->varattno > 0)
+ already_injected = bms_add_member(already_injected, var->varattno);
+ }
+
+ for (i = 0; i < fkinfo->nkeys; i++)
+ {
+ AttrNumber con_attno = fkinfo->conkey[i];
+ Var *var;
+ NullTest *ntest;
+ RestrictInfo *rinfo;
+ Oid vartype;
+ int32 vartypmod;
+ Oid varcollid;
+
+ /* System columns are implicitly NOT NULL */
+ if (con_attno < 0)
+ continue;
+
+ /* Schema already guarantees the column is NOT NULL */
+ if (bms_is_member(con_attno, notnullattnums))
+ continue;
+
+ /* Already covered by an existing IS NOT NULL on con_rel */
+ if (bms_is_member(con_attno, already_injected))
+ continue;
+
+ get_atttypetypmodcoll(con_rte->relid,
+ con_attno,
+ &vartype,
+ &vartypmod,
+ &varcollid);
+
+ var = makeVar(fkinfo->con_relid,
+ con_attno,
+ vartype,
+ vartypmod,
+ varcollid,
+ 0);
+
+ ntest = makeNode(NullTest);
+ ntest->arg = (Expr *) var;
+ ntest->nulltesttype = IS_NOT_NULL;
+ ntest->argisrow = false;
+ ntest->location = -1;
+
+ rinfo = make_restrictinfo(root,
+ (Expr *) ntest,
+ true,
+ false,
+ false,
+ false,
+ root->qual_security_level,
+ bms_make_singleton(fkinfo->con_relid),
+ NULL,
+ NULL);
+
+ con_rel->baserestrictinfo = lappend(con_rel->baserestrictinfo, rinfo);
+ already_injected = bms_add_member(already_injected, con_attno);
+ }
+
+ bms_free(already_injected);
+}
+
+/*
+ * remove_referenced_rel_from_query
+ * Remove the referenced rel of a foreign key and references to it from
+ * the planner's data structures, having determined that there is no
+ * need to include it in the query.
+ *
+ * We are not terribly thorough here. We only bother to update parts of
+ * the planner's data structures that will actually be consulted later.
+ */
+static void
+remove_referenced_rel_from_query(PlannerInfo *root, ForeignKeyOptInfo *fkinfo)
+{
+ int relid = fkinfo->ref_relid;
+ RelOptInfo *ref_rel = find_base_rel(root, relid);
+ Relids joinrelids;
+
+ /* Compute the relid set for the join we are considering */
+ joinrelids = bms_make_singleton(fkinfo->con_relid);
+ joinrelids = bms_add_member(joinrelids, relid);
+
+ remove_rel_from_query(root, relid, -1, NULL, joinrelids);
+
+ bms_free(joinrelids);
+
+ /*
+ * Any join clause that survived in ref_rel->joininfo doesn't reference
+ * ref_rel, but its relid sets may still mention ref_rel because it was
+ * distributed there based on required_relids. Strip ref_rel from each
+ * such clause's relid sets so that surviving rels see a consistent state.
+ */
+ foreach_node(RestrictInfo, rinfo, ref_rel->joininfo)
+ {
+ Assert(!bms_is_member(ref_rel->relid, rinfo->clause_relids));
+
+ remove_rel_from_restrictinfo(rinfo, relid, 0);
+
+ Assert(bms_membership(rinfo->required_relids) == BMS_MULTIPLE);
+ }
+
+ /*
+ * There may be references to the rel in root->fkey_list, but if so,
+ * match_foreign_keys_to_quals() will get rid of them.
+ */
+
+ /*
+ * Now remove the rel from the baserel array to prevent it from being
+ * referenced again.
+ */
+ root->simple_rel_array[relid] = NULL;
+ root->simple_rte_array[relid] = NULL;
+
+ /* And nuke the RelOptInfo, just in case there's another access path */
+ pfree(ref_rel);
+
+ /*
+ * Now repeat construction of attr_needed bits coming from all other
+ * sources.
+ */
+ rebuild_placeholder_attr_needed(root);
+ rebuild_joinclause_attr_needed(root);
+ rebuild_eclass_attr_needed(root);
+ rebuild_lateral_attr_needed(root);
+}
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index b38422c47a4..a59bce6dacc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -4077,15 +4077,18 @@ match_foreign_keys_to_quals(PlannerInfo *root)
for (colno = 0; colno < fkinfo->nkeys; colno++)
{
EquivalenceClass *ec;
+ EquivalenceMember *em = NULL;
AttrNumber con_attno,
ref_attno;
Oid fpeqop;
ListCell *lc2;
- ec = match_eclasses_to_foreign_key_col(root, fkinfo, colno);
+ ec = match_eclasses_to_foreign_key_col(root, fkinfo, colno, &em);
/* Don't bother looking for loose quals if we got an EC match */
if (ec != NULL)
{
+ fkinfo->eclass[colno] = ec;
+ fkinfo->fk_eclass_member[colno] = em;
fkinfo->nmatched_ec++;
if (ec->ec_has_const)
fkinfo->nconst_ec++;
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 02495e22e24..194a84266e3 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -241,6 +241,14 @@ query_planner(PlannerInfo *root,
*/
joinlist = remove_useless_self_joins(root, joinlist);
+ /*
+ * Remove any useless inner joins based on foreign key constraints. This
+ * must be done after equivalence classes are finalized and foreign key
+ * lists are initialized, as the optimization relies on these structures
+ * to safely identify and eliminate redundant relations.
+ */
+ joinlist = remove_useless_inner_joins(root, joinlist);
+
/*
* Now distribute "placeholders" to base rels as needed. This has to be
* done after join removal because removal could change whether a
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 7c4be174869..99e3dc5bc92 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -662,6 +662,8 @@ get_relation_foreign_keys(PlannerInfo *root, RelOptInfo *rel,
info->con_relid = rel->relid;
info->ref_relid = rti;
info->nkeys = cachedfk->nkeys;
+ info->con_deferrable = cachedfk->condeferrable;
+ info->con_validated = cachedfk->convalidated;
memcpy(info->conkey, cachedfk->conkey, sizeof(info->conkey));
memcpy(info->confkey, cachedfk->confkey, sizeof(info->confkey));
memcpy(info->conpfeqop, cachedfk->conpfeqop, sizeof(info->conpfeqop));
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 0572ab424e7..4c4bc206db3 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -4774,7 +4774,9 @@ RelationGetFKeyList(Relation relation)
info->conoid = constraint->oid;
info->conrelid = constraint->conrelid;
info->confrelid = constraint->confrelid;
+ info->condeferrable = constraint->condeferrable;
info->conenforced = constraint->conenforced;
+ info->convalidated = constraint->convalidated;
DeconstructFkConstraintRow(htup, &info->nkeys,
info->conkey,
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index afaa058b046..04697f95acd 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -892,6 +892,13 @@
boot_val => 'true',
},
+{ name => 'enable_fk_inner_join_removal', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+ short_desc => 'Enables foreign-key-based inner-join removal.',
+ flags => 'GUC_EXPLAIN',
+ variable => 'enable_fk_inner_join_removal',
+ boot_val => 'true',
+},
+
{ name => 'enable_gathermerge', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
short_desc => 'Enables the planner\'s use of gather merge plans.',
flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index ac38cddaaf9..29a26c7b07d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -447,6 +447,7 @@
#enable_group_by_reordering = on
#enable_distinct_reordering = on
#enable_self_join_elimination = on
+#enable_fk_inner_join_removal = on
#enable_eager_aggregate = on
# - Planner Cost Constants -
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 27a2c6815b7..d67f68089e3 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1473,6 +1473,10 @@ typedef struct ForeignKeyOptInfo
Index ref_relid;
/* number of columns in the foreign key */
int nkeys;
+ /* Is the constraint deferrable? */
+ bool con_deferrable;
+ /* Is the constraint validated? */
+ bool con_validated;
/* cols in referencing table */
AttrNumber conkey[INDEX_MAX_KEYS] pg_node_attr(array_size(nkeys));
/* cols in referenced table */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 17f2099ec3b..51931cd2367 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -172,7 +172,8 @@ extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2,
Oid opfamily);
extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
ForeignKeyOptInfo *fkinfo,
- int colno);
+ int colno,
+ EquivalenceMember **fk_eclass_member);
extern RestrictInfo *find_derived_clause_for_ec_member(PlannerInfo *root,
EquivalenceClass *ec,
EquivalenceMember *em);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 71c043a25e8..b6b5b930831 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -21,6 +21,7 @@
#define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
extern PGDLLIMPORT double cursor_tuple_fraction;
extern PGDLLIMPORT bool enable_self_join_elimination;
+extern PGDLLIMPORT bool enable_fk_inner_join_removal;
/* query_planner callback to compute query_pathkeys */
typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
@@ -120,6 +121,7 @@ extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids,
JoinType jointype, List *restrictlist,
bool force_cache, List **extra_clauses);
extern List *remove_useless_self_joins(PlannerInfo *root, List *joinlist);
+extern List *remove_useless_inner_joins(PlannerInfo *root, List *joinlist);
/*
* prototypes for plan/setrefs.c
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index fa07ebf8ff7..483a66eb0a6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -284,9 +284,15 @@ typedef struct ForeignKeyCacheInfo
/* number of columns in the foreign key */
int nkeys;
+ /* Is deferrable ? */
+ bool condeferrable;
+
/* Is enforced ? */
bool conenforced;
+ /* Is validated ? */
+ bool convalidated;
+
/*
* these arrays each have nkeys valid entries:
*/
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 78bf022f7b4..d46fbf0f28a 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -10143,3 +10143,546 @@ SELECT COUNT(*) FROM onek t1 LEFT JOIN tenk1 t2
19000
(1 row)
+--
+-- Test useless inner join removal for foreign key referenced relations
+--
+CREATE TABLE fk_parent1 (id int PRIMARY KEY, val text);
+CREATE TABLE fk_parent2 (id int PRIMARY KEY, val text);
+CREATE TABLE fk_parent_text (id text PRIMARY KEY);
+CREATE TABLE fk_multi_parent (id1 int, id2 int, val text, PRIMARY KEY (id1, id2));
+CREATE TABLE fk_child (
+ id int PRIMARY KEY,
+ p1_id int NOT NULL REFERENCES fk_parent1(id),
+ p2_id int REFERENCES fk_parent2(id),
+ p_text text REFERENCES fk_parent_text(id),
+ p_m1 int,
+ p_m2 int,
+ val text,
+ FOREIGN KEY (p_m1, p_m2) REFERENCES fk_multi_parent(id1, id2)
+);
+INSERT INTO fk_parent1 VALUES (1, 'p1_1'), (2, 'p1_2');
+INSERT INTO fk_parent2 VALUES (1, 'p2_1'), (2, 'p2_2');
+INSERT INTO fk_parent_text VALUES ('t1'), ('t2');
+INSERT INTO fk_multi_parent VALUES (1, 1, 'm1'), (2, 2, 'm2');
+INSERT INTO fk_child VALUES
+ (1, 1, 1, 't1', 1, 1, 'c1'),
+ (2, 2, NULL, 't2', 2, 2, 'c2'),
+ (3, 1, 1, 't1', 1, 1, 'c3');
+ANALYZE fk_parent1;
+ANALYZE fk_parent2;
+ANALYZE fk_parent_text;
+ANALYZE fk_multi_parent;
+ANALYZE fk_child;
+-- Ensure that fk_parent1 is removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ORDER BY c.id;
+ QUERY PLAN
+------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+(3 rows)
+
+-- Ensure that it returns all 3 rows
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ORDER BY c.id;
+ id | val
+----+-----
+ 1 | c1
+ 2 | c2
+ 3 | c3
+(3 rows)
+
+-- Ensure that fk_parent2 is removed, and an IS NOT NULL qual is injected
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+ORDER BY c.id;
+ QUERY PLAN
+-------------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+ Filter: (p2_id IS NOT NULL)
+(4 rows)
+
+-- Ensure that it returns row 1 and row 3
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+ORDER BY c.id;
+ id | val
+----+-----
+ 1 | c1
+ 3 | c3
+(2 rows)
+
+-- Ensure we do not have redundant IS NOT NULL qual
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+WHERE c.p2_id IS NOT NULL
+ORDER BY c.id;
+ QUERY PLAN
+-------------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+ Filter: (p2_id IS NOT NULL)
+(4 rows)
+
+-- Ensure that it returns row 1 and row 3
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+WHERE c.p2_id IS NOT NULL
+ORDER BY c.id;
+ id | val
+----+-----
+ 1 | c1
+ 3 | c3
+(2 rows)
+
+-- Ensure that both fk_parent1 and fk_parent2 are removed
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON c.p2_id = p2.id
+ORDER BY c.id;
+ QUERY PLAN
+-------------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+ Filter: (p2_id IS NOT NULL)
+(4 rows)
+
+-- Ensure that it returns rows 1 and 3
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON c.p2_id = p2.id
+ORDER BY c.id;
+ id
+----
+ 1
+ 3
+(2 rows)
+
+-- Ensure that fk_parent1 is removed, leaving c1 joined to c2
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON p.id = c2.p1_id
+ORDER BY c1.id, c2.id;
+ QUERY PLAN
+-------------------------------------------
+ Sort
+ Sort Key: c1.id, c2.id
+ -> Hash Join
+ Hash Cond: (c1.p1_id = c2.p1_id)
+ -> Seq Scan on fk_child c1
+ -> Hash
+ -> Seq Scan on fk_child c2
+(7 rows)
+
+-- Ensure that we get 1x1, 1x3, 3x1, 3x3, 2x2
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON p.id = c2.p1_id
+ORDER BY c1.id, c2.id;
+ id | id
+----+----
+ 1 | 1
+ 1 | 3
+ 2 | 2
+ 3 | 1
+ 3 | 3
+(5 rows)
+
+-- Multi-column FK, ensure that fk_multi_parent is removed
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1 AND c.p_m2 = p.id2
+ORDER BY c.id;
+ QUERY PLAN
+-------------------------------------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+ Filter: ((p_m1 IS NOT NULL) AND (p_m2 IS NOT NULL))
+(4 rows)
+
+-- Ensure that we get all 3 rows
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1 AND c.p_m2 = p.id2
+ORDER BY c.id;
+ id
+----
+ 1
+ 2
+ 3
+(3 rows)
+
+-- Chain-shaped FK removal: c -> p1 -> p2
+ALTER TABLE fk_parent1
+ ADD COLUMN p2_id int NOT NULL DEFAULT 1 REFERENCES fk_parent2(id);
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON p1.p2_id = p2.id
+ORDER BY c.id;
+ QUERY PLAN
+------------------------------
+ Sort
+ Sort Key: c.id
+ -> Seq Scan on fk_child c
+(3 rows)
+
+-- Ensure that we get all 3 rows
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON p1.p2_id = p2.id
+ORDER BY c.id;
+ id
+----
+ 1
+ 2
+ 3
+(3 rows)
+
+ALTER TABLE fk_parent1 DROP COLUMN p2_id;
+-- LEFT JOIN ON-clause in joininfo does not reference fk_parent2, so FK-removal
+-- should still fire
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = c.id
+ORDER BY c.id;
+ QUERY PLAN
+-------------------------------------------------
+ Sort
+ Sort Key: c.id
+ -> Nested Loop Left Join
+ Join Filter: (p1.id = c.id)
+ -> Seq Scan on fk_parent1 p1
+ -> Materialize
+ -> Seq Scan on fk_child c
+ Filter: (p2_id IS NOT NULL)
+(8 rows)
+
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = c.id
+ORDER BY c.id;
+ id
+----
+ 1
+
+(2 rows)
+
+-- fk_parent1 cannot be removed because p.val is selected
+EXPLAIN (COSTS OFF)
+SELECT c.id, p.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_child c
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+-- fk_parent1 cannot be removed because p.val is filtered
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE p.val = 'p1_1';
+ QUERY PLAN
+--------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_parent1 p
+ Filter: (val = 'p1_1'::text)
+ -> Seq Scan on fk_child c
+(5 rows)
+
+-- fk_parent1 cannot be removed because of TABLESAMPLE on the referenced
+-- relation
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p TABLESAMPLE BERNOULLI(50) REPEATABLE(1)
+ ON c.p1_id = p.id;
+ QUERY PLAN
+-----------------------------------------------------------------------------
+ Nested Loop
+ Join Filter: (c.p1_id = p.id)
+ -> Sample Scan on fk_parent1 p
+ Sampling: bernoulli ('50'::real) REPEATABLE ('1'::double precision)
+ -> Seq Scan on fk_child c
+(5 rows)
+
+-- fk_parent1 cannot be removed because p.id is in the targetlist
+EXPLAIN (COSTS OFF)
+SELECT c.id, p.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_child c
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+-- fk_parent1 cannot be removed because p.id is in the filter
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE p.id > 1;
+ QUERY PLAN
+---------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_parent1 p
+ Filter: (id > 1)
+ -> Seq Scan on fk_child c
+(5 rows)
+
+-- fk_parent1 cannot be removed because p.id is in the join clause
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE c.id > p.id;
+ QUERY PLAN
+--------------------------------------
+ Hash Join
+ Hash Cond: (c.p1_id = p.id)
+ Join Filter: (c.id > p.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_parent1 p
+(6 rows)
+
+-- fk_multi_parent cannot be removed because not all foreign key columns are
+-- matched
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1;
+ QUERY PLAN
+-------------------------------------------
+ Hash Join
+ Hash Cond: (c.p_m1 = p.id1)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_multi_parent p
+(5 rows)
+
+-- fk_parent2 cannot be removed because the LEFT JOIN ON-clause references
+-- p2.id
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = p2.id;
+ QUERY PLAN
+---------------------------------------------
+ Hash Right Join
+ Hash Cond: (p2.id = p1.id)
+ -> Hash Join
+ Hash Cond: (c.p2_id = p2.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_parent2 p2
+ -> Hash
+ -> Seq Scan on fk_parent1 p1
+(9 rows)
+
+-- fk_parent1 cannot be removed because p.id is referenced in lateral_vars
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p ON c.p1_id = p.id
+ CROSS JOIN LATERAL (SELECT p.id OFFSET 0);
+ QUERY PLAN
+--------------------------------------------
+ Hash Join
+ Hash Cond: (c.p1_id = p.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Nested Loop
+ -> Seq Scan on fk_parent1 p
+ -> Result
+(7 rows)
+
+-- fk_parent2 cannot be removed because p2.id is referenced in the semi-join
+-- RHS expressions
+EXPLAIN (COSTS OFF)
+SELECT * FROM fk_parent1 p1
+WHERE p1.id IN
+ (SELECT p2.id FROM fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id);
+ QUERY PLAN
+---------------------------------------------------
+ Nested Loop Semi Join
+ Join Filter: (p1.id = c.p2_id)
+ -> Seq Scan on fk_parent1 p1
+ -> Materialize
+ -> Hash Join
+ Hash Cond: (c.p2_id = p2.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_parent2 p2
+(9 rows)
+
+-- fk_parent_text cannot be removed due to the COLLATE "C" mismatch splitting
+-- the EC
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent_text p ON c1.p_text = p.id
+ JOIN fk_child c2 ON p.id COLLATE "C" = c2.p_text COLLATE "C";
+ QUERY PLAN
+-------------------------------------------------
+ Hash Join
+ Hash Cond: ((p.id)::text = (c2.p_text)::text)
+ -> Nested Loop
+ Join Filter: (p.id = c1.p_text)
+ -> Seq Scan on fk_child c1
+ -> Materialize
+ -> Seq Scan on fk_parent_text p
+ -> Hash
+ -> Seq Scan on fk_child c2
+(9 rows)
+
+-- fk_parent1 cannot be removed because p.id + 0 splits the EC
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON (p.id + 0) = c2.p1_id;
+ QUERY PLAN
+--------------------------------------------
+ Hash Join
+ Hash Cond: ((p.id + 0) = c2.p1_id)
+ -> Nested Loop
+ Join Filter: (p.id = c1.p1_id)
+ -> Seq Scan on fk_child c1
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+ -> Hash
+ -> Seq Scan on fk_child c2
+(9 rows)
+
+-- fk_parent1 cannot be removed due to the multi-rel EM (p.id + c2.p1_id)
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON c1.p1_id = (p.id + c2.p1_id);
+ QUERY PLAN
+-------------------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c1.p1_id)
+ -> Nested Loop
+ Join Filter: ((p.id + c2.p1_id) = p.id)
+ -> Seq Scan on fk_child c2
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+ -> Seq Scan on fk_child c1
+(8 rows)
+
+-- Ensure that the right join is not removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c RIGHT JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Hash Right Join
+ Hash Cond: (c.p1_id = p.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+-- Ensure that the full join is not removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c FULL JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Hash Full Join
+ Hash Cond: (c.p1_id = p.id)
+ -> Seq Scan on fk_child c
+ -> Hash
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+-- fk_parent1 cannot be removed because it is separated with fk_child by the
+-- outer join boundary
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN (fk_parent1 p1
+ LEFT JOIN fk_parent2 p2 ON TRUE)
+ ON c.p1_id = p1.id;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop Left Join
+ -> Nested Loop
+ Join Filter: (p1.id = c.p1_id)
+ -> Seq Scan on fk_child c
+ -> Materialize
+ -> Seq Scan on fk_parent1 p1
+ -> Materialize
+ -> Seq Scan on fk_parent2 p2
+(8 rows)
+
+-- Deferrable FK: fk_parent1 cannot be removed because constraint is deferrable
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) DEFERRABLE;
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_child c
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+-- NOT VALID FK: fk_parent1 cannot be removed because constraint is NOT VALID
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) NOT VALID;
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ QUERY PLAN
+--------------------------------------
+ Nested Loop
+ Join Filter: (p.id = c.p1_id)
+ -> Seq Scan on fk_child c
+ -> Materialize
+ -> Seq Scan on fk_parent1 p
+(5 rows)
+
+DROP TABLE fk_child;
+DROP TABLE fk_multi_parent;
+DROP TABLE fk_parent_text;
+DROP TABLE fk_parent2;
+DROP TABLE fk_parent1;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 132b56a5864..f404bac4b0d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -159,6 +159,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_bitmapscan | on
enable_distinct_reordering | on
enable_eager_aggregate | on
+ enable_fk_inner_join_removal | on
enable_gathermerge | on
enable_group_by_reordering | on
enable_hashagg | on
@@ -180,7 +181,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(25 rows)
+(26 rows)
-- There are always wait event descriptions for various types. InjectionPoint
-- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index fae19113cef..b62f5ac6c0b 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -3891,3 +3891,281 @@ SELECT COUNT(*) FROM onek t1 LEFT JOIN tenk1 t2
ON (t2.thousand = t1.tenthous OR t2.thousand = t1.thousand);
SELECT COUNT(*) FROM onek t1 LEFT JOIN tenk1 t2
ON (t2.thousand = t1.tenthous OR t2.thousand = t1.thousand);
+
+--
+-- Test useless inner join removal for foreign key referenced relations
+--
+
+CREATE TABLE fk_parent1 (id int PRIMARY KEY, val text);
+CREATE TABLE fk_parent2 (id int PRIMARY KEY, val text);
+CREATE TABLE fk_parent_text (id text PRIMARY KEY);
+CREATE TABLE fk_multi_parent (id1 int, id2 int, val text, PRIMARY KEY (id1, id2));
+
+CREATE TABLE fk_child (
+ id int PRIMARY KEY,
+ p1_id int NOT NULL REFERENCES fk_parent1(id),
+ p2_id int REFERENCES fk_parent2(id),
+ p_text text REFERENCES fk_parent_text(id),
+ p_m1 int,
+ p_m2 int,
+ val text,
+ FOREIGN KEY (p_m1, p_m2) REFERENCES fk_multi_parent(id1, id2)
+);
+
+INSERT INTO fk_parent1 VALUES (1, 'p1_1'), (2, 'p1_2');
+INSERT INTO fk_parent2 VALUES (1, 'p2_1'), (2, 'p2_2');
+INSERT INTO fk_parent_text VALUES ('t1'), ('t2');
+INSERT INTO fk_multi_parent VALUES (1, 1, 'm1'), (2, 2, 'm2');
+
+INSERT INTO fk_child VALUES
+ (1, 1, 1, 't1', 1, 1, 'c1'),
+ (2, 2, NULL, 't2', 2, 2, 'c2'),
+ (3, 1, 1, 't1', 1, 1, 'c3');
+
+ANALYZE fk_parent1;
+ANALYZE fk_parent2;
+ANALYZE fk_parent_text;
+ANALYZE fk_multi_parent;
+ANALYZE fk_child;
+
+-- Ensure that fk_parent1 is removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ORDER BY c.id;
+
+-- Ensure that it returns all 3 rows
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ORDER BY c.id;
+
+-- Ensure that fk_parent2 is removed, and an IS NOT NULL qual is injected
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+ORDER BY c.id;
+
+-- Ensure that it returns row 1 and row 3
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+ORDER BY c.id;
+
+-- Ensure we do not have redundant IS NOT NULL qual
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+WHERE c.p2_id IS NOT NULL
+ORDER BY c.id;
+
+-- Ensure that it returns row 1 and row 3
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent2 p ON c.p2_id = p.id
+WHERE c.p2_id IS NOT NULL
+ORDER BY c.id;
+
+-- Ensure that both fk_parent1 and fk_parent2 are removed
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON c.p2_id = p2.id
+ORDER BY c.id;
+
+-- Ensure that it returns rows 1 and 3
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON c.p2_id = p2.id
+ORDER BY c.id;
+
+-- Ensure that fk_parent1 is removed, leaving c1 joined to c2
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON p.id = c2.p1_id
+ORDER BY c1.id, c2.id;
+
+-- Ensure that we get 1x1, 1x3, 3x1, 3x3, 2x2
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON p.id = c2.p1_id
+ORDER BY c1.id, c2.id;
+
+-- Multi-column FK, ensure that fk_multi_parent is removed
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1 AND c.p_m2 = p.id2
+ORDER BY c.id;
+
+-- Ensure that we get all 3 rows
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1 AND c.p_m2 = p.id2
+ORDER BY c.id;
+
+-- Chain-shaped FK removal: c -> p1 -> p2
+ALTER TABLE fk_parent1
+ ADD COLUMN p2_id int NOT NULL DEFAULT 1 REFERENCES fk_parent2(id);
+
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON p1.p2_id = p2.id
+ORDER BY c.id;
+
+-- Ensure that we get all 3 rows
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p1 ON c.p1_id = p1.id
+ JOIN fk_parent2 p2 ON p1.p2_id = p2.id
+ORDER BY c.id;
+
+ALTER TABLE fk_parent1 DROP COLUMN p2_id;
+
+-- LEFT JOIN ON-clause in joininfo does not reference fk_parent2, so FK-removal
+-- should still fire
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = c.id
+ORDER BY c.id;
+
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = c.id
+ORDER BY c.id;
+
+-- fk_parent1 cannot be removed because p.val is selected
+EXPLAIN (COSTS OFF)
+SELECT c.id, p.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+
+-- fk_parent1 cannot be removed because p.val is filtered
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE p.val = 'p1_1';
+
+-- fk_parent1 cannot be removed because of TABLESAMPLE on the referenced
+-- relation
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p TABLESAMPLE BERNOULLI(50) REPEATABLE(1)
+ ON c.p1_id = p.id;
+
+-- fk_parent1 cannot be removed because p.id is in the targetlist
+EXPLAIN (COSTS OFF)
+SELECT c.id, p.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+
+-- fk_parent1 cannot be removed because p.id is in the filter
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE p.id > 1;
+
+-- fk_parent1 cannot be removed because p.id is in the join clause
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+WHERE c.id > p.id;
+
+-- fk_multi_parent cannot be removed because not all foreign key columns are
+-- matched
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_multi_parent p ON c.p_m1 = p.id1;
+
+-- fk_parent2 cannot be removed because the LEFT JOIN ON-clause references
+-- p2.id
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_parent1 p1
+ LEFT JOIN (fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id)
+ ON p1.id = p2.id;
+
+-- fk_parent1 cannot be removed because p.id is referenced in lateral_vars
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN fk_parent1 p ON c.p1_id = p.id
+ CROSS JOIN LATERAL (SELECT p.id OFFSET 0);
+
+-- fk_parent2 cannot be removed because p2.id is referenced in the semi-join
+-- RHS expressions
+EXPLAIN (COSTS OFF)
+SELECT * FROM fk_parent1 p1
+WHERE p1.id IN
+ (SELECT p2.id FROM fk_child c JOIN fk_parent2 p2 ON c.p2_id = p2.id);
+
+-- fk_parent_text cannot be removed due to the COLLATE "C" mismatch splitting
+-- the EC
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent_text p ON c1.p_text = p.id
+ JOIN fk_child c2 ON p.id COLLATE "C" = c2.p_text COLLATE "C";
+
+-- fk_parent1 cannot be removed because p.id + 0 splits the EC
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON (p.id + 0) = c2.p1_id;
+
+-- fk_parent1 cannot be removed due to the multi-rel EM (p.id + c2.p1_id)
+EXPLAIN (COSTS OFF)
+SELECT c1.id, c2.id
+FROM fk_child c1
+ JOIN fk_parent1 p ON c1.p1_id = p.id
+ JOIN fk_child c2 ON c1.p1_id = (p.id + c2.p1_id);
+
+-- Ensure that the right join is not removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c RIGHT JOIN fk_parent1 p ON c.p1_id = p.id;
+
+-- Ensure that the full join is not removed
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c FULL JOIN fk_parent1 p ON c.p1_id = p.id;
+
+-- fk_parent1 cannot be removed because it is separated with fk_child by the
+-- outer join boundary
+EXPLAIN (COSTS OFF)
+SELECT c.id
+FROM fk_child c
+ JOIN (fk_parent1 p1
+ LEFT JOIN fk_parent2 p2 ON TRUE)
+ ON c.p1_id = p1.id;
+
+-- Deferrable FK: fk_parent1 cannot be removed because constraint is deferrable
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) DEFERRABLE;
+
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+
+-- NOT VALID FK: fk_parent1 cannot be removed because constraint is NOT VALID
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) NOT VALID;
+
+EXPLAIN (COSTS OFF)
+SELECT c.id, c.val
+FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+
+DROP TABLE fk_child;
+DROP TABLE fk_multi_parent;
+DROP TABLE fk_parent_text;
+DROP TABLE fk_parent2;
+DROP TABLE fk_parent1;
--
2.39.5 (Apple Git-154)
[application/octet-stream] v5-0002-Suppress-FK-based-inner-join-removal-during-the-t.patch (14.4K, 3-v5-0002-Suppress-FK-based-inner-join-removal-during-the-t.patch)
download | inline diff:
From 7e0a286b85a5a304a48963b00eddf76ca8b02268 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Wed, 11 Mar 2026 16:26:12 +0900
Subject: [PATCH v5 2/2] Suppress FK-based inner join removal during the
trigger gap
FK constraints are enforced via AFTER ROW triggers, creating a window
(the "trigger gap") between when a row is physically modified and when
the enforcement trigger fires. Within this window the FK invariant is
locally false in the heap. It becomes user-observable through any
query that runs against a snapshot captured inside the window -- for
example, a SELECT inside a plpgsql function invoked from RETURNING, a
query inside an AFTER ROW trigger body, or a fetch from a cursor
opened inside the window. The optimization introduced in the previous
commit must therefore be disabled whenever a query could be executed
against a snapshot that captures gap-window state.
A predicate that only inspects the current trigger queue is not
sufficient. A snapshot captured during the gap can outlive the gap's
closure -- for example, a cursor frozen via CopySnapshot inside the
window, or a cached plan reused at execution time against a snapshot
that was captured inside the window. A query executed against such a
stale snapshot can still observe the inconsistent state even after
triggers have fired and the queue has drained. The predicate guarding
the optimization therefore has to remain positive at least as long as
any snapshot in this transaction could still be referenced.
RowExclusiveLock has exactly that lifetime: it is acquired during
parse analysis of any DML on the rel and released only at end of
transaction, which trivially exceeds the lifetime of any in-xact
snapshot. The optimization is therefore disabled whenever either side
of the FK is RowExclusiveLock'd by this backend. The cost is
pessimism in cases where the lock outlives the actual gap window:
after the DML completes within the same transaction, after ROLLBACK TO
a savepoint, or when RowExclusiveLock is held for unrelated reasons
(e.g., LOCK TABLE). In those cases the query falls back to executing
the join normally; correctness is preserved.
For cached plans, the plan cache forces replanning when a previously
cached generic plan with FK-based join removal might be affected by an
active trigger gap. The OIDs of the relations involved in FK join
removal are propagated through PlannerGlobal and PlannedStmt so that
choose_custom_plan() can perform this check efficiently.
---
src/backend/optimizer/plan/analyzejoins.c | 53 +++++++++++++++++++++++
src/backend/optimizer/plan/planner.c | 2 +
src/backend/utils/cache/plancache.c | 34 +++++++++++++++
src/include/nodes/pathnodes.h | 3 ++
src/include/nodes/plannodes.h | 3 ++
src/include/utils/plancache.h | 1 +
src/test/regress/expected/join.out | 40 +++++++++++++++++
src/test/regress/sql/join.sql | 30 +++++++++++++
8 files changed, 166 insertions(+)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 4c5c78dcbfc..4d9b45d71db 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -35,6 +35,7 @@
#include "optimizer/restrictinfo.h"
#include "parser/parse_agg.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/lsyscache.h"
/*
@@ -2627,6 +2628,18 @@ restart:
if (!inner_join_is_removable(root, fkinfo))
continue;
+ /*
+ * Record the OIDs of both sides of the FK so that the plan cache can
+ * detect when a cached plan with FK-based join removal needs
+ * replanning due to the trigger gap becoming possible.
+ */
+ root->glob->fkRemovedRelOids =
+ list_append_unique_oid(root->glob->fkRemovedRelOids,
+ root->simple_rte_array[fkinfo->con_relid]->relid);
+ root->glob->fkRemovedRelOids =
+ list_append_unique_oid(root->glob->fkRemovedRelOids,
+ root->simple_rte_array[fkinfo->ref_relid]->relid);
+
/* Inject IS NOT NULL clauses for nullable foreign key columns */
inject_fk_not_null_quals(root, fkinfo);
@@ -2694,6 +2707,8 @@ inner_join_is_removable(PlannerInfo *root, ForeignKeyOptInfo *fkinfo)
{
RelOptInfo *con_rel;
RelOptInfo *ref_rel;
+ Oid con_reloid;
+ Oid ref_reloid;
int attroff;
Relids inputrelids;
Bitmapset *fk_attnums = NULL;
@@ -2750,6 +2765,44 @@ inner_join_is_removable(PlannerInfo *root, ForeignKeyOptInfo *fkinfo)
ref_rel->reloptkind != RELOPT_BASEREL)
return false;
+ con_reloid = root->simple_rte_array[fkinfo->con_relid]->relid;
+ ref_reloid = root->simple_rte_array[fkinfo->ref_relid]->relid;
+
+ /*
+ * FK constraints are enforced via AFTER ROW triggers, creating a window
+ * (the "trigger gap") between when a row is physically modified and when
+ * the FK enforcement trigger fires. Within this window the FK invariant
+ * is locally false in the heap. It becomes user-observable through any
+ * query that runs against a snapshot captured inside the window -- for
+ * example, a SELECT inside a plpgsql function invoked from RETURNING, a
+ * query inside an AFTER ROW trigger body, or a fetch from a cursor opened
+ * inside the window.
+ *
+ * The predicate guarding the optimization must remain positive at least
+ * as long as any snapshot in this transaction could still be referenced,
+ * because a snapshot captured during the gap can outlive the gap's
+ * closure -- for example, a cursor frozen via CopySnapshot inside the
+ * window, or a cached plan reused at execution time against a snapshot
+ * that was captured inside the window. Any FK-removed plan executed
+ * against such a stale snapshot would observe the inconsistent state and
+ * produce wrong results, even if the trigger queue has long since
+ * drained. Predicates tied to the active DML's own lifetime, such as a
+ * per-statement counter or AfterTriggerPendingOnRel(), go silent once the
+ * gap closes and would leave stale-snapshot executions unguarded.
+ *
+ * RowExclusiveLock has exactly the right lifetime: it is acquired during
+ * parse analysis of any DML on the rel and released only at end of
+ * transaction, which trivially exceeds the lifetime of any in-transaction
+ * snapshot. The pessimism this introduces -- skipping the optimization
+ * for the rest of the transaction after the DML completes, or when
+ * RowExclusiveLock is held for other reasons such as LOCK TABLE -- is the
+ * price of that lifetime guarantee. In those cases the query falls back
+ * to executing the join normally.
+ */
+ if (CheckRelationOidLockedByMe(con_reloid, RowExclusiveLock, true) ||
+ CheckRelationOidLockedByMe(ref_reloid, RowExclusiveLock, true))
+ return false;
+
/*
* If the referenced relation has any restriction clauses, they act as
* explicit filters. Since we cannot perform variable substitution to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f4689e7c9f8..4f66747d83f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -385,6 +385,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
glob->partPruneInfos = NIL;
glob->relationOids = NIL;
glob->invalItems = NIL;
+ glob->fkRemovedRelOids = NIL;
glob->paramExecTypes = NIL;
glob->lastPHId = 0;
glob->lastRowMarkId = 0;
@@ -693,6 +694,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->relationOids = glob->relationOids;
result->invalItems = glob->invalItems;
+ result->fkRemovedRelOids = glob->fkRemovedRelOids;
result->paramExecTypes = glob->paramExecTypes;
/* utilityStmt should be null, but we might as well copy it */
result->utilityStmt = parse->utilityStmt;
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 698e7c1aa22..f5350ff0302 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1132,6 +1132,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
*/
plan->planRoleId = GetUserId();
plan->dependsOnRole = plansource->dependsOnRLS;
+ plan->hasFKJoinRemoval = false;
is_transient = false;
foreach(lc, plist)
{
@@ -1144,6 +1145,8 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
is_transient = true;
if (plannedstmt->dependsOnRole)
plan->dependsOnRole = true;
+ if (plannedstmt->fkRemovedRelOids != NIL)
+ plan->hasFKJoinRemoval = true;
}
if (is_transient)
{
@@ -1180,6 +1183,37 @@ choose_custom_plan(CachedPlanSource *plansource, ParamListInfo boundParams)
if (plansource->is_oneshot)
return true;
+ /*
+ * If the generic plan used FK-based inner join removal, check whether any
+ * of the relations involved are currently being modified in this
+ * transaction (detected by RowExclusiveLock). If so, we must force a
+ * custom plan: an FK-removed plan executed against a snapshot that still
+ * sees the trigger-gap state would produce incorrect results. This check
+ * uses the same predicate as inner_join_is_removable() at planning time;
+ * see that function for the correctness argument.
+ *
+ * This check must come before all other early returns, because
+ * correctness requires replanning regardless of parameter availability,
+ * GUC settings, or cursor options.
+ *
+ * We use the hasFKJoinRemoval flag for a fast-path check so that plans
+ * without FK join removal (the common case) pay only a single boolean
+ * test.
+ */
+ if (plansource->gplan != NULL && plansource->gplan->hasFKJoinRemoval)
+ {
+ foreach_node(PlannedStmt, stmt, plansource->gplan->stmt_list)
+ {
+ if (stmt->commandType == CMD_UTILITY)
+ continue;
+ foreach_oid(reloid, stmt->fkRemovedRelOids)
+ {
+ if (CheckRelationOidLockedByMe(reloid, RowExclusiveLock, true))
+ return true;
+ }
+ }
+ }
+
/* Otherwise, never any point in a custom plan if there's no parameters */
if (boundParams == NULL)
return false;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index d67f68089e3..604df3bd5dd 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -229,6 +229,9 @@ typedef struct PlannerGlobal
/* other dependencies, as PlanInvalItems */
List *invalItems;
+ /* OIDs of relations involved in FK-based inner join removal */
+ List *fkRemovedRelOids;
+
/* type OIDs for PARAM_EXEC Params */
List *paramExecTypes;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 14a1dfed2b9..63af469a981 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -146,6 +146,9 @@ typedef struct PlannedStmt
/* other dependencies, as PlanInvalItems */
List *invalItems;
+ /* OIDs of relations involved in FK-based inner join removal */
+ List *fkRemovedRelOids;
+
/* type OIDs for PARAM_EXEC Params */
List *paramExecTypes;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 7a4a85c8038..8130cc3e1a3 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -165,6 +165,7 @@ typedef struct CachedPlan
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
+ bool hasFKJoinRemoval; /* does plan use FK-based join removal? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
* changes from this value */
int generation; /* parent's generation number for this plan */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index d46fbf0f28a..174914a6a89 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -10681,6 +10681,46 @@ FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
-> Seq Scan on fk_parent1 p
(5 rows)
+-- Trigger gap: fk_parent1 cannot be removed when RowExclusiveLock is held
+-- during active DML on a FK-related table
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) ON DELETE CASCADE;
+CREATE FUNCTION trigger_gap_test_fn() RETURNS int LANGUAGE plpgsql AS $$
+DECLARE
+ join_count int;
+ plan_line text;
+BEGIN
+ SELECT count(*) INTO join_count
+ FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ RAISE NOTICE 'Join Count: %', join_count;
+
+ RAISE NOTICE '--- EXPLAIN PLAN ---';
+ FOR plan_line IN
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ LOOP
+ RAISE NOTICE '%', plan_line;
+ END LOOP;
+
+ RETURN join_count;
+END;
+$$;
+DELETE FROM fk_parent1 WHERE id = 1 RETURNING trigger_gap_test_fn();
+NOTICE: Join Count: 1
+NOTICE: --- EXPLAIN PLAN ---
+NOTICE: Aggregate
+NOTICE: -> Nested Loop
+NOTICE: Join Filter: (p.id = c.p1_id)
+NOTICE: -> Seq Scan on fk_child c
+NOTICE: -> Materialize
+NOTICE: -> Seq Scan on fk_parent1 p
+ trigger_gap_test_fn
+---------------------
+ 1
+(1 row)
+
+DROP FUNCTION trigger_gap_test_fn();
DROP TABLE fk_child;
DROP TABLE fk_multi_parent;
DROP TABLE fk_parent_text;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index b62f5ac6c0b..0cb2999bf98 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -4164,6 +4164,36 @@ EXPLAIN (COSTS OFF)
SELECT c.id, c.val
FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+-- Trigger gap: fk_parent1 cannot be removed when RowExclusiveLock is held
+-- during active DML on a FK-related table
+ALTER TABLE fk_child DROP CONSTRAINT fk_child_p1_id_fkey;
+ALTER TABLE fk_child ADD CONSTRAINT fk_child_p1_id_fkey
+ FOREIGN KEY (p1_id) REFERENCES fk_parent1(id) ON DELETE CASCADE;
+
+CREATE FUNCTION trigger_gap_test_fn() RETURNS int LANGUAGE plpgsql AS $$
+DECLARE
+ join_count int;
+ plan_line text;
+BEGIN
+ SELECT count(*) INTO join_count
+ FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id;
+ RAISE NOTICE 'Join Count: %', join_count;
+
+ RAISE NOTICE '--- EXPLAIN PLAN ---';
+ FOR plan_line IN
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM fk_child c JOIN fk_parent1 p ON c.p1_id = p.id
+ LOOP
+ RAISE NOTICE '%', plan_line;
+ END LOOP;
+
+ RETURN join_count;
+END;
+$$;
+
+DELETE FROM fk_parent1 WHERE id = 1 RETURNING trigger_gap_test_fn();
+DROP FUNCTION trigger_gap_test_fn();
+
DROP TABLE fk_child;
DROP TABLE fk_multi_parent;
DROP TABLE fk_parent_text;
--
2.39.5 (Apple Git-154)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Remove inner joins based on foreign keys
In-Reply-To: <CAMbWs4_UJd5QxYq7KaqO=TOWmS=2opUrhC1Oc=Dbwkcn2mXk6g@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox