public inbox for [email protected]
help / color / mirror / Atom feedThe bogus calls in remove_self_join_rel()
4+ messages / 3 participants
[nested] [flat]
* The bogus calls in remove_self_join_rel()
@ 2026-04-23 02:45 Richard Guo <[email protected]>
0 siblings, 2 replies; 4+ messages in thread
From: Richard Guo @ 2026-04-23 02:45 UTC (permalink / raw)
To: Pg Hackers <[email protected]>
I noticed these two calls in remove_self_join_rel():
adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
There's no comment explaining them, and as far as I can tell they do
nothing: adjust_relid_set returns a Relids and does not modify the
input in place.
Rather than make the calls do the cleanup they pretend to do, I think
a better way is to replace them with assertions: toRemove->relid is
not a member of either set. This is true as these two sets contain
only parse->resultRelation (rejected as an SJE candidate to preserve
EvalPlanQual) and inheritance children of the target, which never
appear in the joinlist that SJE scans for candidates.
Thoughts?
- Richard
Attachments:
[application/octet-stream] v1-0001-Fix-bogus-calls-in-remove_self_join_rel.patch (2.7K, 2-v1-0001-Fix-bogus-calls-in-remove_self_join_rel.patch)
download | inline diff:
From d06fc10fc3a5317a88a0576d9a2cfed1c10a4762 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Thu, 23 Apr 2026 11:19:25 +0900
Subject: [PATCH v1] Fix bogus calls in remove_self_join_rel()
remove_self_join_rel() called adjust_relid_set() on all_result_relids
and leaf_result_relids but threw away the return value. Since
adjust_relid_set() returns a freshly-built Relids and does not modify
the input in place, the calls did nothing. This has been the case
since the SJE feature went in (commit fc069a3a6).
There has been no observable misbehavior, because the relid being
passed is guaranteed not to be a member of either set: those sets
contain only the query's top-level target relation, which
remove_self_joins_recurse() rejects as an SJE candidate to preserve
the EvalPlanQual mechanism, plus inheritance children of the target
relation, which are RELOPT_OTHER_MEMBER_RELs and so never appear in
the joinlist that SJE scans for candidates. Even with the result
assigned, the calls would be no-ops in practice.
Rather than make the calls do the cleanup they pretend to do, replace
them with assertions of the invariant. Any future loosening of the
SJE candidate filter -- for instance to allow eliminating a result
relation under provable conditions -- will trip the assertion and
force whoever does it to revisit this code.
---
src/backend/optimizer/plan/analyzejoins.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 03056bdf3e0..494b9ee0716 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1994,8 +1994,16 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
toKeep->relid, 0, replace_relid_callback);
- adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
- adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
+ /*
+ * No need to touch all_result_relids or leaf_result_relids: every relid
+ * in those sets is either the query's top-level target relation, which
+ * remove_self_joins_recurse() rejects as an SJE candidate (the EPQ
+ * mechanism requires the target rel to survive), or an inheritance child
+ * of it, which is a RELOPT_OTHER_MEMBER_REL and so never appears in the
+ * joinlist that SJE scans. toRemove->relid therefore cannot be a member.
+ */
+ Assert(!bms_is_member(toRemove->relid, root->all_result_relids));
+ Assert(!bms_is_member(toRemove->relid, root->leaf_result_relids));
/*
* There may be references to the rel in root->fkey_list, but if so,
--
2.39.5 (Apple Git-154)
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: The bogus calls in remove_self_join_rel()
@ 2026-04-23 03:11 David Rowley <[email protected]>
parent: Richard Guo <[email protected]>
1 sibling, 1 reply; 4+ messages in thread
From: David Rowley @ 2026-04-23 03:11 UTC (permalink / raw)
To: Richard Guo <[email protected]>; +Cc: Pg Hackers <[email protected]>
On Thu, 23 Apr 2026 at 14:46, Richard Guo <[email protected]> wrote:
>
> I noticed these two calls in remove_self_join_rel():
>
> adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
> adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
>
> There's no comment explaining them, and as far as I can tell they do
> nothing: adjust_relid_set returns a Relids and does not modify the
> input in place.
>
> Rather than make the calls do the cleanup they pretend to do, I think
> a better way is to replace them with assertions: toRemove->relid is
> not a member of either set. This is true as these two sets contain
> only parse->resultRelation (rejected as an SJE candidate to preserve
> EvalPlanQual) and inheritance children of the target, which never
> appear in the joinlist that SJE scans for candidates.
Yeah, it certainly shouldn't be removing any result relations. I see
there's a check in remove_self_joins_recurse() for varno !=
root->parse->resultRelation. Have you followed through on what happens
for CTEs that do DML and RETURNING? I assume that fails on the nearby
rte->relkind == RELKIND_RELATION, but I didn't debug to check. The
only place I see all_result_relids being added to, aside from the
initial setting with bms_make_singleton() is for the inheritance
expansion in expand_single_inheritance_child(), which happens after
join removals. For leaf_result_relids, it's similar.
I think the Asserts should go at the top of the function next to the
other Asserts. Putting them near the top makes it clearer when reading
code. The Asserts will be very close to the function's header comment,
so it's easier to get a picture about what the function supports and
does, plus, it helps ensure we still get the Asserts before any early
returns are taken.
It may also be useful to decorate adjust_relid_set() with pg_nodiscard.
David
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: The bogus calls in remove_self_join_rel()
@ 2026-04-23 07:58 Richard Guo <[email protected]>
parent: David Rowley <[email protected]>
0 siblings, 0 replies; 4+ messages in thread
From: Richard Guo @ 2026-04-23 07:58 UTC (permalink / raw)
To: David Rowley <[email protected]>; +Cc: Pg Hackers <[email protected]>
On Thu, Apr 23, 2026 at 12:11 PM David Rowley <[email protected]> wrote:
> Have you followed through on what happens
> for CTEs that do DML and RETURNING? I assume that fails on the nearby
> rte->relkind == RELKIND_RELATION, but I didn't debug to check.
Right. The CTE's reference in the outer query is RTE_CTE, and the
rte->rtekind == RTE_RELATION check ensures SJE never considers it as a
candidate. The CTE itself is planned as a separate Query, where the
varno != root->parse->resultRelation check rules out its target
relation.
> The
> only place I see all_result_relids being added to, aside from the
> initial setting with bms_make_singleton() is for the inheritance
> expansion in expand_single_inheritance_child(), which happens after
> join removals. For leaf_result_relids, it's similar.
Right. And this makes the comment and commit message not accurate, as
inheritance children have not been added yet, as that happens later in
add_other_rels_to_query(). Fixed in v2.
> I think the Asserts should go at the top of the function next to the
> other Asserts. Putting them near the top makes it clearer when reading
> code. The Asserts will be very close to the function's header comment,
> so it's easier to get a picture about what the function supports and
> does, plus, it helps ensure we still get the Asserts before any early
> returns are taken.
Hmm, I considered that, but I chose the current placement because the
Asserts are documenting a specific non-action: "we don't touch these
two sets, and here is why." That reads more naturally adjacent to the
cleanup of all the other structures, rather than at the top where it
would turn into a precondition claim. The existing Asserts at the top
check input-parameter validity, which is a different kind of check.
On the early-returns argument: remove_self_join_rel() has no early
returns today, and adding one would mean forgetting to clear some
field, so I don't expect that to change.
That said, either location is OK, so happy to move them if you feel
strongly.
> It may also be useful to decorate adjust_relid_set() with pg_nodiscard.
Good suggestion. Done in v2.
- Richard
Attachments:
[application/octet-stream] v2-0001-Fix-bogus-calls-in-remove_self_join_rel.patch (3.6K, 2-v2-0001-Fix-bogus-calls-in-remove_self_join_rel.patch)
download | inline diff:
From 5acb5066d73b738b0861593bbbcf85014f35f8fa Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Thu, 23 Apr 2026 11:19:25 +0900
Subject: [PATCH v2] Fix bogus calls in remove_self_join_rel()
remove_self_join_rel() called adjust_relid_set() on all_result_relids
and leaf_result_relids but threw away the return value. Since
adjust_relid_set() returns a freshly-built Relids and does not modify
the input in place, the calls did nothing. This has been the case
since the SJE feature went in (commit fc069a3a6).
There has been no observable misbehavior, because the relid being
passed is guaranteed not to be a member of either set. At the point
remove_self_join_rel() runs, those sets contain only resultRelation;
inheritance children have not been added yet, as that happens later in
query_planner(), in expand_single_inheritance_child() called from
add_other_rels_to_query(). And remove_self_joins_recurse() rejects
parse->resultRelation as an SJE candidate to preserve the EvalPlanQual
mechanism. Even with the result assigned, the calls would be no-ops
in practice.
Rather than make the calls do the cleanup they pretend to do, replace
them with assertions of the invariant. Any future loosening of the
SJE candidate filter -- for instance to allow eliminating a result
relation under provable conditions -- will trip the assertion and
force whoever does it to revisit this code.
Additionally, decorate adjust_relid_set() with pg_nodiscard so that
any future accidental discard of its return value is caught at compile
time.
---
src/backend/optimizer/plan/analyzejoins.c | 12 ++++++++++--
src/include/rewrite/rewriteManip.h | 2 +-
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 03056bdf3e0..0f82ab9facb 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1994,8 +1994,16 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
toKeep->relid, 0, replace_relid_callback);
- adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
- adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
+ /*
+ * No need to touch all_result_relids or leaf_result_relids: at this point
+ * those sets contain only parse->resultRelation; inheritance children
+ * have not been added yet; that happens later in add_other_rels_to_query.
+ * And remove_self_joins_recurse rejects parse->resultRelation as an SJE
+ * candidate to preserve the EPQ mechanism. So toRemove->relid cannot be
+ * a member.
+ */
+ Assert(!bms_is_member(toRemove->relid, root->all_result_relids));
+ Assert(!bms_is_member(toRemove->relid, root->leaf_result_relids));
/*
* There may be references to the rel in root->fkey_list, but if so,
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index a6d4e888e06..9d234cb8741 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -54,7 +54,7 @@ struct ChangeVarNodes_context
ChangeVarNodes_callback callback;
};
-extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
+pg_nodiscard extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
extern void CombineRangeTables(List **dst_rtable, List **dst_perminfos,
List *src_rtable, List *src_perminfos);
extern void OffsetVarNodes(Node *node, int offset, int sublevels_up);
--
2.39.5 (Apple Git-154)
^ permalink raw reply [nested|flat] 4+ messages in thread
* Re: The bogus calls in remove_self_join_rel()
@ 2026-04-23 08:11 Andrei Lepikhov <[email protected]>
parent: Richard Guo <[email protected]>
1 sibling, 0 replies; 4+ messages in thread
From: Andrei Lepikhov @ 2026-04-23 08:11 UTC (permalink / raw)
To: Richard Guo <[email protected]>; Pg Hackers <[email protected]>
On 23/04/2026 04:45, Richard Guo wrote:
> I noticed these two calls in remove_self_join_rel():
>
> adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
> adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
>
> There's no comment explaining them, and as far as I can tell they do
> nothing: adjust_relid_set returns a Relids and does not modify the
> input in place.
There is a clear history of these calls. When designing SJE, we initially
applied it to partitioned tables. Later, we realised complicated issues arise
when SJE meets DML, the RETURNING clause, and partitioned tables. So, we reduced
the feature for some time. I guess the core code's stability has been proven
enough by PG18. We may introduce SJE over partitioned tables in the next release.
You can probably remove these calls for now. Just make sure to add assertions to
help with developing the partitioned case.
--
regards, Andrei Lepikhov,
pgEdge
^ permalink raw reply [nested|flat] 4+ messages in thread
end of thread, other threads:[~2026-04-23 08:11 UTC | newest]
Thread overview: 4+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-23 02:45 The bogus calls in remove_self_join_rel() Richard Guo <[email protected]>
2026-04-23 03:11 ` David Rowley <[email protected]>
2026-04-23 07:58 ` Richard Guo <[email protected]>
2026-04-23 08:11 ` Andrei Lepikhov <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox