public inbox for [email protected]  
help / color / mirror / Atom feed
From: Tender Wang <[email protected]>
To: Richard Guo <[email protected]>
Cc: Pg Hackers <[email protected]>
Subject: Re: Clean up remove_rel_from_query() after self-join elimination commit
Date: Tue, 7 Apr 2026 09:43:27 +0800
Message-ID: <CAHewXNmrGtEwhCFeJMtx-e=3TcD2Q0CUt6CVNbr6Hu0zbTg+tQ@mail.gmail.com> (raw)
In-Reply-To: <CAMbWs48JC4OVqE=3gMB6se2WmRNNfMyFyYxm-09vgpm+Vwe8Hg@mail.gmail.com>
References: <CAMbWs48JC4OVqE=3gMB6se2WmRNNfMyFyYxm-09vgpm+Vwe8Hg@mail.gmail.com>

Richard Guo <[email protected]> 于2026年4月6日周一 16:11写道:
>
> While working on reusing remove_rel_from_query() for inner-join
> removal, I noticed that the function is in pretty rough shape.  The
> self-join elimination (SJE) commit grafted self-join removal onto
> remove_rel_from_query(), which was originally written for left-join
> removal only.

Yes, I noticed this a few days ago when I tried to remove a redundant
filter added during SJE.

>This resulted in several issues:
> 1. Comments throughout remove_rel_from_query() still assumed only
> left-join removal, making the code misleading.  For example:
>
>   * This is relevant in case the outer join we're deleting is nested inside
>   * other outer joins:
>

The current logic of remove_rel_from_query() is not easy to read. For example,
It distinguishes between left-join removal and SJE based on whether
sjinfo is NULL.

> 2. This was called even for left-join removal, with subst=-1.  This is
> pointless and confusing.
>
>       ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
>                              replace_relid_callback);
>
Yes, it is.
> 3. phinfo->ph_lateral was adjusted for left-join removal, which is
> also confusing ...
>
>        phinfo->ph_lateral = adjust_relid_set(phinfo->ph_lateral, relid, subst);
>
>        /*
>         * ph_lateral might contain rels mentioned in ph_eval_at after the
>         * replacement, remove them.
>         */
>        phinfo->ph_lateral = bms_difference(phinfo->ph_lateral,
> phinfo->ph_eval_at);
>
> ... since you can find this Assert just above:
>
>        Assert(sjinfo == NULL || !bms_is_member(relid, phinfo->ph_lateral));
>
> 4. The comment about attr_needed reconstruction was in
> remove_rel_from_query(), but the actual rebuild is performed by
> the callers.
>
> 5. The EquivalenceClass processing in remove_rel_from_query():
>
>     /*
>      * Likewise remove references from EquivalenceClasses.
>      */
>     foreach(l, root->eq_classes)
>     {
>         EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
>
>         if (bms_is_member(relid, ec->ec_relids) ||
>             (sjinfo == NULL || bms_is_member(sjinfo->ojrelid, ec->ec_relids)))
>             remove_rel_from_eclass(ec, sjinfo, relid, subst);
>     }
>
> The condition always evaluates to true for self-join elimination
> (i.e., sjinfo == NULL), meaning every EquivalenceClass gets adjusted.
> But this is redundant because the caller remove_self_join_rel()
> already handles ECs via update_eclasses().
>
> 6. In remove_self_join_rel(), I notice this code:
>
>     /* At last, replace varno in root targetlist and HAVING clause */
>     ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
>                            toKeep->relid, 0, replace_relid_callback);
>     ChangeVarNodesExtended((Node *) root->processed_groupClause,
>                            toRemove->relid, toKeep->relid, 0,
>                            replace_relid_callback);
>
> The comment mentions "HAVING clause", but neither processed_tlist nor
> processed_groupClause has anything to do with the HAVING clause.
> Furthermore, processed_groupClause contains SortGroupClause nodes,
> which have no Var nodes to rewrite, so calling ChangeVarNodesExtended
> on it is pointless.

I didn't find as many as you listed here. I agree that we should do
something for
current logic.
>
> The attached patch is an attempt to fix all these issues.  It also
> aims to leave the code better structured for adding new types of join
> removal (such as inner-join removal) in the future.

I looked through the attached patch. LGTM.



-- 
Thanks,
Tender Wang





view thread (7+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Clean up remove_rel_from_query() after self-join elimination commit
  In-Reply-To: <CAHewXNmrGtEwhCFeJMtx-e=3TcD2Q0CUt6CVNbr6Hu0zbTg+tQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox