Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sj46x-00DoPp-A1 for pgsql-hackers@arkaria.postgresql.org; Tue, 27 Aug 2024 21:52:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sj46u-00DFUb-Sv for pgsql-hackers@arkaria.postgresql.org; Tue, 27 Aug 2024 21:52:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sj46u-00DFSm-H9 for pgsql-hackers@lists.postgresql.org; Tue, 27 Aug 2024 21:52:21 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sj46r-001ou2-8t for pgsql-hackers@lists.postgresql.org; Tue, 27 Aug 2024 21:52:20 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 47RLqC8o3147331; Tue, 27 Aug 2024 17:52:12 -0400 From: Tom Lane To: David Rowley cc: nikhil raj , pgsql-hackers@lists.postgresql.org Subject: Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. In-reply-to: <3104695.1724775341@sss.pgh.pa.us> References: <2962669.1724722813@sss.pgh.pa.us> <2965760.1724724227@sss.pgh.pa.us> <3104695.1724775341@sss.pgh.pa.us> Comments: In-reply-to Tom Lane message dated "Tue, 27 Aug 2024 12:15:41 -0400" MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0" Content-ID: <3147094.1724795392.0@sss.pgh.pa.us> Date: Tue, 27 Aug 2024 17:52:12 -0400 Message-ID: <3147330.1724795532@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk ------- =_aaaaaaaaaa0 Content-Type: text/plain; charset="us-ascii" Content-ID: <3147094.1724795392.1@sss.pgh.pa.us> I wrote: > We didn't see this particular behavior before 2489d76c49 because > pullup_replace_vars avoided inserting a PHV: > * If it contains a Var of the subquery being pulled up, and > * does not contain any non-strict constructs, then it's > * certainly nullable so we don't need to insert a > * PlaceHolderVar. > I dropped that case in 2489d76c49 because now we need to attach > nullingrels to the expression. You could imagine attaching the > nullingrels to the contained Var(s) instead of putting a PHV on top, > but that seems like a mess and I'm not quite sure it's semantically > the same. In any case it wouldn't fix adjacent cases where there is > a non-strict construct in the subquery output expression. I realized that actually we do have the mechanism for making that work: we could apply add_nulling_relids to the expression, if it meets those same conditions. This is a kluge really, but it would restore the status quo ante in a fairly localized fashion that seems like it might be safe enough to back-patch into v16. Here's a WIP patch that does it like that. One problem with it is that it requires rcon->relids to be calculated in cases where we didn't need that before, which is probably not *that* expensive but it's annoying. If we go forward with this, I'm thinking about changing add_nulling_relids' API contract to say "if target_relid is NULL then all level-zero Vars/PHVs are modified", so that we don't need that relid set in non-LATERAL cases. The other problem with this is that it breaks one test case in memoize.sql: a query that formerly generated a memoize plan now does not use memoize. I am not sure why not --- does that mean anything to you? regards, tom lane ------- =_aaaaaaaaaa0 Content-Type: text/x-diff; name="avoid-unnecessary-PHV-during-pullup-wip.patch"; charset="us-ascii" Content-ID: <3147094.1724795392.2@sss.pgh.pa.us> Content-Description: avoid-unnecessary-PHV-during-pullup-wip.patch Content-Transfer-Encoding: quoted-printable diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optim= izer/prep/prepjointree.c index 969e257f70..3a12a52440 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -48,7 +48,7 @@ typedef struct pullup_replace_vars_context List *targetlist; /* tlist of subquery being pulled up */ RangeTblEntry *target_rte; /* RTE of subquery */ Relids relids; /* relids within subquery, as numbered after - * pullup (set only if target_rte->lateral) */ + * pullup */ bool *outer_hasSubLinks; /* -> outer query's hasSubLinks */ int varno; /* varno of subquery */ bool wrap_non_vars; /* do we need all non-Var outputs to be PHVs? */ @@ -1163,11 +1163,8 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jt= node, RangeTblEntry *rte, rvcontext.root =3D root; rvcontext.targetlist =3D subquery->targetList; rvcontext.target_rte =3D rte; - if (rte->lateral) - rvcontext.relids =3D get_relids_in_jointree((Node *) subquery->jointree= , - true, true); - else /* won't need relids */ - rvcontext.relids =3D NULL; + rvcontext.relids =3D get_relids_in_jointree((Node *) subquery->jointree, + true, true); rvcontext.outer_hasSubLinks =3D &parse->hasSubLinks; rvcontext.varno =3D varno; /* this flag will be set below, if needed */ @@ -1713,7 +1710,7 @@ pull_up_simple_values(PlannerInfo *root, Node *jtnod= e, RangeTblEntry *rte) rvcontext.root =3D root; rvcontext.targetlist =3D tlist; rvcontext.target_rte =3D rte; - rvcontext.relids =3D NULL; + rvcontext.relids =3D NULL; /* XXX */ rvcontext.outer_hasSubLinks =3D &parse->hasSubLinks; rvcontext.varno =3D varno; rvcontext.wrap_non_vars =3D false; @@ -1877,7 +1874,7 @@ pull_up_constant_function(PlannerInfo *root, Node *j= tnode, * lateral references, even if it's marked as LATERAL. This means we * don't need to fill relids. */ - rvcontext.relids =3D NULL; + rvcontext.relids =3D NULL; /* XXX */ = rvcontext.outer_hasSubLinks =3D &parse->hasSubLinks; rvcontext.varno =3D ((RangeTblRef *) jtnode)->rtindex; @@ -2490,14 +2487,48 @@ pullup_replace_vars_callback(Var *var, else wrap =3D false; } + else if (rcon->wrap_non_vars) + { + /* Caller told us to wrap all non-Vars in a PlaceHolderVar */ + wrap =3D true; + } else { /* - * Must wrap, either because we need a place to insert - * varnullingrels or because caller told us to wrap - * everything. + * If the node contains Var(s) or PlaceHolderVar(s) of the + * subquery being pulled up, and does not contain any + * non-strict constructs, then instead of adding a PHV on top + * we can add the required nullingrels to those Vars/PHVs. + * (This is fundamentally a generalization of the above cases + * for bare Vars and PHVs.) + * + * This test is somewhat expensive, but it avoids pessimizing + * the plan in cases where the nullingrels get removed again + * later by outer join reduction. + * + * This analysis could be tighter: in particular, a non-strict + * construct hidden within a lower-level PlaceHolderVar is not + * reason to add another PHV. But for now it doesn't seem + * worth the code to be more exact. + * + * For a LATERAL subquery, we have to check the actual var + * membership of the node, but if it's non-lateral then any + * level-zero var must belong to the subquery. */ - wrap =3D true; + if ((rcon->target_rte->lateral ? + bms_overlap(pull_varnos(rcon->root, newnode), + rcon->relids) : + contain_vars_of_level(newnode, 0)) && + !contain_nonstrict_functions(newnode)) + { + /* No wrap needed */ + wrap =3D false; + } + else + { + /* Else wrap it in a PlaceHolderVar */ + wrap =3D true; + } } = if (wrap) @@ -2522,7 +2553,7 @@ pullup_replace_vars_callback(Var *var, if (var->varlevelsup > 0) IncrementVarSublevelsUp(newnode, var->varlevelsup, 0); = - /* Propagate any varnullingrels into the replacement Var or PHV */ + /* Propagate any varnullingrels into the replacement expression */ if (var->varnullingrels !=3D NULL) { if (IsA(newnode, Var)) @@ -2542,7 +2573,15 @@ pullup_replace_vars_callback(Var *var, var->varnullingrels); } else - elog(ERROR, "failed to wrap a non-Var"); + { + /* There should be lower-level Vars/PHVs we can modify */ + newnode =3D add_nulling_relids(newnode, + rcon->relids, + var->varnullingrels); + /* Assert we did put the varnullingrels into the expression */ + Assert(bms_is_subset(var->varnullingrels, + pull_varnos(rcon->root, newnode))); + } } = return newnode; ------- =_aaaaaaaaaa0--