public inbox for [email protected]help / color / mirror / Atom feed
Re: Fwd: pg18 bug? SELECT query doesn't work 6+ messages / 3 participants [nested] [flat]
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-08 04:03 Tom Lane <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Tom Lane @ 2026-01-08 04:03 UTC (permalink / raw) To: Richard Guo <[email protected]>; +Cc: Eric Ridge <[email protected]>; Pg Hackers <[email protected]> Richard Guo <[email protected]> writes: > The underlying expression of a join alias Var can only be a Var > (potentially coerced) from one of the join's input rels, or a COALESCE > expression containing the two input Vars. Therefore, it should not be > able to contain SRFs or volatile functions, and thus we do not need to > expand it beforehand. [ itch... ] That statement is false in general, because subquery pullup within the subquery can replace a sub-subquery's output Vars with expressions. It might be okay for this purpose, as I think we'd not pull up if the sub-subquery's output expressions are volatile or SRFs. These assumptions had better be well commented though. The larger point here is that this behavior is all recursive, and we can happily end with an expression that's been pulled up several levels; we'd better make sure the right checks happen. So I'm a little bit distressed that planner.c's invocations of flatten_group_exprs are not at all analogous to its usage of flatten_join_alias_vars. The latter pattern has a couple of decades of usage to lend credence to the assumption that it's correct. flatten_group_exprs, um, not so much. It may be fine, given the fact that grouping Vars can appear within much less of the query than join aliases. But in view of the present bug, I'm feeling nervous. regards, tom lane ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-08 13:30 Richard Guo <[email protected]> parent: Tom Lane <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Richard Guo @ 2026-01-08 13:30 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Eric Ridge <[email protected]>; Pg Hackers <[email protected]> On Thu, Jan 8, 2026 at 1:03 PM Tom Lane <[email protected]> wrote: > [ itch... ] That statement is false in general, because subquery > pullup within the subquery can replace a sub-subquery's output Vars > with expressions. It might be okay for this purpose, as I think we'd > not pull up if the sub-subquery's output expressions are volatile or > SRFs. These assumptions had better be well commented though. Ah, I see. Once the sub-subqueries are flattened, the join alias entries in the subquery can become arbitrary expressions rather than simple Vars. I suspect we have only avoided issues with join aliases in subquery_is_pushdown_safe() for all these years by sheer luck: since we don't pull up subqueries that output set-returning or volatile functions, those 'arbitrary expressions' are unlikely to include them. How about we add a comment to check_output_expressions() along the below lines? /* * We need to expand grouping Vars to their underlying expressions (the * grouping clauses) because the grouping expressions themselves might be * volatile or set-returning. However, we do not need to recurse deeper * into the arguments of those expressions. If an argument references a * lower-level subquery output, we can rely on the fact that subqueries * containing volatile or set-returning functions in their targetlists are * never pulled up. * * We do not need to expand join alias Vars. The underlying expression of * a join alias Var does not itself introduce volatility or set-returning * behavior. As with grouping Vars, we rely on the pull-up restrictions to * guarantee that any referenced inputs from lower levels are free of such * functions. */ > The larger point here is that this behavior is all recursive, > and we can happily end with an expression that's been pulled up > several levels; we'd better make sure the right checks happen. > So I'm a little bit distressed that planner.c's invocations of > flatten_group_exprs are not at all analogous to its usage of > flatten_join_alias_vars. The latter pattern has a couple of > decades of usage to lend credence to the assumption that it's > correct. flatten_group_exprs, um, not so much. It may be > fine, given the fact that grouping Vars can appear within > much less of the query than join aliases. But in view of the > present bug, I'm feeling nervous. I checked the invocations of flatten_join_alias_vars and flatten_group_exprs in the planner to understand why they are not being used analogously. 1. planner.c:1041 Here, we call flatten_join_alias_vars on the subquery because the subquery may contain join aliases from the outer query level; since these won't be expanded during the subquery's own planning, we must expand them now. A query illustrating this scenario is: select * from tenk1 t1 full join tenk1 t2 using (unique1) join lateral (select unique1 offset 0) on true; (BTW, the test cases added in da3df9987 for this logic are no longer valid. These queries still function correctly even if this code is removed. I think this is something we should fix.) However, I don't think an analogous call to flatten_group_exprs is necessary here. Subqueries should not contain grouping-Vars from the outer query, since FROM clause is processed before the GROUP BY step. While it is true that a subquery could reference the output of another subquery that happens to be a grouping-Var, that would be handled when expanding grouping-Vars for that specific subquery. 2. prepjointree.c:1473 Here, we call flatten_join_alias_vars on the subquery's targetlist during subquery flattening because once the the subquery's subqueries are flattened, join alias entries may become arbitrary expressions rather than simple Vars. Again, I don't see a need for an analogous flatten_group_exprs call here. Any subquery containing grouping-Vars must involve grouping, and we don't flatten such subqueries to begin with. 3. planner.c:1309 Here, flatten_join_alias_vars is called within preprocess_expression for various expressions. The analogous call to flatten_group_exprs occurs at planner.c:1118. I believe we only need to call flatten_group_exprs on the targetList and havingQual, as these are the only places where grouping-Vars can appear. Based on the above, I suspect whether we should expect the invocations of flatten_group_exprs to be analogous to those of flatten_join_alias_vars. - Richard ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-09 09:52 Richard Guo <[email protected]> parent: Richard Guo <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Richard Guo @ 2026-01-09 09:52 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Eric Ridge <[email protected]>; Pg Hackers <[email protected]> On Thu, Jan 8, 2026 at 10:30 PM Richard Guo <[email protected]> wrote: > How about we add a comment to check_output_expressions() along the > below lines? I've worked on the comment a bit more in the attached patch, which also includes my previously proposed code changes and some test cases. I think this patch is suitable for fixing the current bug in the back branches. We can use a separate patch for the more ambitious goal of moving the push-down of subquery's restriction clauses into subquery_planner(). Any thoughts? - Richard Attachments: [application/octet-stream] v1-0001-Fix-unsafe-pushdown-of-quals-referencing-grouping.patch (10.3K, 2-v1-0001-Fix-unsafe-pushdown-of-quals-referencing-grouping.patch) download | inline diff: From a48b94bb7c8e60e15238bf291766b322d1601d03 Mon Sep 17 00:00:00 2001 From: Richard Guo <[email protected]> Date: Fri, 9 Jan 2026 13:00:11 +0900 Subject: [PATCH v1] Fix unsafe pushdown of quals referencing grouping Vars When checking a subquery's output expressions to see if it's safe to push down an upper-level qual, check_output_expressions() previously treated grouping Vars as opaque Vars. This implicitly assumed they were stable and scalar. However, a grouping Var's underlying expression corresponds to the grouping clause, which may be volatile or set-returning. If an upper-level qual references such an output column, pushing it down into the subquery is unsafe. This can cause strange results due to multiple evaluation of a volatile function, or introduce SRFs into the subquery's WHERE/HAVING quals. This patch teaches check_output_expressions() to look through grouping Vars to their underlying expressions. This ensures that any volatility or set-returning properties in the grouping clause are detected, preventing the unsafe pushdown. We do not need to recursively examine the Vars contained in these underlying expressions. Even if they reference outputs from lower-level subqueries (at any depth), those references are guaranteed not to expand to volatile or set-returning functions, because subqueries containing such functions in their targetlists are never pulled up. --- src/backend/optimizer/path/allpaths.c | 26 +++++- src/backend/optimizer/util/var.c | 8 +- src/test/regress/expected/subselect.out | 104 ++++++++++++++++++++++++ src/test/regress/sql/subselect.sql | 40 +++++++++ 4 files changed, 175 insertions(+), 3 deletions(-) diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c index 6e641c146a3..e00d1700817 100644 --- a/src/backend/optimizer/path/allpaths.c +++ b/src/backend/optimizer/path/allpaths.c @@ -4204,9 +4204,33 @@ recurse_pushdown_safe(Node *setOp, Query *topquery, static void check_output_expressions(Query *subquery, pushdown_safety_info *safetyInfo) { + List *flattened_targetList = subquery->targetList; ListCell *lc; - foreach(lc, subquery->targetList) + /* + * We must be careful with grouping Vars and join alias Vars in the + * subquery's outputs, as they hide the underlying expressions. + * + * We need to expand grouping Vars to their underlying expressions (the + * grouping clauses) because the grouping expressions themselves might be + * volatile or set-returning. However, we do not need to expand join + * alias Vars, as their underlying structure does not introduce volatile + * or set-returning functions at the current level. + * + * In neither case do we need to recursively examine the Vars contained in + * these underlying expressions. Even if they reference outputs from + * lower-level subqueries (at any depth), those references are guaranteed + * not to expand to volatile or set-returning functions, because + * subqueries containing such functions in their targetlists are never + * pulled up. + */ + if (subquery->hasGroupRTE) + { + flattened_targetList = (List *) + flatten_group_exprs(NULL, subquery, (Node *) subquery->targetList); + } + + foreach(lc, flattened_targetList) { TargetEntry *tle = (TargetEntry *) lfirst(lc); diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index 99da622507b..5f22021ecca 100644 --- a/src/backend/optimizer/util/var.c +++ b/src/backend/optimizer/util/var.c @@ -959,10 +959,14 @@ flatten_join_alias_vars_mutator(Node *node, * wrapper. * * NOTE: this is also used by ruleutils.c, to deparse one query parsetree back - * to source text. For that use-case, root will be NULL, which is why we have - * to pass the Query separately. We need the root itself only for preserving + * to source text, and by check_output_expressions() to check for unsafe + * pushdowns. For these use-cases, root will be NULL, which is why we have to + * pass the Query separately. We need the root itself only for preserving * varnullingrels. We can avoid preserving varnullingrels in the ruleutils.c's * usage because it does not make any difference to the deparsed source text. + * We can also avoid it in check_output_expressions() because that function + * only cares about the presence of volatile or set-returning functions, which + * is independent of the Vars' nullingrels. */ Node * flatten_group_exprs(PlannerInfo *root, Query *query, Node *node) diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out index 37ed59e73bf..2135d82884d 100644 --- a/src/test/regress/expected/subselect.out +++ b/src/test/regress/expected/subselect.out @@ -1925,6 +1925,110 @@ NOTICE: x = 9, y = 13 9 | 3 (3 rows) +-- +-- check that an upper-level qual is not pushed down if it references a grouped +-- Var whose underlying expression contains SRFs +-- +explain (verbose, costs off) +select * from + (select generate_series(1, ten) as g, count(*) from tenk1 group by 1) ss + where ss.g = 1; + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Subquery Scan on ss + Output: ss.g, ss.count + Filter: (ss.g = 1) + -> HashAggregate + Output: (generate_series(1, tenk1.ten)), count(*) + Group Key: generate_series(1, tenk1.ten) + -> ProjectSet + Output: generate_series(1, tenk1.ten) + -> Seq Scan on public.tenk1 + Output: tenk1.unique1, tenk1.unique2, tenk1.two, tenk1.four, tenk1.ten, tenk1.twenty, tenk1.hundred, tenk1.thousand, tenk1.twothousand, tenk1.fivethous, tenk1.tenthous, tenk1.odd, tenk1.even, tenk1.stringu1, tenk1.stringu2, tenk1.string4 +(10 rows) + +select * from + (select generate_series(1, ten) as g, count(*) from tenk1 group by 1) ss + where ss.g = 1; + g | count +---+------- + 1 | 9000 +(1 row) + +-- +-- check that an upper-level qual is not pushed down if it references a grouped +-- Var whose underlying expression contains volatile functions +-- +alter function tattle(x int, y int) volatile; +explain (verbose, costs off) +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + QUERY PLAN +------------------------------------------------------------ + Subquery Scan on ss + Output: ss.v, ss.count + Filter: ss.v + -> GroupAggregate + Output: (tattle(3, tenk1.ten)), count(*) + Group Key: (tattle(3, tenk1.ten)) + -> Sort + Output: (tattle(3, tenk1.ten)) + Sort Key: (tattle(3, tenk1.ten)) + -> Bitmap Heap Scan on public.tenk1 + Output: tattle(3, tenk1.ten) + Recheck Cond: (tenk1.unique1 < 3) + -> Bitmap Index Scan on tenk1_unique1 + Index Cond: (tenk1.unique1 < 3) +(14 rows) + +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; +NOTICE: x = 3, y = 2 +NOTICE: x = 3, y = 1 +NOTICE: x = 3, y = 0 + v | count +---+------- + t | 3 +(1 row) + +-- if we pretend it's stable, we get different results: +alter function tattle(x int, y int) stable; +explain (verbose, costs off) +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + QUERY PLAN +------------------------------------------------------ + GroupAggregate + Output: (tattle(3, tenk1.ten)), count(*) + Group Key: (tattle(3, tenk1.ten)) + -> Sort + Output: (tattle(3, tenk1.ten)) + Sort Key: (tattle(3, tenk1.ten)) + -> Bitmap Heap Scan on public.tenk1 + Output: tattle(3, tenk1.ten) + Recheck Cond: (tenk1.unique1 < 3) + Filter: tattle(3, tenk1.ten) + -> Bitmap Index Scan on tenk1_unique1 + Index Cond: (tenk1.unique1 < 3) +(12 rows) + +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; +NOTICE: x = 3, y = 2 +NOTICE: x = 3, y = 2 +NOTICE: x = 3, y = 1 +NOTICE: x = 3, y = 1 +NOTICE: x = 3, y = 0 +NOTICE: x = 3, y = 0 + v | count +---+------- + t | 3 +(1 row) + drop function tattle(x int, y int); -- -- Test that LIMIT can be pushed to SORT through a subquery that just projects diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql index 192a6f96b93..cadc3293687 100644 --- a/src/test/regress/sql/subselect.sql +++ b/src/test/regress/sql/subselect.sql @@ -952,6 +952,46 @@ select * from (select 9 as x, unnest(array[1,2,3,11,12,13]) as u) ss where tattle(x, u); +-- +-- check that an upper-level qual is not pushed down if it references a grouped +-- Var whose underlying expression contains SRFs +-- +explain (verbose, costs off) +select * from + (select generate_series(1, ten) as g, count(*) from tenk1 group by 1) ss + where ss.g = 1; + +select * from + (select generate_series(1, ten) as g, count(*) from tenk1 group by 1) ss + where ss.g = 1; + +-- +-- check that an upper-level qual is not pushed down if it references a grouped +-- Var whose underlying expression contains volatile functions +-- +alter function tattle(x int, y int) volatile; + +explain (verbose, costs off) +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + +-- if we pretend it's stable, we get different results: +alter function tattle(x int, y int) stable; + +explain (verbose, costs off) +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + +select * from + (select tattle(3, ten) as v, count(*) from tenk1 where unique1 < 3 group by 1) ss + where ss.v; + drop function tattle(x int, y int); -- -- 2.39.5 (Apple Git-154) ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-16 03:27 Richard Guo <[email protected]> parent: Richard Guo <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Richard Guo @ 2026-01-16 03:27 UTC (permalink / raw) To: Tom Lane <[email protected]>; +Cc: Eric Ridge <[email protected]>; Pg Hackers <[email protected]> On Fri, Jan 9, 2026 at 6:52 PM Richard Guo <[email protected]> wrote: > I've worked on the comment a bit more in the attached patch, which > also includes my previously proposed code changes and some test cases. > > I think this patch is suitable for fixing the current bug in the back > branches. We can use a separate patch for the more ambitious goal of > moving the push-down of subquery's restriction clauses into > subquery_planner(). I plan to push and backpatch this patch early next week, barring any objections. - Richard ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-17 16:18 Eric Ridge <[email protected]> parent: Richard Guo <[email protected]> 0 siblings, 1 reply; 6+ messages in thread From: Eric Ridge @ 2026-01-17 16:18 UTC (permalink / raw) To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Pg Hackers <[email protected]> > I plan to push and backpatch this patch early next week, barring any objections. Thank you for your work on this. It’s sincerely appreciated. eric ^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Fwd: pg18 bug? SELECT query doesn't work @ 2026-01-19 02:39 Richard Guo <[email protected]> parent: Eric Ridge <[email protected]> 0 siblings, 0 replies; 6+ messages in thread From: Richard Guo @ 2026-01-19 02:39 UTC (permalink / raw) To: Eric Ridge <[email protected]>; +Cc: Tom Lane <[email protected]>; Pg Hackers <[email protected]> On Sun, Jan 18, 2026 at 1:18 AM Eric Ridge <[email protected]> wrote: > > I plan to push and backpatch this patch early next week, barring any objections. > Thank you for your work on this. It’s sincerely appreciated. Pushed and backpatched. Thanks for the report, Eric. The fix will be included in the next minor release this Feb. - Richard ^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2026-01-19 02:39 UTC | newest] Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed) -- links below jump to the message on this page -- 2026-01-08 04:03 Re: Fwd: pg18 bug? SELECT query doesn't work Tom Lane <[email protected]> 2026-01-08 13:30 ` Richard Guo <[email protected]> 2026-01-09 09:52 ` Richard Guo <[email protected]> 2026-01-16 03:27 ` Richard Guo <[email protected]> 2026-01-17 16:18 ` Eric Ridge <[email protected]> 2026-01-19 02:39 ` Richard Guo <[email protected]>
This inbox is served by agora; see mirroring instructions for how to clone and mirror all data and code used for this inbox