Hi, all! I updated the patch and it looks nice. All the problems
have been solved.
On 02.04.2025 19:39, Alena Rybakina wrote:
I see that I need to add a walker that, while traversing the tree, determines whether there are conditions that make pull-up impossible, such as volatile functions and other restrictions,
while keeping the transformation for the Var objects that I added before. I described it here.
I have some concerns about pulling up every clause from the subquery with one column. In particular, not every clause is safe or beneficial to pull up: OR-clauses, CASE expressions, nested sublinks could significantly change how the planner estimates the number of rows or applies filters, especially when they are not true join predicates. Pulling them up might lead to worse plans, or even change the semantics in subtle ways. I think before applying such transformations, we should make sure they are not only safe but actually improve the resulting plan.
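To make the walker idea more concrete, here is a minimal toy sketch in Python. It is not PostgreSQL code: the `Node` class, the node kinds, and `find_pullup_blocker` are all hypothetical names made up for illustration. Treating every OR-clause, CASE expression, and sublink as a blocker is an assumption taken from the concerns above, not a rule from the patch. In the real planner the equivalent would presumably be built on expression_tree_walker() and contain_volatile_functions().

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy expression node (hypothetical, not a PostgreSQL node type)."""
    kind: str                 # "var", "const", "op", "func", "or", "case", "sublink"
    name: str = ""
    volatile: bool = False    # only meaningful when kind == "func"
    args: List["Node"] = field(default_factory=list)


# Assumption: these constructs are treated as blocking pull-up outright.
BLOCKING_KINDS = {"or", "case", "sublink"}


def find_pullup_blocker(node: Node) -> Optional[str]:
    """Return a reason for the first construct that blocks pull-up, else None."""
    if node.kind in BLOCKING_KINDS:
        return f"{node.kind} expression"
    if node.kind == "func" and node.volatile:
        return f"volatile function {node.name}"
    # Recurse depth-first into the children and stop at the first blocker.
    for arg in node.args:
        reason = find_pullup_blocker(arg)
        if reason is not None:
            return reason
    return None


# t1.x = t.x : plain equality between two Vars
safe = Node("op", "=", args=[Node("var", "t1.x"), Node("var", "t.x")])

# t1.x = t.x + random() : contains a volatile function call
unsafe = Node("op", "=", args=[
    Node("var", "t1.x"),
    Node("op", "+", args=[Node("var", "t.x"),
                          Node("func", "random", volatile=True)]),
])

print(find_pullup_blocker(safe))    # prints: None
print(find_pullup_blocker(unsafe))  # prints: volatile function random
```

Returning a reason string rather than a bare boolean makes it easier to see in debugging output why a given clause was left in the subquery.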
There may indeed be cases where a query plan without pull-up is worse than with pull-up.
For example, as shown below, with pull-up we don't need to scan
two tables and perform a join, since the subquery returns 0 rows
(no matching tuples in the inner sequential scan in a
parameterized Nested Loop).
However, this cannot be detected at the current planning stage -
we simply don't have that information yet.
Do you have any ideas on how to solve this problem? So far, the only approach I see is to try an alternative plan, but I'm still learning how to do this.
For example:
without my patch:
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Seq Scan on t  (cost=0.00..309.30 rows=5738 width=4) (actual time=68268.562..68268.565 rows=0.00 loops=1)
   Filter: EXISTS(SubPlan 1)
   Rows Removed by Filter: 10000
   Buffers: shared hit=900045
   SubPlan 1
     ->  Nested Loop  (cost=0.00..8524.27 rows=654075 width=0) (actual time=6.823..6.823 rows=0.00 loops=10000)
           Buffers: shared hit=900000
           ->  Seq Scan on t2  (cost=0.00..159.75 rows=11475 width=0) (actual time=0.011..1.660 rows=10000.00 loops=10000)
                 Buffers: shared hit=450000
           ->  Materialize  (cost=0.00..188.72 rows=57 width=0) (actual time=0.000..0.000 rows=0.00 loops=100000000)
                 Storage: Memory  Maximum Storage: 17kB
                 Buffers: shared hit=450000
                 ->  Seq Scan on t1  (cost=0.00..188.44 rows=57 width=0) (actual time=2.403..2.403 rows=0.00 loops=10000)
                       Filter: (t.x = x)
                       Rows Removed by Filter: 10000
                       Buffers: shared hit=450000
 Planning:
   Buffers: shared hit=40 read=16
 Planning Time: 0.487 ms
 Execution Time: 68268.600 ms
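
As a rough sanity check on where the time goes in this plan, the per-loop figures can be multiplied out. The short Python sketch below uses only numbers copied from the EXPLAIN ANALYZE output; the variable names are made up for illustration.

```python
# Figures copied from the EXPLAIN ANALYZE output above.
outer_rows = 10000           # rows scanned in t; SubPlan 1 runs once per row
subplan_ms_per_loop = 6.823  # actual time of the Nested Loop, per loop
t2_rows_per_loop = 10000     # rows returned by Seq Scan on t2, per loop

# Total time spent inside SubPlan 1 across all outer rows.
subplan_total_ms = outer_rows * subplan_ms_per_loop

# The Materialize node is rescanned once per t2 row, per outer row.
materialize_loops = outer_rows * t2_rows_per_loop

print(f"time spent in SubPlan 1: ~{subplan_total_ms:.0f} ms")
print(f"Materialize loops: {materialize_loops}")
```

This gives roughly 68230 ms for the subplan, which accounts for almost all of the 68268.600 ms execution time, and 100000000 Materialize loops, which matches the loops value reported in the plan.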
--
Regards,
Alena Rybakina
Postgres Professional