Hi! I have solved it.
The logic is the same, but extended to constants. I added a few more tests that not only cover this case, but also NOT EXISTS, which will be converted to ANTI JOIN.Hi! Thank you for your interest to this subject!
On 27.12.2024 15:53, Ilia Evdokimov wrote:
Hi Alena,
Thank you for your work on subqueries with JOIN.
Have you considered the scenario where in subquery includes a qual like (tc.aid = 1)? When I tried executing those queries I receive different results. In my opinion, to prevent this, we should add filters for such quals within the loop 'foreach (lc, all_clauses)'
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT * FROM ta
WHERE EXISTS (SELECT * FROM tb, tc WHERE ta.id = tb.id AND tc.aid = 1);
QUERY PLAN
----------------------------------------------------------------------
Hash Join (actual rows=1 loops=1)
Hash Cond: (ta.id = tb.id)
Buffers: local hit=3
-> Seq Scan on ta (actual rows=3 loops=1)
Buffers: local hit=1
-> Hash (actual rows=3 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 33kB
Buffers: local hit=2
-> HashAggregate (actual rows=3 loops=1)
Group Key: tb.id
Batches: 1 Memory Usage: 121kB
Buffers: local hit=2
-> Nested Loop (actual rows=3 loops=1)
Buffers: local hit=2
-> Seq Scan on tb (actual rows=3 loops=1)
Buffers: local hit=1
-> Materialize (actual rows=1 loops=3)
Storage: Memory Maximum Storage: 17kB
Buffers: local hit=1
-> Seq Scan on tc (actual rows=1 loops=1)
Filter: (aid = 1)
Rows Removed by Filter: 1
Buffers: local hit=1
(23 rows)
============================
EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF)
SELECT * FROM ta WHERE EXISTS (SELECT * FROM tb JOIN tc ON ta.id =
tb.id WHERE tc.aid = 1);
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on ta (actual rows=1 loops=1)
Filter: EXISTS(SubPlan 1)
Rows Removed by Filter: 2
Buffers: local hit=6
SubPlan 1
-> Nested Loop (actual rows=0 loops=3)
Buffers: local hit=5
-> Index Only Scan using tb_pkey on tb (actual rows=0 loops=3)
Index Cond: (id = ta.id)
Heap Fetches: 1
Buffers: local hit=4
-> Seq Scan on tc (actual rows=1 loops=1)
Filter: (aid = 1)
Buffers: local hit=1
(14 rows)
You are right, at the moment the code is not processed if there is a constant qual in the subquery (like t1.x1=1 in the example below) and this problem is not only related to the current patch.
For example you can get such a query plan if you complete this request to the master:
create table t (x int); create table t1 (x1 int); create table t2 (x2 int); EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT 1 FROM t WHERE EXISTS (SELECT 1 FROM t1 where t1.x1 = 1); QUERY PLAN ------------------------------------------------ Result (actual rows=0 loops=1) One-Time Filter: (InitPlan 1).col1 InitPlan 1 -> Seq Scan on t1 (actual rows=0 loops=1) Filter: (x1 = 1) -> Seq Scan on t (never executed) (6 rows)It's all because of the check in this function - this qual has levelsoup = 0, not 1 (see (!contain_vars_of_level(whereClause, 1)), but I already found out that by changing this, the logic of correction there is required a little more complicated. At the moment, I'm working to add this processing to the patch.
Thanks for the case!
-- Regards, Alena Rybakina Postgres Professional