public inbox for [email protected]  
help / color / mirror / Atom feed
From: Alena Rybakina <[email protected]>
To: Ranier Vilela <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Exists pull-up application with JoinExpr
Date: Wed, 25 Dec 2024 14:20:42 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAEudQAoD707uh5Pjpg5NMyF-QO=fzajA+BmtcoqQAeXN1C+TkQ@mail.gmail.com>
References: <[email protected]>
	<CAEudQAoD707uh5Pjpg5NMyF-QO=fzajA+BmtcoqQAeXN1C+TkQ@mail.gmail.com>

On 24.12.2024 13:25, Ranier Vilela wrote:
> Hi Alena.
>
> Em ter., 24 de dez. de 2024 às 01:44, Alena Rybakina 
> <[email protected]> escreveu:
>
>     Hi, hackers!
>
>     I found one pull-up that works if the inner join condition is
>     written through the where condition,
>
>     |create temp table ta (id int primary key, val int); insert into
>     ta values(1,1); insert into ta values(2,2); ||insert into ta values(3,3);|
>
>     |create temp table tb (id int primary key, aval int); insert into
>     tb values(4,1); insert into tb values(5,1); insert into tb
>     values(1,2); create temp table tc (id int primary key, aid int);
>     insert into tc values(6,1); insert into tc values(7,2);|
>
>     |EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT *
>     FROM ta WHERE EXISTS (SELECT * FROM tb, tc WHERE ta.id
>     <http://ta.id; = tb.id <http://tb.id;);|
>                                     QUERY PLAN
>     -------------------------------------------------------------------------
>       Nested Loop Semi Join (actual rows=1 loops=1)
>         Buffers: local hit=6
>         ->  Seq Scan on ta (actual rows=3 loops=1)
>               Buffers: local hit=1
>         ->  Nested Loop (actual rows=0 loops=3)
>               Buffers: local hit=5
>               ->  Index Only Scan using tb_pkey on tb (actual rows=0 loops=3)
>                     Index Cond: (id =ta.id  <http://ta.id;)
>                     Heap Fetches: 1
>                     Buffers: local hit=4
>               ->  Seq Scan on tc (actual rows=1 loops=1)
>                     Buffers: local hit=1
>       Planning:
>         Buffers: shared hit=67 read=12
>     (14 rows)
>
>     but it doesn't work if it is written through the outside condition.
>
>     |alena@postgres=# EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING
>     OFF) SELECT * FROM ta WHERE EXISTS (SELECT * FROM tb JOIN tc ON
>     ta.id <http://ta.id; = tb.id <http://tb.id;); QUERY PLAN
>     ------------------------------------------------------ Seq Scan on
>     ta (actual rows=1 loops=1) Filter: EXISTS(SubPlan 1) Rows Removed
>     by Filter: 2 Buffers: local hit=5 SubPlan 1 -> Nested Loop (actual
>     rows=0 loops=3) Buffers: local hit=4 -> Seq Scan on tb (actual
>     rows=0 loops=3) Filter: (ta.id <http://ta.id; = id) Rows Removed
>     by Filter: 3 Buffers: local hit=3 -> Seq Scan on tc (actual rows=1
>     loops=1) Buffers: local hit=1 Planning: Buffers: shared hit=16
>     read=9 (15 rows) |
>
>     |I have written a patch to add this functionality and now it gives
>     an query plan: |
>
>     |alena@postgres=# EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING
>     OFF)
>      SELECT *
>        FROM ta
>       WHERE EXISTS (SELECT *
>                       FROM tb JOIN tc
>                       ON ta.id <http://ta.id; = tb.id <http://tb.id;);
>                          QUERY PLAN
>     -------------------------------------------------------------------------
>      Nested Loop Semi Join (actual rows=1 loops=1)
>        Buffers: local hit=6
>        ->  Seq Scan on ta (actual rows=3 loops=1)
>              Buffers: local hit=1
>        ->  Nested Loop (actual rows=0 loops=3)
>              Buffers: local hit=5
>              ->  Index Only Scan using tb_pkey on tb (actual rows=0
>     loops=3)
>                    Index Cond: (id = ta.id <http://ta.id;)
>                    Heap Fetches: 1
>                    Buffers: local hit=4
>              ->  Seq Scan on tc (actual rows=1 loops=1)
>                    Buffers: local hit=1
>     (12 rows)|
>
>     tb and tc form a Cartesian product, but in the case of the
>     intersection condition with tuples from the table ta (ta.id
>     <http://ta.id; = tb.id <http://tb.id;). So, according to the join
>     condition, tb intersects only with 1, and only it gets into the
>     result, but at the same time they appear twice - this is because
>     of the Cartesian product of tb with tc
>
>     |*How it works:*
>     |
>
>     I rewrote the code a bit so that it considers not only the quals
>     in jointree->quals, but also those in join expression
>     (subselect->jointree->fromlist). If they satisfy the conditions
>     for using pull up, I add them to the list of clauses and form a
>     "Bool" expression from them, joined by an "AND" operation.
>
> I took a look at this patch and I did a little polishing on it.
>
> And I believe that in testing, you need to set it to BUFFERS OFF,
> because of the recent change made to ANALYZE.
>
> The tests are failing, like this:
> QUERY PLAN
>  -------------------------------------------------------------------------
>  Nested Loop Semi Join (actual rows=2 loops=1)
> + Buffers: local hit=7
>  -> Seq Scan on ta (actual rows=2 loops=1)
> + Buffers: local hit=1
>  -> Nested Loop (actual rows=1 loops=2)
> + Buffers: local hit=6
>  -> Index Only Scan using tb_pkey on tb (actual rows=1 loops=2)
>  Index Cond: (id = ta.id <http://ta.id;)
>  Heap Fetches: 2
> + Buffers: local hit=4
>  -> Seq Scan on tc (actual rows=1 loops=2)
> -(7 rows)
> + Buffers: local hit=2
> +(12 rows)
>
Yes, you are right) Thank you for your interest to this thread)

-- 
Regards,
Alena Rybakina
Postgres Professional


view thread (22+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Exists pull-up application with JoinExpr
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox