public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jakub Wartak <[email protected]>
To: Robert Haas <[email protected]>
Cc: Jacob Champion <[email protected]>
Cc: Dian Fay <[email protected]>
Cc: Matheus Alcantara <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_plan_advice
Date: Wed, 17 Dec 2025 11:12:40 +0100
Message-ID: <CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com> (raw)
In-Reply-To: <CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
References: <CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com>
	<CAKZiRmxtJAFG7e1+Vs9B8ngON=AOzJbuws+1ZeH4LsbJh5AzoQ@mail.gmail.com>
	<CA+TgmoY9Ne_Sh10u6LSPc3wvOQPLp3kF9nBp3nqJEG2JGF2QiA@mail.gmail.com>
	<CA+Tgmoa57S6mP=aTOXH2-gDAL4TMO1WbGgrHSg0s6J4zUH=04g@mail.gmail.com>
	<[email protected]>
	<CA+Tgmoaf__2B0BUL+vrg28P+3buX=Ti-kybqkHiLTtFrrCfzuA@mail.gmail.com>
	<CA+TgmoYpcLNOuypOTdgCSLW7FuA=t6BtB3meTARHX2-Dj_81xQ@mail.gmail.com>
	<[email protected]>
	<CA+TgmoZjv9OyFu1Gkt78w0vWEti8S33w8trYHmErf-GMmGSi=w@mail.gmail.com>
	<[email protected]>
	<CA+TgmoaOSBQD9Ux4eG40w723ZN=c0J7p-+oX4+J8urUeyLMo5w@mail.gmail.com>
	<CAOYmi+=g+MMoOpWkk2weXWKJcKH0eKey8gKHHdH0dF4Tiawrhw@mail.gmail.com>
	<CA+TgmobwaT=PXPDDrgDup+jA8KHBbkxigtziD-zNzAKKkQYVgQ@mail.gmail.com>
	<CAOYmi+mOmEW=amDRQMfw6-Fb3ZmDEQFaJzwk8Bc8W8DzaP85XQ@mail.gmail.com>
	<CA+TgmoaX2AMW4cdFM3OngBJxmxpkdmzF33R7-CWhvRLfucbFMg@mail.gmail.com>
	<CAOYmi+njnRGcomnxTY6vsEW3wWigc0ruB0EyWFpb+PVVE8sWpw@mail.gmail.com>
	<CAOYmi+k4AyWCQHK=XVF99KVDuFkqxcADao61OWGLxu0nRYMONQ@mail.gmail.com>
	<CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>

On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <[email protected]> wrote:
>
> Here's v7.
[..]

OK, so I've tested today from Your's branch directly, so I hope that
was also v7. Given the following q20 query:

SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey in
    (SELECT ps_suppkey
     FROM partsupp
     WHERE ps_partkey in
         (SELECT p_partkey
          FROM part
          WHERE p_name LIKE 'forest%' )
       AND ps_availqty >
         (SELECT 0.5 * sum(l_quantity)
          FROM lineitem
          WHERE l_partkey = ps_partkey
            AND l_suppkey = ps_suppkey
            AND l_shipdate >= DATE '1994-01-01'
            AND l_shipdate < DATE '1994-01-01' + INTERVAL '1' year ) )
  AND s_nationkey = n_nationkey
  AND n_name = 'CANADA'
ORDER BY s_name;

in normal conditions (w/o advice) the above query generates:

 Sort  (cost=1010985030.44..1010985030.59 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=0.42..1010985028.63 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=0.42..1010985008.29
rows=1522 width=55)
               Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               ->  Materialize  (cost=0.42..1010755994.57 rows=1973 width=4)
                     ->  Nested Loop  (cost=0.42..1010755984.71
rows=1973 width=4)
                           ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                                 Filter: ((p_name)::text ~~ 'forest%'::text)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..688053.87 rows=1 width=8)
                                 Index Cond: (ps_partkey = part.p_partkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))


 Generated Plan Advice:
   JOIN_ORDER(nation (supplier (part partsupp)))
   NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
   NESTED_LOOP_MATERIALIZE(partsupp)
   SEQ_SCAN(nation supplier part lineitem@expr_1)
   INDEX_SCAN(partsupp public.pk_partsupp)
   SEMIJOIN_NON_UNIQUE((partsupp part))
   NO_GATHER(supplier nation partsupp part lineitem@expr_1)

Please see the - I think it's confusing? -
NESTED_LOOP_MATERIALIZE(partsupp partsupp) - that's 2x the same
string? This causes it to turn into below plan -- I've marked the
problem with [X]

 Sort  (cost=50035755.50..50035755.66 rows=61 width=51)
   Sort Key: supplier.s_name
   ->  Nested Loop  (cost=12562154.32..50035753.70 rows=61 width=51)
         Join Filter: (nation.n_nationkey = supplier.s_nationkey)
         ->  Seq Scan on nation  (cost=0.00..1.31 rows=1 width=4)
               Filter: (n_name = 'CANADA'::bpchar)
         ->  Nested Loop Semi Join  (cost=12562154.32..50035733.36
rows=1522 width=55)
             [X] -- missing Join Filter here
               ->  Seq Scan on supplier  (cost=0.00..249.30 rows=7730 width=59)
               [X] -- HJ instead of Materialize+Nested Loop below:
               ->  Hash Join  (cost=12562154.32..12567002.09 rows=1 width=4)
                     Hash Cond: (part.p_partkey = partsupp.ps_partkey)
                     ->  Seq Scan on part  (cost=0.00..4842.25
rows=1469 width=4)
                           Filter: ((p_name)::text ~~ 'forest%'::text)
                     ->  Hash  (cost=12562154.02..12562154.02 rows=24 width=8)
                           ->  Index Scan using pk_partsupp on
partsupp  (cost=0.42..12562154.02 rows=24 width=8)
                                 [X] -- wrong Index Cond below
(suppkey instead of partkey)
                                 Index Cond: (ps_suppkey = supplier.s_suppkey)
                                 Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
                                 SubPlan expr_1
                                   ->  Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
                                         ->  Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
                                               Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))

Supplied Plan Advice:
   SEQ_SCAN(nation) /* matched */
   SEQ_SCAN(supplier) /* matched */
   SEQ_SCAN(part) /* matched */
   SEQ_SCAN(lineitem@expr_1) /* matched */
   INDEX_SCAN(partsupp public.pk_partsupp) /* matched */
   JOIN_ORDER(nation (supplier (part partsupp))) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
   SEMIJOIN_NON_UNIQUE((partsupp part)) /* matched, conflicting */
   NO_GATHER(supplier) /* matched */
   NO_GATHER(nation) /* matched */
   NO_GATHER(partsupp) /* matched */
   NO_GATHER(part) /* matched */
   NO_GATHER(lineitem@expr_1) /* matched */

So the difference is basically between:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(partsupp
partsupp) NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which causes wrong plan and outcome:
    NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */

and apparently proper advice like below which has better yield:
    set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)
NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which is not generated , but caused good plan, however it also prints:
   NESTED_LOOP_PLAIN(part) /* matched, conflicting, failed */
   NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting */
but that seems "failed" there, seems to be untrue?

Another idea is perhaps, we could have some elog(WARNING) - but not
Asserts() - in assert-only enabled build that could alert us in case
of duplicated entries being detected for the same ops in
pg_plan_advice_explain_feedback()?

-J.





view thread (133+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: pg_plan_advice
  In-Reply-To: <CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox