public inbox for [email protected]
help / color / mirror / Atom feedFrom: Jakub Wartak <[email protected]>
To: Robert Haas <[email protected]>
Cc: Jacob Champion <[email protected]>
Cc: Dian Fay <[email protected]>
Cc: Matheus Alcantara <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_plan_advice
Date: Wed, 17 Dec 2025 11:12:40 +0100
Message-ID: <CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com> (raw)
In-Reply-To: <CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
References: <CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com>
<CAKZiRmxtJAFG7e1+Vs9B8ngON=AOzJbuws+1ZeH4LsbJh5AzoQ@mail.gmail.com>
<CA+TgmoY9Ne_Sh10u6LSPc3wvOQPLp3kF9nBp3nqJEG2JGF2QiA@mail.gmail.com>
<CA+Tgmoa57S6mP=aTOXH2-gDAL4TMO1WbGgrHSg0s6J4zUH=04g@mail.gmail.com>
<[email protected]>
<CA+Tgmoaf__2B0BUL+vrg28P+3buX=Ti-kybqkHiLTtFrrCfzuA@mail.gmail.com>
<CA+TgmoYpcLNOuypOTdgCSLW7FuA=t6BtB3meTARHX2-Dj_81xQ@mail.gmail.com>
<[email protected]>
<CA+TgmoZjv9OyFu1Gkt78w0vWEti8S33w8trYHmErf-GMmGSi=w@mail.gmail.com>
<[email protected]>
<CA+TgmoaOSBQD9Ux4eG40w723ZN=c0J7p-+oX4+J8urUeyLMo5w@mail.gmail.com>
<CAOYmi+=g+MMoOpWkk2weXWKJcKH0eKey8gKHHdH0dF4Tiawrhw@mail.gmail.com>
<CA+TgmobwaT=PXPDDrgDup+jA8KHBbkxigtziD-zNzAKKkQYVgQ@mail.gmail.com>
<CAOYmi+mOmEW=amDRQMfw6-Fb3ZmDEQFaJzwk8Bc8W8DzaP85XQ@mail.gmail.com>
<CA+TgmoaX2AMW4cdFM3OngBJxmxpkdmzF33R7-CWhvRLfucbFMg@mail.gmail.com>
<CAOYmi+njnRGcomnxTY6vsEW3wWigc0ruB0EyWFpb+PVVE8sWpw@mail.gmail.com>
<CAOYmi+k4AyWCQHK=XVF99KVDuFkqxcADao61OWGLxu0nRYMONQ@mail.gmail.com>
<CA+TgmoZ0x3ym_oueXRWzbM_=6ucKoPZVGj3rRMLBDC_FnetXDw@mail.gmail.com>
On Mon, Dec 15, 2025 at 9:06 PM Robert Haas <[email protected]> wrote:
>
> Here's v7.
[..]
OK, so I've tested today from Your's branch directly, so I hope that
was also v7. Given the following q20 query:
SELECT s_name, s_address
FROM supplier, nation
WHERE s_suppkey in
(SELECT ps_suppkey
FROM partsupp
WHERE ps_partkey in
(SELECT p_partkey
FROM part
WHERE p_name LIKE 'forest%' )
AND ps_availqty >
(SELECT 0.5 * sum(l_quantity)
FROM lineitem
WHERE l_partkey = ps_partkey
AND l_suppkey = ps_suppkey
AND l_shipdate >= DATE '1994-01-01'
AND l_shipdate < DATE '1994-01-01' + INTERVAL '1' year ) )
AND s_nationkey = n_nationkey
AND n_name = 'CANADA'
ORDER BY s_name;
in normal conditions (w/o advice) the above query generates:
Sort (cost=1010985030.44..1010985030.59 rows=61 width=51)
Sort Key: supplier.s_name
-> Nested Loop (cost=0.42..1010985028.63 rows=61 width=51)
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
-> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4)
Filter: (n_name = 'CANADA'::bpchar)
-> Nested Loop Semi Join (cost=0.42..1010985008.29
rows=1522 width=55)
Join Filter: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Seq Scan on supplier (cost=0.00..249.30 rows=7730 width=59)
-> Materialize (cost=0.42..1010755994.57 rows=1973 width=4)
-> Nested Loop (cost=0.42..1010755984.71
rows=1973 width=4)
-> Seq Scan on part (cost=0.00..4842.25
rows=1469 width=4)
Filter: ((p_name)::text ~~ 'forest%'::text)
-> Index Scan using pk_partsupp on
partsupp (cost=0.42..688053.87 rows=1 width=8)
Index Cond: (ps_partkey = part.p_partkey)
Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
SubPlan expr_1
-> Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
-> Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))
Generated Plan Advice:
JOIN_ORDER(nation (supplier (part partsupp)))
NESTED_LOOP_PLAIN(partsupp partsupp) <--- [X]
NESTED_LOOP_MATERIALIZE(partsupp)
SEQ_SCAN(nation supplier part lineitem@expr_1)
INDEX_SCAN(partsupp public.pk_partsupp)
SEMIJOIN_NON_UNIQUE((partsupp part))
NO_GATHER(supplier nation partsupp part lineitem@expr_1)
Please see the - I think it's confusing? -
NESTED_LOOP_MATERIALIZE(partsupp partsupp) - that's 2x the same
string? This causes it to turn into below plan -- I've marked the
problem with [X]
Sort (cost=50035755.50..50035755.66 rows=61 width=51)
Sort Key: supplier.s_name
-> Nested Loop (cost=12562154.32..50035753.70 rows=61 width=51)
Join Filter: (nation.n_nationkey = supplier.s_nationkey)
-> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4)
Filter: (n_name = 'CANADA'::bpchar)
-> Nested Loop Semi Join (cost=12562154.32..50035733.36
rows=1522 width=55)
[X] -- missing Join Filter here
-> Seq Scan on supplier (cost=0.00..249.30 rows=7730 width=59)
[X] -- HJ instead of Materialize+Nested Loop below:
-> Hash Join (cost=12562154.32..12567002.09 rows=1 width=4)
Hash Cond: (part.p_partkey = partsupp.ps_partkey)
-> Seq Scan on part (cost=0.00..4842.25
rows=1469 width=4)
Filter: ((p_name)::text ~~ 'forest%'::text)
-> Hash (cost=12562154.02..12562154.02 rows=24 width=8)
-> Index Scan using pk_partsupp on
partsupp (cost=0.42..12562154.02 rows=24 width=8)
[X] -- wrong Index Cond below
(suppkey instead of partkey)
Index Cond: (ps_suppkey = supplier.s_suppkey)
Filter: ((ps_availqty)::numeric >
(SubPlan expr_1))
SubPlan expr_1
-> Aggregate
(cost=172009.42..172009.44 rows=1 width=32)
-> Seq Scan on lineitem
(cost=0.00..172009.42 rows=1 width=5)
Filter: ((l_shipdate >=
'1994-01-01'::date) AND (l_shipdate < '1995-01-01 00:00:00'::timestamp
without time zone) AND (l_partkey = partsupp.ps_partkey) AND
(l_suppkey = partsupp.ps_suppkey))
Supplied Plan Advice:
SEQ_SCAN(nation) /* matched */
SEQ_SCAN(supplier) /* matched */
SEQ_SCAN(part) /* matched */
SEQ_SCAN(lineitem@expr_1) /* matched */
INDEX_SCAN(partsupp public.pk_partsupp) /* matched */
JOIN_ORDER(nation (supplier (part partsupp))) /* matched, conflicting */
NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
NESTED_LOOP_PLAIN(partsupp) /* matched, conflicting */
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
SEMIJOIN_NON_UNIQUE((partsupp part)) /* matched, conflicting */
NO_GATHER(supplier) /* matched */
NO_GATHER(nation) /* matched */
NO_GATHER(partsupp) /* matched */
NO_GATHER(part) /* matched */
NO_GATHER(lineitem@expr_1) /* matched */
So the difference is basically between:
set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(partsupp
partsupp) NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which causes wrong plan and outcome:
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting, failed */
and apparently proper advice like below which has better yield:
set pg_plan_advice.advice = '[..] NESTED_LOOP_PLAIN(part partsupp)
NESTED_LOOP_MATERIALIZE(partsupp) [..]';
which is not generated , but caused good plan, however it also prints:
NESTED_LOOP_PLAIN(part) /* matched, conflicting, failed */
NESTED_LOOP_MATERIALIZE(partsupp) /* matched, conflicting */
but that seems "failed" there, seems to be untrue?
Another idea is perhaps, we could have some elog(WARNING) - but not
Asserts() - in assert-only enabled build that could alert us in case
of duplicated entries being detected for the same ops in
pg_plan_advice_explain_feedback()?
-J.
view thread (133+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: pg_plan_advice
In-Reply-To: <CAKZiRmx538gbg=sV8CQUoR2zf6q5Y3c_1pGBd1A7XONG-5c2eg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox