MIME-Version: 1.0
References: <3FF63E99-AB4F-41A9-BC78-AAB28823FBD0@Outlook.com>
 <6db6d2ec-7529-4add-9a95-178fc318311d@vondra.me> <313ACE5A-CBF1-43B3-9181-10D3E8ADF424@Outlook.com>
 <5abd6054-413c-4f48-9172-d8b31062b266@vondra.me> <cb313155-24c4-4838-a46b-44968993a6e2@vondra.me>
 <938E2286-9B0D-4F8D-A916-8E0E35D55034@Outlook.com> <e82d4302-2450-4915-93a5-7df75f69c385@vondra.me>
 <CANWCAZY529EPHyo1kLnEzjFBq-UaDPc3KErK=ApqDZZ1Oc-XHg@mail.gmail.com>
 <CAFj8pRCO5ocbr-wFWx5QsKdfkW-=XuQ6zkW5FES7ERQZQHtpwQ@mail.gmail.com>
 <982de4a4-71b6-4d1d-afe2-35b1c5d43529@vondra.me> <CAFj8pRASJuRQKHOoBTnR5aRUeRKpNAmrYQcBrQb=yqeZ_8me9Q@mail.gmail.com>
 <CAEvyyTi1M6JhHb6sR+xK-kp2bezMoADSC+RY2A+DbdEn+_BLxA@mail.gmail.com>
 <8927A117-A7EA-41E8-94B3-0B4F7767DA8B@outlook.com> <CAEvyyTizQ9ki++g0P8-2Ae2OundUb-2=cS2-PQHe-LYPzhSS1A@mail.gmail.com>
In-Reply-To: <CAEvyyTizQ9ki++g0P8-2Ae2OundUb-2=cS2-PQHe-LYPzhSS1A@mail.gmail.com>
From: lakshmi <lakshmigcdac@gmail.com>
Date: Mon, 16 Feb 2026 12:29:24 +0530
Message-ID: <CAEvyyTjqcn9RwFb-S_Kx-+3b_Zg9YCUyhqoDsgvUcrR=pkMB0A@mail.gmail.com>
Subject: Re: Add a greedy join search algorithm to handle large join problems
To: Chengpeng Yan <chengpeng_yan@outlook.com>
Cc: Pavel Stehule <pavel.stehule@gmail.com>, Tomas Vondra <tomas@vondra.me>, 
	John Naylor <johncnaylorls@gmail.com>, 
	"pgsql-hackers@lists.postgresql.org" <pgsql-hackers@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="000000000000b47910064aeb74a9"
Archived-At: <https://www.postgresql.org/message-id/CAEvyyTjqcn9RwFb-S_Kx-%2B3b_Zg9YCUyhqoDsgvUcrR%3DpkMB0A%40mail.gmail.com>
Precedence: bulk

--000000000000b47910064aeb74a9
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Chengpeng,

I ran a quick comparison of the GOO v5 greedy strategies on a multi-join
workload to look at execution quality in addition to planning time.

Here are the results from one representative query:

  - cost: planning ~1.8 ms | execution ~32.8 ms
  - result_size: planning ~2.1 ms | execution ~29.3 ms
  - combined: planning ~3.7 ms | execution ~29.4 ms

All three strategies keep planning time very low, which continues to
support GOO’s intended scalability advantage over DP and GEQO.
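
For readers new to the thread, the greedy idea behind these strategies can
be sketched in a few lines of Python. This is only a toy model and not the
patch's C code: the function name greedy_join_order, the card/sel inputs,
and both scoring lambdas are hypothetical, and the "cost" formula is a
stand-in rather than PostgreSQL's cost model. Each round scans every pair
of remaining relations, so n tables need n-1 rounds of at most n(n-1)/2
pair checks, which is why planning stays cheap as the join grows.

```python
from itertools import combinations


def greedy_join_order(card, sel, score):
    """Toy greedy join ordering: merge the best pair until one is left.

    card:  {table: estimated rows}
    sel:   {frozenset({t1, t2}): selectivity of the join predicate}
    score: f(left_rows, right_rows, out_rows) -> value to minimize
    """
    # Map each set of base tables to (join tree, estimated rows).
    rels = {frozenset([t]): (t, float(n)) for t, n in card.items()}
    while len(rels) > 1:
        best = None
        for a, b in combinations(list(rels), 2):
            # Collect the selectivity of every predicate linking a and b.
            edges = [sel[frozenset((x, y))] for x in a for y in b
                     if frozenset((x, y)) in sel]
            if not edges:
                continue  # no join predicate: skip cross joins in this toy
            rows = rels[a][1] * rels[b][1]
            for s in edges:
                rows *= s
            # The sorted table names break ties deterministically.
            key = (score(rels[a][1], rels[b][1], rows), sorted(a | b))
            if best is None or key < best[0]:
                best = (key, a, b, rows)
        if best is None:
            raise ValueError("join graph is disconnected")
        _, a, b, rows = best
        left, right = rels.pop(a)[0], rels.pop(b)[0]
        rels[a | b] = ((left, right), rows)
    return next(iter(rels.values()))


# Hypothetical 4-table chain a - b - c - d with made-up statistics.
card = {"a": 1000, "b": 100, "c": 10, "d": 10000}
sel = {frozenset("ab"): 0.01, frozenset("bc"): 0.1, frozenset("cd"): 0.0005}

# Two greedy signals: smallest join result, and a stand-in "cost".
by_size = greedy_join_order(card, sel, lambda left, right, out: out)
by_cost = greedy_join_order(card, sel,
                            lambda left, right, out: left + right + out)
print(by_size[0])
print(by_cost[0])
```

On this made-up chain the two signals pick different first joins (c with d
for the size signal, b with c for the cost-like signal), which is the kind
of divergence a strategy comparison is meant to expose.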

In this test, the cost strategy shows noticeably higher execution time,
while result_size and combined produce similar and better execution
performance. Since combined has slightly higher planning overhead without a
clear execution benefit here, result_size appears to provide the best
trade-off for this synthetic workload.
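
To make the trade-off concrete, here is a minimal Python sketch that simply
sums the planning and execution times reported above. Weighting the two
equally is an assumption made for illustration, and a single query is far
too little to rank the strategies in general.

```python
# Timings in ms, copied from the results above: (planning, execution).
timings = {
    "cost": (1.8, 32.8),
    "result_size": (2.1, 29.3),
    "combined": (3.7, 29.4),
}


def total_ms(strategy):
    """Planning plus execution time for one strategy."""
    planning, execution = timings[strategy]
    return planning + execution


# Rank strategies by end-to-end time, lowest first.
ranked = sorted(timings, key=total_ms)
for name in ranked:
    planning, execution = timings[name]
    print(f"{name:12s} planning {planning:4.1f}  "
          f"execution {execution:4.1f}  total {total_ms(name):4.1f} ms")
```

By this measure result_size comes out ahead on this query, with combined
second and cost last.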

I plan to continue testing with a few JOB-based queries to evaluate plan
quality in a more realistic setting and will share further results if
anything notable appears.

Thanks again for the continued work on this patch series.

Regards,
Lakshmi

On Mon, Feb 16, 2026 at 11:57 AM lakshmi <lakshmigcdac@gmail.com> wrote:

> Hi Chengpeng,
>
> I tested the v5 patch on a clean build from current PostgreSQL master. The
> patch applied cleanly, the server built successfully, and
> `make check-world` passed without new failures.
> I then compared DP, GEQO, and GOO on synthetic multi-join workloads.
>
> *15-table join*
>
>    - DP: planning ~19.1 ms | execution ~21.0 ms
>    - GEQO: planning ~46.6 ms | execution ~17.7 ms
>    - GOO v5: planning ~1.0 ms | execution ~19.5 ms
>
> *20-table join*
>
>    - DP: planning ~25.1 ms | execution ~28.2 ms
>    - GEQO: planning ~27.5 ms | execution ~23.5 ms
>    - GOO v5: planning ~1.5 ms | execution ~28.9 ms
>
> Across both join sizes, GOO v5 keeps planning time extremely low (roughly
> an order of magnitude lower than DP/GEQO) while execution times remain
> in a comparable range, with no obvious regressions in this synthetic
> workload. This appears consistent with the goal of reducing planning
> overhead for large join problems while preserving similar plan quality.
>
> These tests use controlled synthetic joins rather than JOB/TPC-H, so
> they mainly validate planning-time scaling and basic plan sanity. I plan
> to continue with more realistic workloads and strategy comparisons and
> will share further results if anything notable appears.
>
> Thanks for the continued work on this patch series.
>
> Regards,
> Lakshmi
>
>
> On Sat, Feb 14, 2026 at 11:09 AM Chengpeng Yan <chengpeng_yan@outlook.com>
> wrote:
>
>>
>> > On Feb 13, 2026, at 19:14, lakshmi <lakshmigcdac@gmail.com> wrote:
>> >
>> > Hi all,
>> > I tested the latest GOO patch (v4) on a fresh build from the current
>> > PostgreSQL master. The patch applied cleanly, the server built without
>> > issues, and regression tests passed except for the expected EXPLAIN
>> > output differences due to the new join ordering behavior.
>> >
>> > As a quick sanity check, I compared DP, GEQO, and GOO on a small
>> > multi-join query:
>> >
>> >      DP planning: ~0.66 ms
>> >      GEQO planning: ~2.28 ms
>> >      GOO planning: ~0.38 ms
>> > Execution times were similar across all three (~1.5–1.7 ms) with no
>> > correctness issues. Even on a small join, GEQO shows higher planning
>> > overhead, while GOO plans faster with comparable execution cost.
>> > I then evaluated scaling using synthetic 15-table and 20-table joins
>> > with EXPLAIN (ANALYZE, TIMING OFF), shown as planning | execution:
>> >
>> >      15 tables
>> >      DP: ~22.9 ms | ~23.4 ms
>> >      GEQO: ~46.7 ms | ~20.5 ms
>> >      GOO: ~1.8 ms | ~22.4 ms
>> >
>> >      20 tables
>> >      DP: ~48.1 ms | ~30.5 ms
>> >      GEQO: ~51.0 ms | ~26.7 ms
>> >      GOO: ~3.2 ms | ~29.0 ms
>> >
>> > Planning time increases notably for DP and remains relatively high for
>> > GEQO, while GOO stays very low even at 20 tables, indicating
>> > substantially reduced planning overhead. Execution times remain broadly
>> > comparable, with no obvious regressions from GOO in this synthetic
>> > workload.
>> >
>> > Although this uses a controlled synthetic join graph rather than
>> > JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of
>> > significantly cheaper planning than DP/GEQO while preserving similar
>> > plan quality.
>> >
>> > I plan to continue testing with more realistic workloads and will
>> > share further results if anything notable appears.
>> >
>> > Thanks for the interesting work.
>> >
>> > Regards,
>> > Lakshmi
>>
>> Hi,
>>
>> Thank you very much for testing v4 and sharing the results. I really
>> appreciate the effort and the detailed feedback.
>>
>> I also agree with Tomas’s point that we need better benchmark context to
>> evaluate plan quality, not only planning time.
>>
>> I’ve prepared a v5 refresh on top of v4, still split into two patches
>> (v5-0001 and v5-0002).
>> I also ran `make check-world` on current master with v5 applied, and it
>> passes on my side.
>>
>> Compared with v4:
>>
>> [PATCH v5 1/2]
>> - keeps the base GOO join-search path focused on a single greedy signal
>> (cost);
>> - fixes issues found during recent testing (mainly around candidate
>> probing/cleanup and failure paths);
>> - improves stability/determinism in candidate selection (including tie
>> handling);
>> - updates regression outputs accordingly.
>>
>> [PATCH v5 2/2]
>> - extends `goo_greedy_strategy` and adds the `selectivity` heuristic
>> suggested by Tomas;
>> - improves combined mode so multiple greedy signals are evaluated in a
>> common framework, and the final plan is selected by lowest estimated
>> `total_cost`;
>> - keeps strategy-layer changes isolated from the base path for easier
>> comparison and review.
>>
>> My current next steps are:
>>
>> 1. Continue evaluating plan quality on more datasets/workloads. I’ve
>> already collected several candidate tests: some are JOB-based
>> variants, and others are synthetic workloads. Next, I plan to
>> consolidate these into a unified test set (with reproducible
>> setup/details), publish it, and run broader comparative evaluation.
>>
>> 2. Prototype a hybrid handoff approach: use greedy contraction first to
>> reduce the join graph, then let DP optimize the reduced problem. The
>> goal is a smoother transition around the threshold, avoiding abrupt
>> plan-shape changes from a hard optimizer switch.
>>
>> 3. Explore more join-ordering improvements incrementally, including
>> ideas from =E2=80=9CSimplicity Done Right for Join Ordering=E2=80=9D and=
 related
>> work.
>>
>> Thanks again for the careful testing and detailed feedback.
>>
>> --
>> Best regards,
>> Chengpeng Yan
>>
>

--000000000000b47910064aeb74a9--