public inbox for [email protected]
help / color / mirror / Atom feedRe: Add a greedy join search algorithm to handle large join problems
18+ messages / 4 participants
[nested] [flat]
* Re: Add a greedy join search algorithm to handle large join problems
@ 2026-02-13 11:14 lakshmi <[email protected]>
2026-02-13 13:18 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
0 siblings, 2 replies; 18+ messages in thread
From: lakshmi @ 2026-02-13 11:14 UTC (permalink / raw)
To: Pavel Stehule <[email protected]>; +Cc: Tomas Vondra <[email protected]>; John Naylor <[email protected]>; Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
HI all,
I tested the latest GOO patch (v4) on a fresh build from the current
PostgreSQL master. The patch applied cleanly, the server built without
issues, and regression tests passed except for the expected EXPLAIN output
differences due to the new join ordering behavior.
As a quick sanity check, I compared DP, GEQO, and GOO on a small multi-join
query:
DP planning: ~0.66 ms
GEQO planning: ~2.28 ms
GOO planning: ~0.38 ms
Execution times were similar across all three (~1.5–1.7 ms) with no
correctness issues. Even on a small join, GEQO shows higher planning
overhead, while GOO plans faster with comparable execution cost.
I then evaluated scaling using synthetic 15-table and 20-table joins with
EXPLAIN (ANALYZE, TIMING OFF):
15 tables
DP: ~22.9 ms | ~23.4 ms
GEQO: ~46.7 ms | ~20.5 ms
GOO: ~1.8 ms | ~22.4 ms
20 tables
DP: ~48.1 ms | ~30.5 ms
GEQO: ~51.0 ms | ~26.7 ms
GOO: ~3.2 ms | ~29.0 ms
Planning time increases notably for DP and remains relatively high for
GEQO, while GOO stays very low even at 20 joins, indicating substantially
reduced planning overhead. Execution costs remain broadly comparable, with
no obvious regressions from GOO in this synthetic workload.
Although this uses a controlled synthetic join graph rather than JOB/TPC-H,
the scaling behavior appears consistent with GOO’s goal of significantly
cheaper planning than DP/GEQO while preserving similar plan quality.
I plan to continue testing with more realistic workloads and will share
further results if anything notable appears.
Thanks for the interesting work.
Regards,
Lakshmi
On Fri, Feb 13, 2026 at 12:33 PM Pavel Stehule <[email protected]>
wrote:
>
>
> čt 11. 12. 2025 v 18:07 odesílatel Tomas Vondra <[email protected]> napsal:
>
>> On 12/11/25 07:12, Pavel Stehule wrote:
>> >
>> >
>> > čt 11. 12. 2025 v 3:53 odesílatel John Naylor <[email protected]
>> > <mailto:[email protected]>> napsal:
>> >
>> > On Wed, Dec 10, 2025 at 5:20 PM Tomas Vondra <[email protected]
>> > <mailto:[email protected]>> wrote:
>> > > I did however notice an interesting thing - running EXPLAIN on
>> the 99
>> > > queries (for 3 scales and 0/4 workers, so 6x 99) took this much
>> time:
>> > >
>> > > master: 8s
>> > > master/geqo: 20s
>> > > master/goo: 5s
>> >
>> > > It's nice that "goo" seems to be faster than "geqo" - assuming the
>> > plans
>> > > are comparable or better. But it surprised me switching to geqo
>> > makes it
>> > > slower than master. That goes against my intuition that geqo is
>> > meant to
>> > > be cheaper/faster join order planning. But maybe I'm missing
>> > something.
>> >
>> > Yeah, that was surprising. It seems that geqo has a large overhead,
>> so
>> > it takes a larger join problem for the asymptotic behavior to win
>> over
>> > exhaustive search.
>> >
>> >
>> > If I understand correctly to design - geqo should be slower for any
>> > queries with smaller complexity. The question is how many queries in the
>> > tested model are really complex.
>> >
>>
>> Depends on what you mean by "really complex". TPC-DS queries are not
>> trivial, but the complexity may not be in the number of joins.
>>
>> Of course, setting geqo_threshold to 2 may be too aggressive. Not sure.
>>
>
> I checked the TPC-H queries and almost all queries are simple - 5 x JOIN
> -- 2x nested subselect
>
>
>>
>>
>> regards
>>
>> --
>> Tomas Vondra
>>
>>
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
@ 2026-02-13 13:18 ` Tomas Vondra <[email protected]>
1 sibling, 0 replies; 18+ messages in thread
From: Tomas Vondra @ 2026-02-13 13:18 UTC (permalink / raw)
To: lakshmi <[email protected]>; Pavel Stehule <[email protected]>; +Cc: John Naylor <[email protected]>; Chengpeng Yan <[email protected]>; [email protected] <[email protected]>
Hello,
On 2/13/26 12:14, lakshmi wrote:
> HI all,
> I tested the latest GOO patch (v4) on a fresh build from the current
> PostgreSQL master. The patch applied cleanly, the server built without
> issues, and regression tests passed except for the expected EXPLAIN
> output differences due to the new join ordering behavior.
>
> As a quick sanity check, I compared DP, GEQO, and GOO on a small multi-
> join query:
>
> DP planning: ~0.66 ms
> GEQO planning: ~2.28 ms
> GOO planning: ~0.38 ms
> Execution times were similar across all three (~1.5–1.7 ms) with no
> correctness issues. Even on a small join, GEQO shows higher planning
> overhead, while GOO plans faster with comparable execution cost.
> I then evaluated scaling using synthetic 15-table and 20-table joins
> with EXPLAIN (ANALYZE, TIMING OFF):
> 15 tables
> DP: ~22.9 ms | ~23.4 ms
> GEQO: ~46.7 ms | ~20.5 ms
> GOO: ~1.8 ms | ~22.4 ms
>
> 20 tables
> DP: ~48.1 ms | ~30.5 ms
> GEQO: ~51.0 ms | ~26.7 ms
> GOO: ~3.2 ms | ~29.0 ms
>
> Planning time increases notably for DP and remains relatively high for
> GEQO, while GOO stays very low even at 20 joins, indicating substantially
> reduced planning overhead. Execution costs remain broadly comparable,
> with no obvious regressions from GOO in this synthetic workload.
>
> Although this uses a controlled synthetic join graph rather than JOB/
> TPC-H, the scaling behavior appears consistent with GOO’s goal of
> significantly cheaper planning than DP/GEQO while preserving similar
> plan quality.
>
I think we need to know much more about the synthetic benchmark to judge
the plan quality. For example, for OLTP-like schema/queries, processing
only very few rows in each join, the different join methods are likely
to perform about the same. The performance will differ for larger joins
(matching large number of rows / large intermediate results).
The other question is whether the plans produced by the join search
algorithms even differ.
regards
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
@ 2026-02-14 05:39 ` Chengpeng Yan <[email protected]>
2026-02-16 06:27 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
1 sibling, 2 replies; 18+ messages in thread
From: Chengpeng Yan @ 2026-02-14 05:39 UTC (permalink / raw)
To: lakshmi <[email protected]>; +Cc: Pavel Stehule <[email protected]>; Tomas Vondra <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
> 2026年2月13日 19:14,lakshmi <[email protected]> 写道:
>
> HI all,
> I tested the latest GOO patch (v4) on a fresh build from the current PostgreSQL master. The patch applied cleanly, the server built without issues, and regression tests passed except for the expected EXPLAIN output differences due to the new join ordering behavior.
>
> As a quick sanity check, I compared DP, GEQO, and GOO on a small multi-join query:
>
> DP planning: ~0.66 ms
> GEQO planning: ~2.28 ms
> GOO planning: ~0.38 ms
> Execution times were similar across all three (~1.5–1.7 ms) with no correctness issues. Even on a small join, GEQO shows higher planning overhead, while GOO plans faster with comparable execution cost.
> I then evaluated scaling using synthetic 15-table and 20-table joins with EXPLAIN (ANALYZE, TIMING OFF):
> 15 tables
> DP: ~22.9 ms | ~23.4 ms
> GEQO: ~46.7 ms | ~20.5 ms
> GOO: ~1.8 ms | ~22.4 ms
>
> 20 tables
> DP: ~48.1 ms | ~30.5 ms
> GEQO: ~51.0 ms | ~26.7 ms
> GOO: ~3.2 ms | ~29.0 ms
>
> Planning time increases notably for DP and remains relatively high for GEQO, while GOO stays very low even at 20 joins, indicating substantially
> reduced planning overhead. Execution costs remain broadly comparable, with no obvious regressions from GOO in this synthetic workload.
>
> Although this uses a controlled synthetic join graph rather than JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of significantly cheaper planning than DP/GEQO while preserving similar plan quality.
>
> I plan to continue testing with more realistic workloads and will share further results if anything notable appears.
>
> Thanks for the interesting work.
>
> Regards,
> Lakshmi
Hi,
Thank you very much for testing v4 and sharing the results. I really
appreciate the effort and the detailed feedback.
I also agree with Tomas’s point that we need better benchmark context to
evaluate plan quality, not only planning time.
I’ve prepared a v5 refresh on top of v4, still split into two patches
(v5-0001 and v5-0002).
I also ran `make check-world` on current master with v5 applied, and it
passes on my side.
Compared with v4:
[PATCH v5 1/2]
- keeps the base GOO join-search path focused on a single greedy signal
(cost);
- fixes issues found during recent testing (mainly around candidate
probing/cleanup and failure paths);
- improves stability/determinism in candidate selection (including tie
handling);
- updates regression outputs accordingly.
[PATCH v5 2/2]
- extends `goo_greedy_strategy` and adds the `selectivity` heuristic
suggested by Tomas;
- improves combined mode so multiple greedy signals are evaluated in a
common framework, and the final plan is selected by lowest estimated
`total_cost`;
- keeps strategy-layer changes isolated from the base path for easier
comparison and review.
My current next steps are:
1. Continue evaluating plan quality on more datasets/workloads. I’ve
already collected several candidate tests: some are JOB-based
variants, and others are synthetic workloads. Next, I plan to
consolidate these into a unified test set (with reproducible
setup/details), publish it, and run broader comparative evaluation.
2. Prototype a hybrid handoff approach: use greedy contraction first to
reduce the join graph, then let DP optimize the reduced problem. The
goal is a smoother transition around the threshold, avoiding abrupt
plan-shape changes from a hard optimizer switch.
3. Explore more join-ordering improvements incrementally, including
ideas from “Simplicity Done Right for Join Ordering” and related
work.
Thanks again for the careful testing and detailed feedback.
--
Best regards,
Chengpeng Yan
Attachments:
[application/octet-stream] v5-0001-Add-GOO-Greedy-Operator-Ordering-join-search-as-a.patch (62.9K, 3-v5-0001-Add-GOO-Greedy-Operator-Ordering-join-search-as-a.patch)
download | inline diff:
From c87540122d1756d10083f81c7ce29d8163d51dfb Mon Sep 17 00:00:00 2001
From: Chengpeng Yan <[email protected]>
Date: Sat, 14 Feb 2026 11:43:24 +0800
Subject: [PATCH v5 1/2] Add GOO (Greedy Operator Ordering) join search as an
alternative to GEQO
Introduce a greedy join search algorithm (GOO) to handle
large join problems. GOO builds join relations iteratively, maintaining
a list of clumps (join components) and committing to the cheapest
legal join at each step until only one clump remains.
Signed-off-by: Chengpeng Yan <[email protected]>
---
src/backend/optimizer/path/Makefile | 1 +
src/backend/optimizer/path/allpaths.c | 4 +
src/backend/optimizer/path/goo.c | 608 +++++++++++++++
src/backend/optimizer/path/meson.build | 1 +
src/backend/utils/misc/guc_parameters.dat | 9 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/optimizer/goo.h | 23 +
src/include/optimizer/paths.h | 2 +-
src/test/regress/expected/goo.out | 700 ++++++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/parallel_schedule | 9 +-
src/test/regress/sql/goo.sql | 364 +++++++++
12 files changed, 1720 insertions(+), 5 deletions(-)
create mode 100644 src/backend/optimizer/path/goo.c
create mode 100644 src/include/optimizer/goo.h
create mode 100644 src/test/regress/expected/goo.out
create mode 100644 src/test/regress/sql/goo.sql
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f7..3bc825cd845 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -17,6 +17,7 @@ OBJS = \
clausesel.o \
costsize.o \
equivclass.o \
+ goo.o \
indxpath.o \
joinpath.o \
joinrels.o \
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 90275e25872..6a56c6bab61 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -35,6 +35,7 @@
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
#include "optimizer/geqo.h"
+#include "optimizer/goo.h"
#include "optimizer/optimizer.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
@@ -3898,6 +3899,9 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
if (join_search_hook)
return (*join_search_hook) (root, levels_needed, initial_rels);
+ /* WIP: for now use geqo_threshold for testing */
+ else if (enable_goo_join_search && levels_needed >= geqo_threshold)
+ return goo_join_search(root, levels_needed, initial_rels);
else if (enable_geqo && levels_needed >= geqo_threshold)
return geqo(root, levels_needed, initial_rels);
else
diff --git a/src/backend/optimizer/path/goo.c b/src/backend/optimizer/path/goo.c
new file mode 100644
index 00000000000..e49a9f372ef
--- /dev/null
+++ b/src/backend/optimizer/path/goo.c
@@ -0,0 +1,608 @@
+/*-------------------------------------------------------------------------
+ *
+ * goo.c
+ * Greedy operator ordering (GOO) join search for large join problems
+ *
+ * GOO is a deterministic greedy operator ordering algorithm that constructs
+ * join relations iteratively, always committing to the cheapest legal join at
+ * each step. The algorithm maintains a list of "clumps" (join components),
+ * initially one per base relation. At each iteration, it evaluates all legal
+ * pairs of clumps, selects the pair that produces the cheapest join according
+ * to the planner's cost model, and replaces those two clumps with the
+ * resulting joinrel. This continues until only one clump remains.
+ *
+ * ALGORITHM COMPLEXITY:
+ *
+ * Time Complexity: O(n^3) where n is the number of base relations.
+ * - The algorithm performs (n - 1) iterations, merging two clumps each time.
+ * - At iteration i, there are (n - i + 1) remaining clumps, requiring
+ * O((n-i)^2) pair evaluations to find the cheapest join.
+ * - Total: Sum of (n-i)^2 for i=1 to n-1 ≈ O(n^3)
+ *
+ * REFERENCES:
+ *
+ * This implementation is based on the algorithm described in:
+ *
+ * Leonidas Fegaras, "A New Heuristic for Optimizing Large Queries",
+ * Proceedings of the 9th International Conference on Database and Expert
+ * Systems Applications (DEXA '98), August 1998, Pages 726-735.
+ * https://dl.acm.org/doi/10.5555/648311.754892
+ *
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/optimizer/path/goo.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "miscadmin.h"
+#include "nodes/bitmapset.h"
+#include "nodes/pathnodes.h"
+#include "optimizer/geqo.h"
+#include "optimizer/goo.h"
+#include "optimizer/joininfo.h"
+#include "optimizer/pathnode.h"
+#include "optimizer/paths.h"
+#include "utils/hsearch.h"
+#include "utils/memutils.h"
+
+/*
+ * Configuration defaults. These are exposed as GUCs in guc_tables.c.
+ */
+bool enable_goo_join_search = false;
+
+/*
+ * Working state for a single GOO search invocation.
+ *
+ * This structure holds all the state needed during a greedy join order search.
+ * It manages three memory contexts with different lifetimes to avoid memory
+ * bloat during large join searches.
+ *
+ * TODO: Consider using the extension_state mechanism in PlannerInfo (similar
+ * to GEQO's approach) instead of passing GooState separately.
+ */
+typedef struct GooState
+{
+ PlannerInfo *root; /* global planner state */
+ MemoryContext goo_cxt; /* long-lived (per-search) allocations */
+ MemoryContext cand_cxt; /* per-iteration candidate storage */
+ MemoryContext scratch_cxt; /* per-candidate speculative evaluation */
+ List *clumps; /* remaining join components (RelOptInfo *) */
+
+ /*
+ * "clumps" are similar to GEQO's concept (see geqo_eval.c): join
+ * components that haven't been merged yet. Initially one per base
+ * relation, gradually merged until one remains.
+ */
+ bool prune_cartesian; /* skip clauseless joins in this pass? */
+} GooState;
+
+/*
+ * Candidate join between two clumps.
+ *
+ * This structure holds the greedy metrics from a speculative joinrel
+ * evaluation. We create this lightweight structure in cand_cxt after discarding
+ * the actual joinrel from scratch_cxt, allowing us to compare many candidates
+ * without exhausting memory.
+ */
+typedef struct GooCandidate
+{
+ RelOptInfo *left; /* left input clump */
+ RelOptInfo *right; /* right input clump */
+ Cost total_cost; /* total cost of cheapest path */
+ Relids joinrelids; /* relids covered by this join */
+} GooCandidate;
+
+static GooState * goo_init_state(PlannerInfo *root, List *initial_rels);
+static void goo_destroy_state(GooState * state);
+static RelOptInfo *goo_search_internal(GooState * state);
+static void goo_reset_probe_state(GooState * state, int saved_rel_len,
+ struct HTAB *saved_hash);
+static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
+ RelOptInfo *right);
+static RelOptInfo *goo_commit_join(GooState * state, GooCandidate * cand);
+static bool goo_candidate_better(GooCandidate * a, GooCandidate * b);
+static bool goo_candidate_prunable(GooState * state, RelOptInfo *left,
+ RelOptInfo *right);
+
+/*
+ * goo_join_search
+ * Entry point for Greedy Operator Ordering join search algorithm.
+ *
+ * This function is called from make_rel_from_joinlist() when
+ * enable_goo_join_search is true and the number of relations meets or
+ * exceeds geqo_threshold.
+ *
+ * Returns the final RelOptInfo representing the join of all base relations,
+ * or errors out if no valid join order can be found.
+ */
+RelOptInfo *
+goo_join_search(PlannerInfo *root, int levels_needed,
+ List *initial_rels)
+{
+ GooState *state;
+ RelOptInfo *result;
+ int base_rel_count;
+ struct HTAB *base_hash;
+
+ /* Initialize search state and memory contexts */
+ state = goo_init_state(root, initial_rels);
+
+ /*
+ * Save initial state of join_rel_list and join_rel_hash so we can restore
+ * them if the search fails.
+ */
+ base_rel_count = list_length(root->join_rel_list);
+ base_hash = root->join_rel_hash;
+
+ /* Run the main greedy search loop */
+ result = goo_search_internal(state);
+
+ if (result == NULL)
+ {
+ /* Restore planner state before reporting error */
+ root->join_rel_list = list_truncate(root->join_rel_list, base_rel_count);
+ root->join_rel_hash = base_hash;
+ elog(ERROR, "GOO join search failed to find a valid join order");
+ }
+
+ goo_destroy_state(state);
+ return result;
+}
+
+/*
+ * goo_init_state
+ * Initialize per-search state and memory contexts.
+ *
+ * Creates the GooState structure and three memory contexts with different
+ * lifetimes:
+ *
+ * - goo_cxt: Lives for the entire search, holds the clumps list and state.
+ * - cand_cxt: Reset after each iteration, holds candidate structures during
+ * the comparison phase.
+ * - scratch_cxt: Reset after each candidate evaluation, holds speculative
+ * joinrels that are discarded before committing to a choice.
+ *
+ * The three-context design prevents memory bloat during large join searches
+ * where we may evaluate hundreds or thousands of candidate joins.
+ */
+static GooState *
+goo_init_state(PlannerInfo *root, List *initial_rels)
+{
+ MemoryContext oldcxt;
+ GooState *state;
+
+ oldcxt = MemoryContextSwitchTo(root->planner_cxt);
+
+ state = palloc(sizeof(GooState));
+ state->root = root;
+ state->clumps = NIL;
+ state->prune_cartesian = false;
+
+ /* Create the three-level memory context hierarchy */
+ state->goo_cxt = AllocSetContextCreate(root->planner_cxt, "GOOStateContext",
+ ALLOCSET_DEFAULT_SIZES);
+ state->cand_cxt = AllocSetContextCreate(state->goo_cxt, "GOOCandidateContext",
+ ALLOCSET_SMALL_SIZES);
+ state->scratch_cxt = AllocSetContextCreate(
+ state->goo_cxt, "GOOScratchContext", ALLOCSET_SMALL_SIZES);
+
+ /*
+ * Copy the initial_rels list into goo_cxt. This becomes our working
+ * clumps list that we'll modify throughout the search.
+ */
+ MemoryContextSwitchTo(state->goo_cxt);
+ state->clumps = list_copy(initial_rels);
+
+ MemoryContextSwitchTo(oldcxt);
+
+ return state;
+}
+
+/*
+ * goo_destroy_state
+ * Free all memory allocated for the GOO search.
+ *
+ * Deletes the goo_cxt memory context (which recursively deletes cand_cxt
+ * and scratch_cxt as children) and then frees the state structure itself.
+ * This is called after the search completes successfully or fails.
+ */
+static void
+goo_destroy_state(GooState * state)
+{
+ MemoryContextDelete(state->goo_cxt);
+ pfree(state);
+}
+
+/*
+ * goo_search_internal
+ * Main greedy search loop.
+ *
+ * Implements a two-pass algorithm at each iteration:
+ *
+ * Pass 1: Evaluate only clause-connected pairs (joins with a clause or join
+ * order restriction).
+ *
+ * Pass 2: If no legal clause-connected pair exists, allow Cartesian products
+ * to guarantee progress.
+ *
+ * After selecting the best candidate, we permanently create its joinrel in
+ * planner_cxt and replace the two input clumps with this new joinrel. This
+ * continues until only one clump remains.
+ *
+ * The function runs primarily in goo_cxt, temporarily switching to planner_cxt
+ * when creating permanent joinrels and to scratch_cxt when evaluating
+ * speculative candidates.
+ *
+ * Returns the final joinrel spanning all base relations, or NULL on failure.
+ */
+static RelOptInfo *
+goo_search_internal(GooState * state)
+{
+ RelOptInfo *final_rel = NULL;
+ MemoryContext oldcxt;
+
+ /*
+ * Switch to goo_cxt for the entire search process. This ensures that all
+ * operations on state->clumps and related structures happen in the
+ * correct memory context.
+ */
+ oldcxt = MemoryContextSwitchTo(state->goo_cxt);
+
+ while (list_length(state->clumps) > 1)
+ {
+ ListCell *lc1;
+ GooCandidate *best_candidate = NULL;
+
+ /* Allow query cancellation during long join searches */
+ CHECK_FOR_INTERRUPTS();
+
+ for (int pass = 0; pass < 2; pass++)
+ {
+ bool prune_cartesian = (pass == 0);
+
+ /* Reset candidate context for this pass */
+ MemoryContextReset(state->cand_cxt);
+ state->prune_cartesian = prune_cartesian;
+ best_candidate = NULL;
+
+ /*
+ * Evaluate all viable candidate pairs and select the best.
+ *
+ * For each pair that passes the pruning check, we do a full
+ * speculative evaluation using make_join_rel() to get accurate
+ * costs. The candidate with the best cost (according to
+ * goo_candidate_better) is remembered and will be committed after
+ * this pass.
+ *
+ * TODO: Consider caching cheap legality/connectivity or cost
+ * estimates across iterations; keep it simple for now.
+ */
+ for (lc1 = list_head(state->clumps); lc1 != NULL;
+ lc1 = lnext(state->clumps, lc1))
+ {
+ RelOptInfo *left = lfirst_node(RelOptInfo, lc1);
+ ListCell *lc2 = lnext(state->clumps, lc1);
+
+ for (; lc2 != NULL; lc2 = lnext(state->clumps, lc2))
+ {
+ RelOptInfo *right = lfirst_node(RelOptInfo, lc2);
+ GooCandidate *cand;
+
+ cand = goo_build_candidate(state, left, right);
+ if (cand == NULL)
+ continue;
+
+ /* Track the best candidate seen so far */
+ if (best_candidate == NULL ||
+ goo_candidate_better(cand, best_candidate))
+ best_candidate = cand;
+ }
+ }
+
+ if (best_candidate != NULL || !prune_cartesian)
+ break;
+ }
+
+ /* No legal join candidate found for this iteration. */
+ if (best_candidate == NULL)
+ {
+ MemoryContextSwitchTo(oldcxt);
+ return NULL;
+ }
+
+ /*
+ * Commit the best candidate: create the joinrel permanently and
+ * update the clumps list.
+ */
+ final_rel = goo_commit_join(state, best_candidate);
+
+ if (final_rel == NULL)
+ elog(ERROR, "GOO join search failed to commit join");
+ }
+
+ /* Switch back to the original context before returning */
+ MemoryContextSwitchTo(oldcxt);
+
+ return final_rel;
+}
+
+/*
+ * goo_candidate_prunable
+ * Determine whether a candidate pair should be skipped.
+ *
+ * We use a two-level pruning strategy:
+ *
+ * 1. Pairs with join clauses or join-order restrictions are never prunable.
+ * These represent natural joins or required join orders (e.g., from outer
+ * joins or LATERAL references).
+ *
+ * 2. If prune_cartesian is true, we prune Cartesian products to avoid
+ * evaluating expensive cross joins when better options are available.
+ *
+ * If no legal clause-connected pairs exist in the current iteration,
+ * goo_search_internal() will retry with prune_cartesian disabled.
+ *
+ * Returns true if the pair should be pruned (skipped), false otherwise.
+ */
+static bool
+goo_candidate_prunable(GooState * state, RelOptInfo *left,
+ RelOptInfo *right)
+{
+ PlannerInfo *root = state->root;
+ bool has_clause = have_relevant_joinclause(root, left, right);
+ bool has_restriction = have_join_order_restriction(root, left, right);
+
+ if (has_clause || has_restriction)
+ return false; /* never prune clause-connected joins */
+
+ return state->prune_cartesian;
+}
+
+/*
+ * goo_build_candidate
+ * Evaluate a potential join between two clumps and return a candidate.
+ *
+ * This function performs a speculative join evaluation to extract greedy metrics
+ * without permanently creating the joinrel. The process is:
+ *
+ * 1. Check basic viability (pruning, overlapping relids).
+ * 2. Switch to scratch_cxt and create the joinrel using make_join_rel().
+ * 3. Generate paths (including partitionwise and parallel variants).
+ * 4. Extract the greedy metrics from the join relation.
+ * 5. Discard the joinrel by calling goo_reset_probe_state().
+ * 6. Create a lightweight GooCandidate in cand_cxt with the extracted metrics.
+ *
+ * This evaluate-and-discard pattern prevents memory bloat when evaluating
+ * many candidates. The winning candidate will be rebuilt permanently later
+ * by goo_commit_join().
+ *
+ * Returns a GooCandidate structure, or NULL if the join is illegal or
+ * overlapping. Assumes the caller is in goo_cxt.
+ */
+static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
+ RelOptInfo *right)
+{
+ PlannerInfo *root = state->root;
+ MemoryContext oldcxt;
+ int saved_rel_len;
+ struct HTAB *saved_hash;
+ RelOptInfo *joinrel;
+ Cost total_cost;
+ GooCandidate *cand;
+ bool is_top_rel;
+
+ /* Skip if this pair should be pruned */
+ if (goo_candidate_prunable(state, left, right))
+ return NULL;
+
+ /* Sanity check: ensure the clumps don't overlap */
+ if (bms_overlap(left->relids, right->relids))
+ return NULL;
+
+ /*
+ * Save state before speculative join evaluation. We'll create the joinrel
+ * in scratch_cxt and then discard it.
+ */
+ saved_rel_len = list_length(root->join_rel_list);
+ saved_hash = root->join_rel_hash;
+
+ /* Switch to scratch_cxt for speculative joinrel creation */
+ oldcxt = MemoryContextSwitchTo(state->scratch_cxt);
+
+ /*
+ * Create the joinrel and generate all its paths.
+ *
+ * TODO: This is the most expensive part of GOO. Each candidate evaluation
+ * performs full path generation via make_join_rel().
+ */
+ joinrel = make_join_rel(root, left, right);
+
+ if (joinrel == NULL)
+ {
+ /* Invalid or illegal join, clean up and return NULL */
+ MemoryContextSwitchTo(oldcxt);
+ goo_reset_probe_state(state, saved_rel_len, saved_hash);
+ return NULL;
+ }
+
+ is_top_rel = bms_equal(joinrel->relids, root->all_query_rels);
+
+ generate_partitionwise_join_paths(root, joinrel);
+ if (!is_top_rel)
+ generate_useful_gather_paths(root, joinrel, false);
+ set_cheapest(joinrel);
+
+ if (joinrel->grouped_rel != NULL && !is_top_rel)
+ {
+ RelOptInfo *grouped_rel = joinrel->grouped_rel;
+
+ Assert(IS_GROUPED_REL(grouped_rel));
+
+ generate_grouped_paths(root, grouped_rel, joinrel);
+ set_cheapest(grouped_rel);
+ }
+
+ total_cost = joinrel->cheapest_total_path->total_cost;
+
+ /*
+ * Switch back to goo_cxt and discard the speculative joinrel.
+ * goo_reset_probe_state() will clean up join_rel_list, join_rel_hash, and
+ * reset scratch_cxt to free all the joinrel's memory.
+ */
+ MemoryContextSwitchTo(oldcxt);
+ goo_reset_probe_state(state, saved_rel_len, saved_hash);
+
+ /*
+ * Now create the candidate structure in cand_cxt. This will survive until
+ * the end of this iteration (when cand_cxt is reset).
+ */
+ oldcxt = MemoryContextSwitchTo(state->cand_cxt);
+ cand = palloc(sizeof(GooCandidate));
+ cand->left = left;
+ cand->right = right;
+ cand->total_cost = total_cost;
+ cand->joinrelids = bms_union(left->relids, right->relids);
+ MemoryContextSwitchTo(oldcxt);
+
+ return cand;
+}
+
+/*
+ * goo_reset_probe_state
+ * Clean up after a speculative joinrel evaluation.
+ *
+ * Reverts the planner's join_rel_list and join_rel_hash to their saved state,
+ * removing any joinrels that were created during speculative evaluation.
+ * Also resets scratch_cxt to free all memory used by the discarded joinrel
+ * and its paths.
+ *
+ * This function is called after extracting cost metrics from a speculative
+ * joinrel that we don't want to keep.
+ */
+static void
+goo_reset_probe_state(GooState * state, int saved_rel_len,
+ struct HTAB *saved_hash)
+{
+ PlannerInfo *root = state->root;
+ int cur_rel_len;
+
+ cur_rel_len = list_length(root->join_rel_list);
+
+ /* Remove hashtable entries created by this probe before resetting memory. */
+ if (saved_hash != NULL && cur_rel_len > saved_rel_len)
+ {
+ for (int i = saved_rel_len; i < cur_rel_len; i++)
+ {
+ RelOptInfo *joinrel = list_nth_node(RelOptInfo,
+ root->join_rel_list, i);
+ bool found;
+
+ (void) hash_search(saved_hash, &(joinrel->relids),
+ HASH_REMOVE, &found);
+ Assert(found);
+ }
+ }
+
+ /* Remove speculative joinrels from the planner's lists */
+ root->join_rel_list = list_truncate(root->join_rel_list, saved_rel_len);
+ root->join_rel_hash = saved_hash;
+
+ /* Free all memory used during speculative evaluation */
+ MemoryContextReset(state->scratch_cxt);
+}
+
+/*
+ * goo_commit_join
+ * Permanently create the chosen join and update the clumps list.
+ *
+ * After selecting the best candidate in an iteration, we need to permanently
+ * create its joinrel (with all paths) and integrate it into the planner state.
+ * This function:
+ *
+ * 1. Switches to planner_cxt and creates the joinrel using make_join_rel().
+ * Unlike the speculative evaluation, this joinrel is kept permanently.
+ * 2. Generates partitionwise and parallel path variants.
+ * 3. Determines the cheapest paths.
+ * 4. Updates state->clumps by removing the two input clumps and adding the
+ * new joinrel as a single clump.
+ *
+ * The next iteration will treat this joinrel as an atomic unit that can be
+ * joined with other remaining clumps.
+ *
+ * Returns the newly created joinrel. Assumes the caller is in goo_cxt.
+ */
+static RelOptInfo *
+goo_commit_join(GooState * state, GooCandidate * cand)
+{
+ MemoryContext oldcxt;
+ PlannerInfo *root = state->root;
+ RelOptInfo *joinrel;
+ bool is_top_rel;
+
+ /*
+ * Create the joinrel permanently in planner_cxt. Unlike the speculative
+ * evaluation in goo_build_candidate(), this joinrel will be kept and
+ * added to root->join_rel_list for use by the rest of the planner.
+ */
+ oldcxt = MemoryContextSwitchTo(root->planner_cxt);
+
+ joinrel = make_join_rel(root, cand->left, cand->right);
+ if (joinrel == NULL)
+ {
+ MemoryContextSwitchTo(oldcxt);
+ elog(ERROR, "GOO join search failed to create join relation");
+ }
+
+ /* Generate additional path variants, just like standard_join_search() */
+ is_top_rel = bms_equal(joinrel->relids, root->all_query_rels);
+
+ generate_partitionwise_join_paths(root, joinrel);
+ if (!is_top_rel)
+ generate_useful_gather_paths(root, joinrel, false);
+ set_cheapest(joinrel);
+
+ if (joinrel->grouped_rel != NULL && !is_top_rel)
+ {
+ RelOptInfo *grouped_rel = joinrel->grouped_rel;
+
+ Assert(IS_GROUPED_REL(grouped_rel));
+
+ generate_grouped_paths(root, grouped_rel, joinrel);
+ set_cheapest(grouped_rel);
+ }
+
+ /*
+ * Switch back to goo_cxt and update the clumps list. Remove the two input
+ * clumps and add the new joinrel as a single clump.
+ */
+ MemoryContextSwitchTo(oldcxt);
+
+ state->clumps = list_delete_ptr(state->clumps, cand->left);
+ state->clumps = list_delete_ptr(state->clumps, cand->right);
+ state->clumps = lappend(state->clumps, joinrel);
+
+ return joinrel;
+}
+
+/*
+ * goo_candidate_better
+ * Compare two join candidates and determine which is better.
+ *
+ * Returns true if candidate 'a' should be preferred over candidate 'b'.
+ */
+static bool
+goo_candidate_better(GooCandidate * a, GooCandidate * b)
+{
+ if (a->total_cost < b->total_cost)
+ return true;
+ if (a->total_cost > b->total_cost)
+ return false;
+
+ return bms_compare(a->joinrelids, b->joinrelids) < 0;
+}
diff --git a/src/backend/optimizer/path/meson.build b/src/backend/optimizer/path/meson.build
index 98f3ebd5192..d4e71a23396 100644
--- a/src/backend/optimizer/path/meson.build
+++ b/src/backend/optimizer/path/meson.build
@@ -5,6 +5,7 @@ backend_sources += files(
'clausesel.c',
'costsize.c',
'equivclass.c',
+ 'goo.c',
'indxpath.c',
'joinpath.c',
'joinrels.c',
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 271c033952e..624dee632c3 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -849,6 +849,15 @@
boot_val => 'true',
},
+/* WIP: for now keep this in QUERY_TUNING_GEQO group for testing convenience */
+{ name => 'enable_goo_join_search', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_GEQO',
+ short_desc => 'Enables the planner\'s use of GOO join search for large join problems.',
+ long_desc => 'Greedy Operator Ordering (GOO) is a deterministic join search algorithm for queries with many relations.',
+ flags => 'GUC_EXPLAIN',
+ variable => 'enable_goo_join_search',
+ boot_val => 'false',
+},
+
{ name => 'enable_group_by_reordering', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
short_desc => 'Enables reordering of GROUP BY keys.',
flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index f938cc65a3a..eab1021665a 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -461,6 +461,7 @@
# - Genetic Query Optimizer -
#geqo = on
+#enable_goo_join_search = off
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
diff --git a/src/include/optimizer/goo.h b/src/include/optimizer/goo.h
new file mode 100644
index 00000000000..0080dfa2ac8
--- /dev/null
+++ b/src/include/optimizer/goo.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * goo.h
+ * prototype for the greedy operator ordering join search
+ *
+ * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/include/optimizer/goo.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef GOO_H
+#define GOO_H
+
+#include "nodes/pathnodes.h"
+#include "nodes/pg_list.h"
+
+extern RelOptInfo *goo_join_search(PlannerInfo *root, int levels_needed,
+ List *initial_rels);
+
+#endif /* GOO_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8751ad7381c..734fa68884d 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -16,11 +16,11 @@
#include "nodes/pathnodes.h"
-
/*
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_goo_join_search;
extern PGDLLIMPORT bool enable_eager_aggregate;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT double min_eager_agg_group_size;
diff --git a/src/test/regress/expected/goo.out b/src/test/regress/expected/goo.out
new file mode 100644
index 00000000000..3a935e626da
--- /dev/null
+++ b/src/test/regress/expected/goo.out
@@ -0,0 +1,700 @@
+--
+-- GOO (Greedy Operator Ordering) Join Search Tests
+--
+-- This test suite validates the GOO join ordering algorithm and verifies
+-- correct behavior for various query patterns.
+--
+-- Create test tables with various sizes and join patterns
+CREATE TEMP TABLE t1 (a int, b int);
+CREATE TEMP TABLE t2 (a int, c int);
+CREATE TEMP TABLE t3 (b int, d int);
+CREATE TEMP TABLE t4 (c int, e int);
+CREATE TEMP TABLE t5 (d int, f int);
+CREATE TEMP TABLE t6 (e int, g int);
+CREATE TEMP TABLE t7 (f int, h int);
+CREATE TEMP TABLE t8 (g int, i int);
+CREATE TEMP TABLE t9 (h int, j int);
+CREATE TEMP TABLE t10 (i int, k int);
+CREATE TEMP TABLE t11 (j int, l int);
+CREATE TEMP TABLE t12 (k int, m int);
+CREATE TEMP TABLE t13 (l int, n int);
+CREATE TEMP TABLE t14 (m int, o int);
+CREATE TEMP TABLE t15 (n int, p int);
+CREATE TEMP TABLE t16 (o int, q int);
+CREATE TEMP TABLE t17 (p int, r int);
+CREATE TEMP TABLE t18 (q int, s int);
+-- Populate with small amount of data
+INSERT INTO t1 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t2 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t3 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t4 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t5 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t6 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t7 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t8 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t9 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t10 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t11 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t12 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t13 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t14 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t15 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t16 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t17 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t18 SELECT i, i FROM generate_series(1,10) i;
+ANALYZE;
+--
+-- Basic 3-way join (sanity check)
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 2;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM (VALUES (1),(2)) AS a(x)
+JOIN (VALUES (1),(2)) AS b(x) USING (x)
+JOIN (VALUES (1),(3)) AS c(x) USING (x);
+ QUERY PLAN
+----------------------------------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: ("*VALUES*".column1 = "*VALUES*_2".column1)
+ -> Hash Join
+ Hash Cond: ("*VALUES*".column1 = "*VALUES*_1".column1)
+ -> Values Scan on "*VALUES*"
+ -> Hash
+ -> Values Scan on "*VALUES*_1"
+ -> Hash
+ -> Values Scan on "*VALUES*_2"
+(10 rows)
+
+SELECT count(*)
+FROM (VALUES (1),(2)) AS a(x)
+JOIN (VALUES (1),(2)) AS b(x) USING (x)
+JOIN (VALUES (1),(3)) AS c(x) USING (x);
+ count
+-------
+ 1
+(1 row)
+
+--
+-- Disconnected graph (Cartesian products required)
+--
+-- This tests GOO's ability to handle queries where some relations
+-- have no join clauses connecting them. GOO should allow Cartesian
+-- products when no clause-connected joins are available.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2, t5
+WHERE t1.a = 1 AND t2.c = 2 AND t5.f = 3;
+ QUERY PLAN
+-------------------------------------
+ Aggregate
+ -> Nested Loop
+ -> Seq Scan on t5
+ Filter: (f = 3)
+ -> Nested Loop
+ -> Seq Scan on t1
+ Filter: (a = 1)
+ -> Seq Scan on t2
+ Filter: (c = 2)
+(9 rows)
+
+SELECT count(*)
+FROM t1, t2, t5
+WHERE t1.a = 1 AND t2.c = 2 AND t5.f = 3;
+ count
+-------
+ 1
+(1 row)
+
+--
+-- Star schema (fact table with multiple dimension tables)
+--
+-- Test GOO with a typical star schema join pattern.
+--
+CREATE TEMP TABLE fact (id int, dim1_id int, dim2_id int, dim3_id int, dim4_id int, value int);
+CREATE TEMP TABLE dim1 (id int, name text);
+CREATE TEMP TABLE dim2 (id int, name text);
+CREATE TEMP TABLE dim3 (id int, name text);
+CREATE TEMP TABLE dim4 (id int, name text);
+INSERT INTO fact SELECT i, i, i, i, i, i FROM generate_series(1,100) i;
+INSERT INTO dim1 SELECT i, 'dim1_'||i FROM generate_series(1,10) i;
+INSERT INTO dim2 SELECT i, 'dim2_'||i FROM generate_series(1,10) i;
+INSERT INTO dim3 SELECT i, 'dim3_'||i FROM generate_series(1,10) i;
+INSERT INTO dim4 SELECT i, 'dim4_'||i FROM generate_series(1,10) i;
+ANALYZE;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM fact
+JOIN dim1 ON fact.dim1_id = dim1.id
+JOIN dim2 ON fact.dim2_id = dim2.id
+JOIN dim3 ON fact.dim3_id = dim3.id
+JOIN dim4 ON fact.dim4_id = dim4.id
+WHERE dim1.id < 5;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Aggregate
+ -> Nested Loop
+ Join Filter: (fact.dim4_id = dim4.id)
+ -> Hash Join
+ Hash Cond: (dim3.id = fact.dim3_id)
+ -> Seq Scan on dim3
+ -> Hash
+ -> Hash Join
+ Hash Cond: (fact.dim1_id = dim1.id)
+ -> Hash Join
+ Hash Cond: (fact.dim2_id = dim2.id)
+ -> Seq Scan on fact
+ -> Hash
+ -> Seq Scan on dim2
+ -> Hash
+ -> Seq Scan on dim1
+ Filter: (id < 5)
+ -> Seq Scan on dim4
+(18 rows)
+
+--
+-- Long join chain
+--
+-- Tests GOO with a large join involving 15 relations.
+--
+SET geqo_threshold = 8;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n;
+ QUERY PLAN
+----------------------------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (t7.h = t9.h)
+ -> Hash Join
+ Hash Cond: (t8.i = t10.i)
+ -> Hash Join
+ Hash Cond: (t2.c = t4.c)
+ -> Hash Join
+ Hash Cond: (t3.b = t1.b)
+ -> Hash Join
+ Hash Cond: (t5.f = t7.f)
+ -> Hash Join
+ Hash Cond: (t3.d = t5.d)
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t5
+ -> Hash
+ -> Seq Scan on t7
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t6.g = t8.g)
+ -> Hash Join
+ Hash Cond: (t4.e = t6.e)
+ -> Seq Scan on t4
+ -> Hash
+ -> Seq Scan on t6
+ -> Hash
+ -> Seq Scan on t8
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t12.m = t14.m)
+ -> Hash Join
+ Hash Cond: (t10.k = t12.k)
+ -> Seq Scan on t10
+ -> Hash
+ -> Seq Scan on t12
+ -> Hash
+ -> Seq Scan on t14
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t11.l = t13.l)
+ -> Hash Join
+ Hash Cond: (t9.j = t11.j)
+ -> Seq Scan on t9
+ -> Hash
+ -> Seq Scan on t11
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t13.n = t15.n)
+ -> Seq Scan on t13
+ -> Hash
+ -> Seq Scan on t15
+(58 rows)
+
+-- Execute to verify correctness
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n;
+ count
+-------
+ 10
+(1 row)
+
+--
+-- Bushy tree support
+--
+-- Verify that GOO can produce bushy join trees, not just left-deep or right-deep.
+-- With appropriate cost model, GOO should join (t1,t2) and (t3,t4) first,
+-- then join those results (bushy tree).
+--
+SET geqo_threshold = 4;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2, t3, t4
+WHERE t1.a = t2.a
+ AND t3.b = t4.c
+ AND t1.a = t3.b;
+ QUERY PLAN
+----------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (t1.a = t3.b)
+ -> Hash Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Hash Join
+ Hash Cond: (t3.b = t4.c)
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t4
+(14 rows)
+
+--
+-- Compare GOO vs standard join search
+--
+-- Run the same query with GOO and standard join search to verify both
+-- produce valid plans. Results should be identical even if plans differ.
+--
+SET enable_goo_join_search = on;
+PREPARE goo_plan AS
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e;
+EXECUTE goo_plan;
+ count
+-------
+ 10
+(1 row)
+
+SET enable_goo_join_search = off;
+SET geqo_threshold = default;
+PREPARE standard_plan AS
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e;
+EXECUTE standard_plan;
+ count
+-------
+ 10
+(1 row)
+
+-- Results should match
+EXECUTE goo_plan;
+ count
+-------
+ 10
+(1 row)
+
+EXECUTE standard_plan;
+ count
+-------
+ 10
+(1 row)
+
+--
+-- Large join (18 relations)
+--
+-- Test GOO with a large number of relations.
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 10;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n
+JOIN t16 ON t14.o = t16.o
+JOIN t17 ON t15.p = t17.p
+JOIN t18 ON t16.q = t18.q;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (t16.q = t18.q)
+ -> Hash Join
+ Hash Cond: (t15.p = t17.p)
+ -> Hash Join
+ Hash Cond: (t14.o = t16.o)
+ -> Hash Join
+ Hash Cond: (t13.n = t15.n)
+ -> Hash Join
+ Hash Cond: (t12.m = t14.m)
+ -> Hash Join
+ Hash Cond: (t11.l = t13.l)
+ -> Hash Join
+ Hash Cond: (t10.k = t12.k)
+ -> Hash Join
+ Hash Cond: (t9.j = t11.j)
+ -> Hash Join
+ Hash Cond: (t8.i = t10.i)
+ -> Hash Join
+ Hash Cond: (t7.h = t9.h)
+ -> Hash Join
+ Hash Cond: (t6.g = t8.g)
+ -> Hash Join
+ Hash Cond: (t5.f = t7.f)
+ -> Hash Join
+ Hash Cond: (t4.e = t6.e)
+ -> Hash Join
+ Hash Cond: (t3.d = t5.d)
+ -> Hash Join
+ Hash Cond: (t2.c = t4.c)
+ -> Hash Join
+ Hash Cond: (t1.b = t3.b)
+ -> Hash Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t4
+ -> Hash
+ -> Seq Scan on t5
+ -> Hash
+ -> Seq Scan on t6
+ -> Hash
+ -> Seq Scan on t7
+ -> Hash
+ -> Seq Scan on t8
+ -> Hash
+ -> Seq Scan on t9
+ -> Hash
+ -> Seq Scan on t10
+ -> Hash
+ -> Seq Scan on t11
+ -> Hash
+ -> Seq Scan on t12
+ -> Hash
+ -> Seq Scan on t13
+ -> Hash
+ -> Seq Scan on t14
+ -> Hash
+ -> Seq Scan on t15
+ -> Hash
+ -> Seq Scan on t16
+ -> Hash
+ -> Seq Scan on t17
+ -> Hash
+ -> Seq Scan on t18
+(70 rows)
+
+--
+-- Mixed connected and disconnected components
+--
+-- Query with two connected components that need a Cartesian product between them.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a,
+ t5 JOIN t6 ON t5.f = t6.e
+WHERE t1.a < 5 AND t5.d < 3;
+ QUERY PLAN
+-------------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (t2.a = t1.a)
+ -> Nested Loop
+ -> Hash Join
+ Hash Cond: (t6.e = t5.f)
+ -> Seq Scan on t6
+ -> Hash
+ -> Seq Scan on t5
+ Filter: (d < 3)
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t1
+ Filter: (a < 5)
+(14 rows)
+
+--
+-- Outer joins
+--
+-- Verify GOO handles outer joins correctly (respects join order restrictions)
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+LEFT JOIN t2 ON t1.a = t2.a
+LEFT JOIN t3 ON t2.a = t3.b
+LEFT JOIN t4 ON t3.d = t4.c;
+ QUERY PLAN
+----------------------------------------------
+ Aggregate
+ -> Hash Left Join
+ Hash Cond: (t3.d = t4.c)
+ -> Hash Left Join
+ Hash Cond: (t2.a = t3.b)
+ -> Hash Left Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t4
+(14 rows)
+
+--
+-- Complete Cartesian products (disconnected graphs)
+--
+-- Test GOO's ability to handle queries with no join clauses at all.
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 2;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2;
+ QUERY PLAN
+----------------------------------
+ Aggregate
+ -> Nested Loop
+ -> Seq Scan on t1
+ -> Materialize
+ -> Seq Scan on t2
+(5 rows)
+
+SELECT count(*)
+FROM t1, t2;
+ count
+-------
+ 100
+(1 row)
+
+--
+-- Join order restrictions with FULL OUTER JOIN
+--
+-- FULL OUTER JOIN creates strong ordering constraints that GOO must respect
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+FULL OUTER JOIN t2 ON t1.a = t2.a
+FULL OUTER JOIN t3 ON t2.a = t3.b;
+ QUERY PLAN
+----------------------------------------
+ Aggregate
+ -> Hash Full Join
+ Hash Cond: (t2.a = t3.b)
+ -> Hash Full Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t3
+(10 rows)
+
+--
+-- Self-join handling
+--
+-- Test GOO with the same table appearing multiple times. GOO must correctly
+-- handle self-joins that were not removed by Self-Join Elimination.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1 a
+JOIN t1 b ON a.a = b.a
+JOIN t2 c ON b.b = c.c;
+ QUERY PLAN
+------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (b.b = c.c)
+ -> Hash Join
+ Hash Cond: (a.a = b.a)
+ -> Seq Scan on t1 a
+ -> Hash
+ -> Seq Scan on t1 b
+ -> Hash
+ -> Seq Scan on t2 c
+(10 rows)
+
+--
+-- Complex bushy tree pattern
+--
+-- Create a query that naturally leads to bushy tree: multiple independent
+-- join chains that need to be combined
+--
+CREATE TEMP TABLE chain1a (id int, val int);
+CREATE TEMP TABLE chain1b (id int, val int);
+CREATE TEMP TABLE chain1c (id int, val int);
+CREATE TEMP TABLE chain2a (id int, val int);
+CREATE TEMP TABLE chain2b (id int, val int);
+CREATE TEMP TABLE chain2c (id int, val int);
+INSERT INTO chain1a SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain1b SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain1c SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2a SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2b SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2c SELECT i, i FROM generate_series(1,100) i;
+ANALYZE;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM chain1a
+JOIN chain1b ON chain1a.id = chain1b.id
+JOIN chain1c ON chain1b.val = chain1c.id
+JOIN chain2a ON chain1a.val = chain2a.id -- Cross-chain join
+JOIN chain2b ON chain2a.val = chain2b.id
+JOIN chain2c ON chain2b.val = chain2c.id;
+ QUERY PLAN
+-----------------------------------------------------------------
+ Aggregate
+ -> Hash Join
+ Hash Cond: (chain1a.val = chain2a.id)
+ -> Hash Join
+ Hash Cond: (chain1b.val = chain1c.id)
+ -> Hash Join
+ Hash Cond: (chain1a.id = chain1b.id)
+ -> Seq Scan on chain1a
+ -> Hash
+ -> Seq Scan on chain1b
+ -> Hash
+ -> Seq Scan on chain1c
+ -> Hash
+ -> Hash Join
+ Hash Cond: (chain2b.val = chain2c.id)
+ -> Hash Join
+ Hash Cond: (chain2a.val = chain2b.id)
+ -> Seq Scan on chain2a
+ -> Hash
+ -> Seq Scan on chain2b
+ -> Hash
+ -> Seq Scan on chain2c
+(22 rows)
+
+--
+-- Eager aggregation with GOO join search
+-- Ensure grouped_rel handling when eager aggregation is enabled.
+--
+SET enable_eager_aggregate = on;
+SET min_eager_agg_group_size = 0;
+CREATE TEMP TABLE center_tbl (id int PRIMARY KEY);
+CREATE TEMP TABLE arm1_tbl (center_id int, payload int);
+CREATE TEMP TABLE arm2_tbl (center_id int, payload int);
+INSERT INTO center_tbl SELECT i FROM generate_series(1, 10) i;
+INSERT INTO arm1_tbl SELECT i%10 + 1, i FROM generate_series(1, 1000) i;
+INSERT INTO arm2_tbl SELECT i%10 + 1, i FROM generate_series(1, 1000) i;
+ANALYZE center_tbl;
+ANALYZE arm1_tbl;
+ANALYZE arm2_tbl;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT c.id, count(*)
+FROM center_tbl c
+JOIN arm1_tbl a1 ON c.id = a1.center_id
+JOIN arm2_tbl a2 ON c.id = a2.center_id
+GROUP BY c.id;
+ QUERY PLAN
+----------------------------------------------------------
+ HashAggregate
+ Output: c.id, count(*)
+ Group Key: c.id
+ -> Hash Join
+ Output: c.id
+ Hash Cond: (c.id = a2.center_id)
+ -> Hash Join
+ Output: c.id, a1.center_id
+ Inner Unique: true
+ Hash Cond: (a1.center_id = c.id)
+ -> Seq Scan on pg_temp.arm1_tbl a1
+ Output: a1.center_id, a1.payload
+ -> Hash
+ Output: c.id
+ -> Seq Scan on pg_temp.center_tbl c
+ Output: c.id
+ -> Hash
+ Output: a2.center_id
+ -> Seq Scan on pg_temp.arm2_tbl a2
+ Output: a2.center_id
+(20 rows)
+
+SELECT c.id, count(*)
+FROM center_tbl c
+JOIN arm1_tbl a1 ON c.id = a1.center_id
+JOIN arm2_tbl a2 ON c.id = a2.center_id
+GROUP BY c.id;
+ id | count
+----+-------
+ 8 | 10000
+ 10 | 10000
+ 9 | 10000
+ 7 | 10000
+ 1 | 10000
+ 5 | 10000
+ 4 | 10000
+ 2 | 10000
+ 6 | 10000
+ 3 | 10000
+(10 rows)
+
+RESET min_eager_agg_group_size;
+RESET enable_eager_aggregate;
+-- Cleanup
+DEALLOCATE goo_plan;
+DEALLOCATE standard_plan;
+RESET geqo_threshold;
+RESET enable_goo_join_search;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 3dd63fd88ed..a823d16d793 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -153,6 +153,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_distinct_reordering | on
enable_eager_aggregate | on
enable_gathermerge | on
+ enable_goo_join_search | off
enable_group_by_reordering | on
enable_hashagg | on
enable_hashjoin | on
@@ -173,7 +174,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(25 rows)
+(26 rows)
-- There are always wait event descriptions for various types. InjectionPoint
-- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 549e9b2d7be..f60e309079a 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -68,6 +68,12 @@ test: select_into select_distinct select_distinct_on select_implicit select_havi
# ----------
test: brin gin gist spgist privileges init_privs security_label collate matview lock replica_identity rowsecurity object_address tablesample groupingsets drop_operator password identity generated_stored join_hash
+# ----------
+# Additional JOIN ORDER tests
+# WIP: need to find an appropriate group for this test
+# ----------
+test: goo
+
# ----------
# Additional BRIN tests
# ----------
@@ -98,9 +104,6 @@ test: maintain_every
# no relation related tests can be put in this group
test: publication subscription
-# ----------
-# Another group of parallel tests
-# select_views depends on create_view
# ----------
test: select_views portals_p2 foreign_key cluster dependency guc bitmapops combocid tsearch tsdicts foreign_data window xmlmap functional_deps advisory_lock indirect_toast equivclass stats_rewrite
diff --git a/src/test/regress/sql/goo.sql b/src/test/regress/sql/goo.sql
new file mode 100644
index 00000000000..ab048d8e34e
--- /dev/null
+++ b/src/test/regress/sql/goo.sql
@@ -0,0 +1,364 @@
+--
+-- GOO (Greedy Operator Ordering) Join Search Tests
+--
+-- This test suite validates the GOO join ordering algorithm and verifies
+-- correct behavior for various query patterns.
+--
+
+-- Create test tables with various sizes and join patterns
+CREATE TEMP TABLE t1 (a int, b int);
+CREATE TEMP TABLE t2 (a int, c int);
+CREATE TEMP TABLE t3 (b int, d int);
+CREATE TEMP TABLE t4 (c int, e int);
+CREATE TEMP TABLE t5 (d int, f int);
+CREATE TEMP TABLE t6 (e int, g int);
+CREATE TEMP TABLE t7 (f int, h int);
+CREATE TEMP TABLE t8 (g int, i int);
+CREATE TEMP TABLE t9 (h int, j int);
+CREATE TEMP TABLE t10 (i int, k int);
+CREATE TEMP TABLE t11 (j int, l int);
+CREATE TEMP TABLE t12 (k int, m int);
+CREATE TEMP TABLE t13 (l int, n int);
+CREATE TEMP TABLE t14 (m int, o int);
+CREATE TEMP TABLE t15 (n int, p int);
+CREATE TEMP TABLE t16 (o int, q int);
+CREATE TEMP TABLE t17 (p int, r int);
+CREATE TEMP TABLE t18 (q int, s int);
+
+-- Populate with small amount of data
+INSERT INTO t1 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t2 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t3 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t4 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t5 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t6 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t7 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t8 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t9 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t10 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t11 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t12 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t13 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t14 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t15 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t16 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t17 SELECT i, i FROM generate_series(1,10) i;
+INSERT INTO t18 SELECT i, i FROM generate_series(1,10) i;
+
+ANALYZE;
+
+--
+-- Basic 3-way join (sanity check)
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM (VALUES (1),(2)) AS a(x)
+JOIN (VALUES (1),(2)) AS b(x) USING (x)
+JOIN (VALUES (1),(3)) AS c(x) USING (x);
+
+SELECT count(*)
+FROM (VALUES (1),(2)) AS a(x)
+JOIN (VALUES (1),(2)) AS b(x) USING (x)
+JOIN (VALUES (1),(3)) AS c(x) USING (x);
+
+--
+-- Disconnected graph (Cartesian products required)
+--
+-- This tests GOO's ability to handle queries where some relations
+-- have no join clauses connecting them. GOO should allow Cartesian
+-- products when no clause-connected joins are available.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2, t5
+WHERE t1.a = 1 AND t2.c = 2 AND t5.f = 3;
+
+SELECT count(*)
+FROM t1, t2, t5
+WHERE t1.a = 1 AND t2.c = 2 AND t5.f = 3;
+
+--
+-- Star schema (fact table with multiple dimension tables)
+--
+-- Test GOO with a typical star schema join pattern.
+--
+CREATE TEMP TABLE fact (id int, dim1_id int, dim2_id int, dim3_id int, dim4_id int, value int);
+CREATE TEMP TABLE dim1 (id int, name text);
+CREATE TEMP TABLE dim2 (id int, name text);
+CREATE TEMP TABLE dim3 (id int, name text);
+CREATE TEMP TABLE dim4 (id int, name text);
+
+INSERT INTO fact SELECT i, i, i, i, i, i FROM generate_series(1,100) i;
+INSERT INTO dim1 SELECT i, 'dim1_'||i FROM generate_series(1,10) i;
+INSERT INTO dim2 SELECT i, 'dim2_'||i FROM generate_series(1,10) i;
+INSERT INTO dim3 SELECT i, 'dim3_'||i FROM generate_series(1,10) i;
+INSERT INTO dim4 SELECT i, 'dim4_'||i FROM generate_series(1,10) i;
+
+ANALYZE;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM fact
+JOIN dim1 ON fact.dim1_id = dim1.id
+JOIN dim2 ON fact.dim2_id = dim2.id
+JOIN dim3 ON fact.dim3_id = dim3.id
+JOIN dim4 ON fact.dim4_id = dim4.id
+WHERE dim1.id < 5;
+
+--
+-- Long join chain
+--
+-- Tests GOO with a large join involving 15 relations.
+--
+SET geqo_threshold = 8;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n;
+
+-- Execute to verify correctness
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n;
+
+--
+-- Bushy tree support
+--
+-- Verify that GOO can produce bushy join trees, not just left-deep or right-deep.
+-- With appropriate cost model, GOO should join (t1,t2) and (t3,t4) first,
+-- then join those results (bushy tree).
+--
+SET geqo_threshold = 4;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2, t3, t4
+WHERE t1.a = t2.a
+ AND t3.b = t4.c
+ AND t1.a = t3.b;
+
+--
+-- Compare GOO vs standard join search
+--
+-- Run the same query with GOO and standard join search to verify both
+-- produce valid plans. Results should be identical even if plans differ.
+--
+SET enable_goo_join_search = on;
+PREPARE goo_plan AS
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e;
+
+EXECUTE goo_plan;
+
+SET enable_goo_join_search = off;
+SET geqo_threshold = default;
+PREPARE standard_plan AS
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e;
+
+EXECUTE standard_plan;
+
+-- Results should match
+EXECUTE goo_plan;
+EXECUTE standard_plan;
+
+--
+-- Large join (18 relations)
+--
+-- Test GOO with a large number of relations.
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 10;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+JOIN t2 ON t1.a = t2.a
+JOIN t3 ON t1.b = t3.b
+JOIN t4 ON t2.c = t4.c
+JOIN t5 ON t3.d = t5.d
+JOIN t6 ON t4.e = t6.e
+JOIN t7 ON t5.f = t7.f
+JOIN t8 ON t6.g = t8.g
+JOIN t9 ON t7.h = t9.h
+JOIN t10 ON t8.i = t10.i
+JOIN t11 ON t9.j = t11.j
+JOIN t12 ON t10.k = t12.k
+JOIN t13 ON t11.l = t13.l
+JOIN t14 ON t12.m = t14.m
+JOIN t15 ON t13.n = t15.n
+JOIN t16 ON t14.o = t16.o
+JOIN t17 ON t15.p = t17.p
+JOIN t18 ON t16.q = t18.q;
+
+--
+-- Mixed connected and disconnected components
+--
+-- Query with two connected components that need a Cartesian product between them.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1 JOIN t2 ON t1.a = t2.a,
+ t5 JOIN t6 ON t5.f = t6.e
+WHERE t1.a < 5 AND t5.d < 3;
+
+--
+-- Outer joins
+--
+-- Verify GOO handles outer joins correctly (respects join order restrictions)
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+LEFT JOIN t2 ON t1.a = t2.a
+LEFT JOIN t3 ON t2.a = t3.b
+LEFT JOIN t4 ON t3.d = t4.c;
+
+--
+-- Complete Cartesian products (disconnected graphs)
+--
+-- Test GOO's ability to handle queries with no join clauses at all.
+--
+SET enable_goo_join_search = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1, t2;
+
+SELECT count(*)
+FROM t1, t2;
+
+--
+-- Join order restrictions with FULL OUTER JOIN
+--
+-- FULL OUTER JOIN creates strong ordering constraints that GOO must respect
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1
+FULL OUTER JOIN t2 ON t1.a = t2.a
+FULL OUTER JOIN t3 ON t2.a = t3.b;
+
+--
+-- Self-join handling
+--
+-- Test GOO with the same table appearing multiple times. GOO must correctly
+-- handle self-joins that were not removed by Self-Join Elimination.
+--
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM t1 a
+JOIN t1 b ON a.a = b.a
+JOIN t2 c ON b.b = c.c;
+
+--
+-- Complex bushy tree pattern
+--
+-- Create a query that naturally leads to bushy tree: multiple independent
+-- join chains that need to be combined
+--
+CREATE TEMP TABLE chain1a (id int, val int);
+CREATE TEMP TABLE chain1b (id int, val int);
+CREATE TEMP TABLE chain1c (id int, val int);
+CREATE TEMP TABLE chain2a (id int, val int);
+CREATE TEMP TABLE chain2b (id int, val int);
+CREATE TEMP TABLE chain2c (id int, val int);
+
+INSERT INTO chain1a SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain1b SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain1c SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2a SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2b SELECT i, i FROM generate_series(1,100) i;
+INSERT INTO chain2c SELECT i, i FROM generate_series(1,100) i;
+
+ANALYZE;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM chain1a
+JOIN chain1b ON chain1a.id = chain1b.id
+JOIN chain1c ON chain1b.val = chain1c.id
+JOIN chain2a ON chain1a.val = chain2a.id -- Cross-chain join
+JOIN chain2b ON chain2a.val = chain2b.id
+JOIN chain2c ON chain2b.val = chain2c.id;
+
+--
+-- Eager aggregation with GOO join search
+-- Ensure grouped_rel handling when eager aggregation is enabled.
+--
+SET enable_eager_aggregate = on;
+SET min_eager_agg_group_size = 0;
+
+CREATE TEMP TABLE center_tbl (id int PRIMARY KEY);
+CREATE TEMP TABLE arm1_tbl (center_id int, payload int);
+CREATE TEMP TABLE arm2_tbl (center_id int, payload int);
+
+INSERT INTO center_tbl SELECT i FROM generate_series(1, 10) i;
+INSERT INTO arm1_tbl SELECT i%10 + 1, i FROM generate_series(1, 1000) i;
+INSERT INTO arm2_tbl SELECT i%10 + 1, i FROM generate_series(1, 1000) i;
+
+ANALYZE center_tbl;
+ANALYZE arm1_tbl;
+ANALYZE arm2_tbl;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT c.id, count(*)
+FROM center_tbl c
+JOIN arm1_tbl a1 ON c.id = a1.center_id
+JOIN arm2_tbl a2 ON c.id = a2.center_id
+GROUP BY c.id;
+
+SELECT c.id, count(*)
+FROM center_tbl c
+JOIN arm1_tbl a1 ON c.id = a1.center_id
+JOIN arm2_tbl a2 ON c.id = a2.center_id
+GROUP BY c.id;
+
+RESET min_eager_agg_group_size;
+RESET enable_eager_aggregate;
+
+-- Cleanup
+DEALLOCATE goo_plan;
+DEALLOCATE standard_plan;
+
+RESET geqo_threshold;
+RESET enable_goo_join_search;
--
2.50.1 (Apple Git-155)
[application/octet-stream] v5-0002-add-a-GUC-goo_greedy_strategy-to-choose-GOO-greed.patch (20.0K, 4-v5-0002-add-a-GUC-goo_greedy_strategy-to-choose-GOO-greed.patch)
download | inline diff:
From ae8ddc0a25cbbdead422166a0f6223c235e6e807 Mon Sep 17 00:00:00 2001
From: Chengpeng Yan <[email protected]>
Date: Sat, 14 Feb 2026 11:44:59 +0800
Subject: [PATCH v5 2/2] add a GUC goo_greedy_strategy to choose GOO greedy
strategies
Add goo_greedy_strategy and extend GOO candidate comparison to support
result_size, cost, and selectivity strategies. Also add combined mode,
which runs these strategies and keeps the lowest-cost result.
Signed-off-by: Chengpeng Yan <[email protected]>
---
src/backend/optimizer/path/goo.c | 184 ++++++++++++++++++++--
src/backend/utils/misc/guc_parameters.dat | 10 ++
src/backend/utils/misc/guc_tables.c | 8 +
src/include/optimizer/paths.h | 9 ++
src/test/regress/expected/goo.out | 102 ++++++------
src/test/regress/sql/goo.sql | 4 +
6 files changed, 260 insertions(+), 57 deletions(-)
diff --git a/src/backend/optimizer/path/goo.c b/src/backend/optimizer/path/goo.c
index e49a9f372ef..b7d4a931198 100644
--- a/src/backend/optimizer/path/goo.c
+++ b/src/backend/optimizer/path/goo.c
@@ -55,6 +55,7 @@
* Configuration defaults. These are exposed as GUCs in guc_tables.c.
*/
bool enable_goo_join_search = false;
+int goo_greedy_strategy = GOO_GREEDY_STRATEGY_COMBINED;
/*
* Working state for a single GOO search invocation.
@@ -73,6 +74,7 @@ typedef struct GooState
MemoryContext cand_cxt; /* per-iteration candidate storage */
MemoryContext scratch_cxt; /* per-candidate speculative evaluation */
List *clumps; /* remaining join components (RelOptInfo *) */
+ GooGreedyStrategy strategy; /* candidate comparison heuristic */
/*
* "clumps" are similar to GEQO's concept (see geqo_eval.c): join
@@ -94,11 +96,22 @@ typedef struct GooCandidate
{
RelOptInfo *left; /* left input clump */
RelOptInfo *right; /* right input clump */
+ double result_size; /* estimated result size in bytes */
Cost total_cost; /* total cost of cheapest path */
+ double selectivity; /* join selectivity (output/input rows) */
Relids joinrelids; /* relids covered by this join */
} GooCandidate;
-static GooState * goo_init_state(PlannerInfo *root, List *initial_rels);
+typedef struct GooStrategyResult
+{
+ RelOptInfo *result;
+ Cost total_cost;
+ List *join_rel_list;
+ struct HTAB *join_rel_hash;
+} GooStrategyResult;
+
+static GooState * goo_init_state(PlannerInfo *root, List *initial_rels,
+ GooGreedyStrategy strategy);
static void goo_destroy_state(GooState * state);
static RelOptInfo *goo_search_internal(GooState * state);
static void goo_reset_probe_state(GooState * state, int saved_rel_len,
@@ -106,9 +119,15 @@ static void goo_reset_probe_state(GooState * state, int saved_rel_len,
static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
RelOptInfo *right);
static RelOptInfo *goo_commit_join(GooState * state, GooCandidate * cand);
-static bool goo_candidate_better(GooCandidate * a, GooCandidate * b);
+static bool goo_candidate_better(GooGreedyStrategy strategy,
+ GooCandidate * a, GooCandidate * b);
static bool goo_candidate_prunable(GooState * state, RelOptInfo *left,
RelOptInfo *right);
+static const char *goo_strategy_name(GooGreedyStrategy strategy);
+static GooStrategyResult goo_run_strategy(PlannerInfo *root, List *initial_rels,
+ List *base_join_rel_list,
+ struct HTAB *base_hash,
+ GooGreedyStrategy strategy);
/*
* goo_join_search
@@ -130,8 +149,63 @@ goo_join_search(PlannerInfo *root, int levels_needed,
int base_rel_count;
struct HTAB *base_hash;
+ /* If COMBINED mode, try all strategies and return the better one */
+ if (goo_greedy_strategy == GOO_GREEDY_STRATEGY_COMBINED)
+ {
+ static const GooGreedyStrategy combined_strategies[] = {
+ GOO_GREEDY_STRATEGY_RESULT_SIZE,
+ GOO_GREEDY_STRATEGY_COST,
+ GOO_GREEDY_STRATEGY_SELECTIVITY
+ };
+ GooStrategyResult best_result = {0};
+ GooGreedyStrategy best_strategy = GOO_GREEDY_STRATEGY_COST;
+ List *base_join_rel_list;
+ bool have_best = false;
+
+ base_join_rel_list = root->join_rel_list;
+ base_hash = root->join_rel_hash;
+
+ for (int i = 0; i < lengthof(combined_strategies); i++)
+ {
+ GooGreedyStrategy strategy = combined_strategies[i];
+ GooStrategyResult result;
+
+ result = goo_run_strategy(root, initial_rels,
+ base_join_rel_list, base_hash,
+ strategy);
+
+ if (result.result == NULL)
+ continue;
+
+ if (!have_best || result.total_cost < best_result.total_cost)
+ {
+ best_result = result;
+ best_strategy = strategy;
+ have_best = true;
+ }
+ }
+
+ /*
+ * During development/testing, fail fast when every strategy fails.
+ */
+ if (!have_best)
+ elog(ERROR, "GOO join search failed: all strategies exhausted without a valid join order");
+
+ /*
+ * Pick the lowest-cost result across strategies.
+ */
+ root->join_rel_list = best_result.join_rel_list;
+ root->join_rel_hash = best_result.join_rel_hash;
+
+ elog(DEBUG1, "GOO COMBINED mode: %s strategy chosen (cost: %.2f)",
+ goo_strategy_name(best_strategy), best_result.total_cost);
+
+ return best_result.result;
+ }
+
+ /* Normal single-strategy mode */
/* Initialize search state and memory contexts */
- state = goo_init_state(root, initial_rels);
+ state = goo_init_state(root, initial_rels, goo_greedy_strategy);
/*
* Save initial state of join_rel_list and join_rel_hash so we can restore
@@ -172,7 +246,8 @@ goo_join_search(PlannerInfo *root, int levels_needed,
* where we may evaluate hundreds or thousands of candidate joins.
*/
static GooState *
-goo_init_state(PlannerInfo *root, List *initial_rels)
+goo_init_state(PlannerInfo *root, List *initial_rels,
+ GooGreedyStrategy strategy)
{
MemoryContext oldcxt;
GooState *state;
@@ -183,6 +258,7 @@ goo_init_state(PlannerInfo *root, List *initial_rels)
state->root = root;
state->clumps = NIL;
state->prune_cartesian = false;
+ state->strategy = strategy;
/* Create the three-level memory context hierarchy */
state->goo_cxt = AllocSetContextCreate(root->planner_cxt, "GOOStateContext",
@@ -219,6 +295,42 @@ goo_destroy_state(GooState * state)
pfree(state);
}
+static GooStrategyResult
+goo_run_strategy(PlannerInfo *root, List *initial_rels,
+ List *base_join_rel_list, struct HTAB *base_hash,
+ GooGreedyStrategy strategy)
+{
+ GooStrategyResult result;
+ GooState *state;
+ MemoryContext oldcxt;
+
+ result.result = NULL;
+ result.total_cost = 0;
+ result.join_rel_list = NIL;
+ result.join_rel_hash = NULL;
+
+ oldcxt = MemoryContextSwitchTo(root->planner_cxt);
+ root->join_rel_list = list_copy(base_join_rel_list);
+ root->join_rel_hash = NULL;
+ MemoryContextSwitchTo(oldcxt);
+
+ state = goo_init_state(root, initial_rels, strategy);
+ result.result = goo_search_internal(state);
+
+ if (result.result != NULL)
+ result.total_cost = result.result->cheapest_total_path->total_cost;
+
+ result.join_rel_list = root->join_rel_list;
+ result.join_rel_hash = root->join_rel_hash;
+
+ goo_destroy_state(state);
+
+ root->join_rel_list = base_join_rel_list;
+ root->join_rel_hash = base_hash;
+
+ return result;
+}
+
/*
* goo_search_internal
* Main greedy search loop.
@@ -300,7 +412,8 @@ goo_search_internal(GooState * state)
/* Track the best candidate seen so far */
if (best_candidate == NULL ||
- goo_candidate_better(cand, best_candidate))
+ goo_candidate_better(state->strategy,
+ cand, best_candidate))
best_candidate = cand;
}
}
@@ -393,6 +506,8 @@ static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
int saved_rel_len;
struct HTAB *saved_hash;
RelOptInfo *joinrel;
+ double join_rows;
+ double result_size;
Cost total_cost;
GooCandidate *cand;
bool is_top_rel;
@@ -448,6 +563,9 @@ static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
set_cheapest(grouped_rel);
}
+ join_rows = joinrel->rows;
+
+ result_size = join_rows * joinrel->reltarget->width;
total_cost = joinrel->cheapest_total_path->total_cost;
/*
@@ -466,7 +584,9 @@ static GooCandidate * goo_build_candidate(GooState * state, RelOptInfo *left,
cand = palloc(sizeof(GooCandidate));
cand->left = left;
cand->right = right;
+ cand->result_size = result_size;
cand->total_cost = total_cost;
+ cand->selectivity = join_rows / (left->rows * right->rows);
cand->joinrelids = bms_union(left->relids, right->relids);
MemoryContextSwitchTo(oldcxt);
@@ -597,12 +717,56 @@ goo_commit_join(GooState * state, GooCandidate * cand)
* Returns true if candidate 'a' should be preferred over candidate 'b'.
*/
static bool
-goo_candidate_better(GooCandidate * a, GooCandidate * b)
+goo_candidate_better(GooGreedyStrategy strategy,
+ GooCandidate * a, GooCandidate * b)
{
- if (a->total_cost < b->total_cost)
- return true;
- if (a->total_cost > b->total_cost)
- return false;
+ switch (strategy)
+ {
+ case GOO_GREEDY_STRATEGY_COMBINED:
+ /* Should not be called in COMBINED mode */
+ elog(ERROR, "goo_candidate_better should not be called in COMBINED mode");
+ return false;
+
+ case GOO_GREEDY_STRATEGY_RESULT_SIZE:
+ if (a->result_size < b->result_size)
+ return true;
+ if (a->result_size > b->result_size)
+ return false;
+ break;
+
+ case GOO_GREEDY_STRATEGY_COST:
+ if (a->total_cost < b->total_cost)
+ return true;
+ if (a->total_cost > b->total_cost)
+ return false;
+ break;
+
+ case GOO_GREEDY_STRATEGY_SELECTIVITY:
+ default:
+ if (a->selectivity < b->selectivity)
+ return true;
+ if (a->selectivity > b->selectivity)
+ return false;
+ break;
+ }
return bms_compare(a->joinrelids, b->joinrelids) < 0;
}
+
+static const char *
+goo_strategy_name(GooGreedyStrategy strategy)
+{
+ switch (strategy)
+ {
+ case GOO_GREEDY_STRATEGY_RESULT_SIZE:
+ return "RESULT_SIZE";
+ case GOO_GREEDY_STRATEGY_COST:
+ return "COST";
+ case GOO_GREEDY_STRATEGY_SELECTIVITY:
+ return "SELECTIVITY";
+ case GOO_GREEDY_STRATEGY_COMBINED:
+ return "COMBINED";
+ }
+
+ return "UNKNOWN";
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 624dee632c3..c17c44ca4d0 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -1164,6 +1164,16 @@
max => 'MAX_KILOBYTES',
},
+/* WIP: only for testing */
+{ name => 'goo_greedy_strategy', type => 'enum', context => 'PGC_USERSET', group => 'QUERY_TUNING_GEQO',
+ short_desc => 'Selects the heuristic used by GOO to compare join candidates.',
+ long_desc => 'Valid values are result_size, cost, selectivity, and combined.',
+ flags => 'GUC_EXPLAIN',
+ variable => 'goo_greedy_strategy',
+ boot_val => 'GOO_GREEDY_STRATEGY_COMBINED',
+ options => 'goo_greedy_strategy_options',
+},
+
{ name => 'gss_accept_delegation', type => 'bool', context => 'PGC_SIGHUP', group => 'CONN_AUTH_AUTH',
short_desc => 'Sets whether GSSAPI delegation should be accepted from the client.',
variable => 'pg_gss_accept_delegation',
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 741fce8dede..3b0ba5e4064 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -412,6 +412,14 @@ static const struct config_enum_entry plan_cache_mode_options[] = {
{NULL, 0, false}
};
+static const struct config_enum_entry goo_greedy_strategy_options[] = {
+ {"result_size", GOO_GREEDY_STRATEGY_RESULT_SIZE, false},
+ {"cost", GOO_GREEDY_STRATEGY_COST, false},
+ {"selectivity", GOO_GREEDY_STRATEGY_SELECTIVITY, false},
+ {"combined", GOO_GREEDY_STRATEGY_COMBINED, false},
+ {NULL, 0, false}
+};
+
static const struct config_enum_entry password_encryption_options[] = {
{"md5", PASSWORD_TYPE_MD5, false},
{"scram-sha-256", PASSWORD_TYPE_SCRAM_SHA_256, false},
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 734fa68884d..1782ec35066 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -16,11 +16,20 @@
#include "nodes/pathnodes.h"
+typedef enum GooGreedyStrategy
+{
+ GOO_GREEDY_STRATEGY_RESULT_SIZE,
+ GOO_GREEDY_STRATEGY_COST,
+ GOO_GREEDY_STRATEGY_SELECTIVITY,
+ GOO_GREEDY_STRATEGY_COMBINED
+} GooGreedyStrategy;
+
/*
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
extern PGDLLIMPORT bool enable_goo_join_search;
+extern PGDLLIMPORT int goo_greedy_strategy;
extern PGDLLIMPORT bool enable_eager_aggregate;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT double min_eager_agg_group_size;
diff --git a/src/test/regress/expected/goo.out b/src/test/regress/expected/goo.out
index 3a935e626da..7f7436a971f 100644
--- a/src/test/regress/expected/goo.out
+++ b/src/test/regress/expected/goo.out
@@ -43,6 +43,13 @@ INSERT INTO t16 SELECT i, i FROM generate_series(1,10) i;
INSERT INTO t17 SELECT i, i FROM generate_series(1,10) i;
INSERT INTO t18 SELECT i, i FROM generate_series(1,10) i;
ANALYZE;
+-- Verify combined is the default strategy.
+SHOW goo_greedy_strategy;
+ goo_greedy_strategy
+---------------------
+ combined
+(1 row)
+
--
-- Basic 3-way join (sanity check)
--
@@ -177,42 +184,42 @@ JOIN t12 ON t10.k = t12.k
JOIN t13 ON t11.l = t13.l
JOIN t14 ON t12.m = t14.m
JOIN t15 ON t13.n = t15.n;
- QUERY PLAN
-----------------------------------------------------------------
+ QUERY PLAN
+----------------------------------------------------------------------------------
Aggregate
-> Hash Join
Hash Cond: (t7.h = t9.h)
-> Hash Join
Hash Cond: (t8.i = t10.i)
-> Hash Join
- Hash Cond: (t2.c = t4.c)
+ Hash Cond: (t6.g = t8.g)
-> Hash Join
- Hash Cond: (t3.b = t1.b)
+ Hash Cond: (t5.f = t7.f)
-> Hash Join
- Hash Cond: (t5.f = t7.f)
+ Hash Cond: (t4.e = t6.e)
-> Hash Join
Hash Cond: (t3.d = t5.d)
- -> Seq Scan on t3
+ -> Hash Join
+ Hash Cond: (t2.c = t4.c)
+ -> Hash Join
+ Hash Cond: (t1.b = t3.b)
+ -> Hash Join
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
+ -> Hash
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t4
-> Hash
-> Seq Scan on t5
-> Hash
- -> Seq Scan on t7
+ -> Seq Scan on t6
-> Hash
- -> Hash Join
- Hash Cond: (t1.a = t2.a)
- -> Seq Scan on t1
- -> Hash
- -> Seq Scan on t2
+ -> Seq Scan on t7
-> Hash
- -> Hash Join
- Hash Cond: (t6.g = t8.g)
- -> Hash Join
- Hash Cond: (t4.e = t6.e)
- -> Seq Scan on t4
- -> Hash
- -> Seq Scan on t6
- -> Hash
- -> Seq Scan on t8
+ -> Seq Scan on t8
-> Hash
-> Hash Join
Hash Cond: (t12.m = t14.m)
@@ -279,18 +286,18 @@ WHERE t1.a = t2.a
----------------------------------------------
Aggregate
-> Hash Join
- Hash Cond: (t1.a = t3.b)
+ Hash Cond: (t1.a = t4.c)
-> Hash Join
- Hash Cond: (t1.a = t2.a)
- -> Seq Scan on t1
- -> Hash
- -> Seq Scan on t2
- -> Hash
+ Hash Cond: (t1.a = t3.b)
-> Hash Join
- Hash Cond: (t3.b = t4.c)
- -> Seq Scan on t3
+ Hash Cond: (t1.a = t2.a)
+ -> Seq Scan on t1
-> Hash
- -> Seq Scan on t4
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t3
+ -> Hash
+ -> Seq Scan on t4
(14 rows)
--
@@ -601,30 +608,30 @@ JOIN chain1c ON chain1b.val = chain1c.id
JOIN chain2a ON chain1a.val = chain2a.id -- Cross-chain join
JOIN chain2b ON chain2a.val = chain2b.id
JOIN chain2c ON chain2b.val = chain2c.id;
- QUERY PLAN
------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------------
Aggregate
-> Hash Join
- Hash Cond: (chain1a.val = chain2a.id)
+ Hash Cond: (chain2b.val = chain2c.id)
-> Hash Join
- Hash Cond: (chain1b.val = chain1c.id)
- -> Hash Join
- Hash Cond: (chain1a.id = chain1b.id)
- -> Seq Scan on chain1a
- -> Hash
- -> Seq Scan on chain1b
- -> Hash
- -> Seq Scan on chain1c
- -> Hash
+ Hash Cond: (chain2a.val = chain2b.id)
-> Hash Join
- Hash Cond: (chain2b.val = chain2c.id)
+ Hash Cond: (chain1a.val = chain2a.id)
-> Hash Join
- Hash Cond: (chain2a.val = chain2b.id)
- -> Seq Scan on chain2a
+ Hash Cond: (chain1b.id = chain1a.id)
+ -> Hash Join
+ Hash Cond: (chain1b.val = chain1c.id)
+ -> Seq Scan on chain1b
+ -> Hash
+ -> Seq Scan on chain1c
-> Hash
- -> Seq Scan on chain2b
+ -> Seq Scan on chain1a
-> Hash
- -> Seq Scan on chain2c
+ -> Seq Scan on chain2a
+ -> Hash
+ -> Seq Scan on chain2b
+ -> Hash
+ -> Seq Scan on chain2c
(22 rows)
--
@@ -698,3 +705,4 @@ DEALLOCATE goo_plan;
DEALLOCATE standard_plan;
RESET geqo_threshold;
RESET enable_goo_join_search;
+RESET goo_greedy_strategy;
diff --git a/src/test/regress/sql/goo.sql b/src/test/regress/sql/goo.sql
index ab048d8e34e..1b2dff1d929 100644
--- a/src/test/regress/sql/goo.sql
+++ b/src/test/regress/sql/goo.sql
@@ -47,6 +47,9 @@ INSERT INTO t18 SELECT i, i FROM generate_series(1,10) i;
ANALYZE;
+-- Verify combined is the default strategy.
+SHOW goo_greedy_strategy;
+
--
-- Basic 3-way join (sanity check)
--
@@ -362,3 +365,4 @@ DEALLOCATE standard_plan;
RESET geqo_threshold;
RESET enable_goo_join_search;
+RESET goo_greedy_strategy;
\ No newline at end of file
--
2.50.1 (Apple Git-155)
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
@ 2026-02-16 06:27 ` lakshmi <[email protected]>
2026-02-16 06:59 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
1 sibling, 1 reply; 18+ messages in thread
From: lakshmi @ 2026-02-16 06:27 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: Pavel Stehule <[email protected]>; Tomas Vondra <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
Hi Chengpeng,
I tested the v5 patch on a clean build from current PostgreSQL master.The
patch applied cleanly, the server built successfully, and make
check-world passed
without new failures.
I then compared DP, GEQO, and GOO on synthetic multi-join workloads.
*15-table join*
-
DP: planning ~19.1 ms | execution ~21.0 ms
-
GEQO: planning ~46.6 ms | execution ~17.7 ms
-
GOO v5: planning ~1.0 ms | execution ~19.5 ms
*20-table join*
-
DP: planning ~25.1 ms | execution ~28.2 ms
-
GEQO: planning ~27.5 ms | execution ~23.5 ms
-
GOO v5: planning ~1.5 ms | execution ~28.9 ms
Across both join sizes, GOO v5 keeps planning time extremely low (roughly
an order of magnitude lower than DP/GEQO) while execution times remain
in a comparable range, with no obvious regressions in this synthetic
workload. This appears consistent with the goal of reducing planning
overhead for large join problems while preserving similar plan quality.
These tests use controlled synthetic joins rather than JOB/TPC-H, so
they mainly validate planning-time scaling and basic plan sanity. I plan to
continue with more realistic workloads and strategy comparisons and will
share further results if anything notable appears.
Thanks for the continued work on this patch series.
Regards,
Lakshmi
On Sat, Feb 14, 2026 at 11:09 AM Chengpeng Yan <[email protected]>
wrote:
>
> > 2026年2月13日 19:14,lakshmi <[email protected]> 写道:
> >
> > HI all,
> > I tested the latest GOO patch (v4) on a fresh build from the current
> PostgreSQL master. The patch applied cleanly, the server built without
> issues, and regression tests passed except for the expected EXPLAIN output
> differences due to the new join ordering behavior.
> >
> > As a quick sanity check, I compared DP, GEQO, and GOO on a small
> multi-join query:
> >
> > DP planning: ~0.66 ms
> > GEQO planning: ~2.28 ms
> > GOO planning: ~0.38 ms
> > Execution times were similar across all three (~1.5–1.7 ms) with no
> correctness issues. Even on a small join, GEQO shows higher planning
> overhead, while GOO plans faster with comparable execution cost.
> > I then evaluated scaling using synthetic 15-table and 20-table joins
> with EXPLAIN (ANALYZE, TIMING OFF):
> > 15 tables
> > DP: ~22.9 ms | ~23.4 ms
> > GEQO: ~46.7 ms | ~20.5 ms
> > GOO: ~1.8 ms | ~22.4 ms
> >
> > 20 tables
> > DP: ~48.1 ms | ~30.5 ms
> > GEQO: ~51.0 ms | ~26.7 ms
> > GOO: ~3.2 ms | ~29.0 ms
> >
> > Planning time increases notably for DP and remains relatively high for
> GEQO, while GOO stays very low even at 20 joins, indicating substantially
> > reduced planning overhead. Execution costs remain broadly comparable,
> with no obvious regressions from GOO in this synthetic workload.
> >
> > Although this uses a controlled synthetic join graph rather than
> JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of
> significantly cheaper planning than DP/GEQO while preserving similar plan
> quality.
> >
> > I plan to continue testing with more realistic workloads and will share
> further results if anything notable appears.
> >
> > Thanks for the interesting work.
> >
> > Regards,
> > Lakshmi
>
> Hi,
>
> Thank you very much for testing v4 and sharing the results. I really
> appreciate the effort and the detailed feedback.
>
> I also agree with Tomas’s point that we need better benchmark context to
> evaluate plan quality, not only planning time.
>
> I’ve prepared a v5 refresh on top of v4, still split into two patches
> (v5-0001 and v5-0002).
> I also ran `make check-world` on current master with v5 applied, and it
> passes on my side.
>
> Compared with v4:
>
> [PATCH v5 1/2]
> - keeps the base GOO join-search path focused on a single greedy signal
> (cost);
> - fixes issues found during recent testing (mainly around candidate
> probing/cleanup and failure paths);
> - improves stability/determinism in candidate selection (including tie
> handling);
> - updates regression outputs accordingly.
>
> [PATCH v5 2/2]
> - extends `goo_greedy_strategy` and adds the `selectivity` heuristic
> suggested by Tomas;
> - improves combined mode so multiple greedy signals are evaluated in a
> common framework, and the final plan is selected by lowest estimated
> `total_cost`;
> - keeps strategy-layer changes isolated from the base path for easier
> comparison and review.
>
> My current next steps are:
>
> 1. Continue evaluating plan quality on more datasets/workloads. I’ve
> already collected several candidate tests: some are JOB-based
> variants, and others are synthetic workloads. Next, I plan to
> consolidate these into a unified test set (with reproducible
> setup/details), publish it, and run broader comparative evaluation.
>
> 2. Prototype a hybrid handoff approach: use greedy contraction first to
> reduce the join graph, then let DP optimize the reduced problem. The
> goal is a smoother transition around the threshold, avoiding abrupt
> plan-shape changes from a hard optimizer switch.
>
> 3. Explore more join-ordering improvements incrementally, including
> ideas from “Simplicity Done Right for Join Ordering” and related
> work.
>
> Thanks again for the careful testing and detailed feedback.
>
> --
> Best regards,
> Chengpeng Yan
>
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-02-16 06:27 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
@ 2026-02-16 06:59 ` lakshmi <[email protected]>
2026-02-16 08:44 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: lakshmi @ 2026-02-16 06:59 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: Pavel Stehule <[email protected]>; Tomas Vondra <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
Hi Chengpeng,
I ran a quick comparison of the GOO v5 greedy strategies on a multi-join
workload to look at execution quality in addition to planning time.
Here are the results from one representative query:
-cost: planning ~1.8 ms | execution ~32.8 ms
-result_size: planning ~2.1 ms | execution ~29.3 ms
-combined: planning ~3.7 ms | execution ~29.4 ms
All three strategies keep planning time very low, which continues to
support GOO’s intended scalability advantage over DP and GEQO.
In this test, the cost strategy shows noticeably higher execution time,
while result_size and combined produce similar and better execution
performance. Since combined has slightly higher planning overhead without a
clear execution benefit here, result_size appears to provide the best
trade-off for this synthetic workload.
I plan to continue testing with a few JOB-based queries to evaluate plan
quality in a more realistic setting and will share further results if
anything notable appears.
Thanks again for the continued work on this patch series.
Regards,
Lakshmi
On Mon, Feb 16, 2026 at 11:57 AM lakshmi <[email protected]> wrote:
> Hi Chengpeng,
>
> I tested the v5 patch on a clean build from current PostgreSQL master.The
> patch applied cleanly, the server built successfully, and make
> check-world passed without new failures.
> I then compared DP, GEQO, and GOO on synthetic multi-join workloads.
>
> *15-table join*
>
> -
>
> DP: planning ~19.1 ms | execution ~21.0 ms
> -
>
> GEQO: planning ~46.6 ms | execution ~17.7 ms
> -
>
> GOO v5: planning ~1.0 ms | execution ~19.5 ms
>
> *20-table join*
>
> -
>
> DP: planning ~25.1 ms | execution ~28.2 ms
> -
>
> GEQO: planning ~27.5 ms | execution ~23.5 ms
> -
>
> GOO v5: planning ~1.5 ms | execution ~28.9 ms
>
> Across both join sizes, GOO v5 keeps planning time extremely low (roughly
> an order of magnitude lower than DP/GEQO) while execution times remain
> in a comparable range, with no obvious regressions in this synthetic
> workload. This appears consistent with the goal of reducing planning
> overhead for large join problems while preserving similar plan quality.
>
> These tests use controlled synthetic joins rather than JOB/TPC-H, so
> they mainly validate planning-time scaling and basic plan sanity. I plan to
> continue with more realistic workloads and strategy comparisons and will
> share further results if anything notable appears.
>
> Thanks for the continued work on this patch series.
>
> Regards,
> Lakshmi
>
>
> On Sat, Feb 14, 2026 at 11:09 AM Chengpeng Yan <[email protected]>
> wrote:
>
>>
>> > 2026年2月13日 19:14,lakshmi <[email protected]> 写道:
>> >
>> > HI all,
>> > I tested the latest GOO patch (v4) on a fresh build from the current
>> PostgreSQL master. The patch applied cleanly, the server built without
>> issues, and regression tests passed except for the expected EXPLAIN output
>> differences due to the new join ordering behavior.
>> >
>> > As a quick sanity check, I compared DP, GEQO, and GOO on a small
>> multi-join query:
>> >
>> > DP planning: ~0.66 ms
>> > GEQO planning: ~2.28 ms
>> > GOO planning: ~0.38 ms
>> > Execution times were similar across all three (~1.5–1.7 ms) with no
>> correctness issues. Even on a small join, GEQO shows higher planning
>> overhead, while GOO plans faster with comparable execution cost.
>> > I then evaluated scaling using synthetic 15-table and 20-table joins
>> with EXPLAIN (ANALYZE, TIMING OFF):
>> > 15 tables
>> > DP: ~22.9 ms | ~23.4 ms
>> > GEQO: ~46.7 ms | ~20.5 ms
>> > GOO: ~1.8 ms | ~22.4 ms
>> >
>> > 20 tables
>> > DP: ~48.1 ms | ~30.5 ms
>> > GEQO: ~51.0 ms | ~26.7 ms
>> > GOO: ~3.2 ms | ~29.0 ms
>> >
>> > Planning time increases notably for DP and remains relatively high for
>> GEQO, while GOO stays very low even at 20 joins, indicating substantially
>> > reduced planning overhead. Execution costs remain broadly comparable,
>> with no obvious regressions from GOO in this synthetic workload.
>> >
>> > Although this uses a controlled synthetic join graph rather than
>> JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of
>> significantly cheaper planning than DP/GEQO while preserving similar plan
>> quality.
>> >
>> > I plan to continue testing with more realistic workloads and will share
>> further results if anything notable appears.
>> >
>> > Thanks for the interesting work.
>> >
>> > Regards,
>> > Lakshmi
>>
>> Hi,
>>
>> Thank you very much for testing v4 and sharing the results. I really
>> appreciate the effort and the detailed feedback.
>>
>> I also agree with Tomas’s point that we need better benchmark context to
>> evaluate plan quality, not only planning time.
>>
>> I’ve prepared a v5 refresh on top of v4, still split into two patches
>> (v5-0001 and v5-0002).
>> I also ran `make check-world` on current master with v5 applied, and it
>> passes on my side.
>>
>> Compared with v4:
>>
>> [PATCH v5 1/2]
>> - keeps the base GOO join-search path focused on a single greedy signal
>> (cost);
>> - fixes issues found during recent testing (mainly around candidate
>> probing/cleanup and failure paths);
>> - improves stability/determinism in candidate selection (including tie
>> handling);
>> - updates regression outputs accordingly.
>>
>> [PATCH v5 2/2]
>> - extends `goo_greedy_strategy` and adds the `selectivity` heuristic
>> suggested by Tomas;
>> - improves combined mode so multiple greedy signals are evaluated in a
>> common framework, and the final plan is selected by lowest estimated
>> `total_cost`;
>> - keeps strategy-layer changes isolated from the base path for easier
>> comparison and review.
>>
>> My current next steps are:
>>
>> 1. Continue evaluating plan quality on more datasets/workloads. I’ve
>> already collected several candidate tests: some are JOB-based
>> variants, and others are synthetic workloads. Next, I plan to
>> consolidate these into a unified test set (with reproducible
>> setup/details), publish it, and run broader comparative evaluation.
>>
>> 2. Prototype a hybrid handoff approach: use greedy contraction first to
>> reduce the join graph, then let DP optimize the reduced problem. The
>> goal is a smoother transition around the threshold, avoiding abrupt
>> plan-shape changes from a hard optimizer switch.
>>
>> 3. Explore more join-ordering improvements incrementally, including
>> ideas from “Simplicity Done Right for Join Ordering” and related
>> work.
>>
>> Thanks again for the careful testing and detailed feedback.
>>
>> --
>> Best regards,
>> Chengpeng Yan
>>
>
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-02-16 06:27 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-16 06:59 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
@ 2026-02-16 08:44 ` Tomas Vondra <[email protected]>
2026-02-16 10:44 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: Tomas Vondra @ 2026-02-16 08:44 UTC (permalink / raw)
To: lakshmi <[email protected]>; Chengpeng Yan <[email protected]>; +Cc: Pavel Stehule <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
Hi Lakshmi,
can you share the join workload you used to test (or at least some
example joins with 15 and 20 relations)? Have you modified any of the
planner parameters affecting join search (join_collapse_limit,
geqo_threshold, ...)?
regards
Tomas
On 2/16/26 07:59, lakshmi wrote:
> Hi Chengpeng,
>
> I ran a quick comparison of the GOO v5 greedy strategies on a multi-join
> workload to look at execution quality in addition to planning time.
>
> Here are the results from one representative query:
>
> -cost: planning ~1.8 ms | execution ~32.8 ms
>
> -result_size: planning ~2.1 ms | execution ~29.3 ms
>
> -combined: planning ~3.7 ms | execution ~29.4 ms
>
> All three strategies keep planning time very low, which continues to
> support GOO’s intended scalability advantage over DP and GEQO.
>
> In this test, the cost strategy shows noticeably higher execution time,
> while result_size and combined produce similar and better execution
> performance. Since combined has slightly higher planning overhead
> without a clear execution benefit here, result_size appears to provide
> the best trade-off for this synthetic workload.
>
> I plan to continue testing with a few JOB-based queries to evaluate plan
> quality in a more realistic setting and will share further results if
> anything notable appears.
>
> Thanks again for the continued work on this patch series.
>
> Regards,
> Lakshmi
>
>
> On Mon, Feb 16, 2026 at 11:57 AM lakshmi <[email protected]
> <mailto:[email protected]>> wrote:
>
> Hi Chengpeng,
>
> I tested the v5 patch on a clean build from current PostgreSQL
> master.The patch applied cleanly, the server built successfully,
> and |make check-world| passed without new failures.
> I then compared DP, GEQO, and GOO on synthetic multi-join workloads.
>
> *15-table join*
>
> *
>
> DP: planning ~19.1 ms | execution ~21.0 ms
>
> *
>
> GEQO: planning ~46.6 ms | execution ~17.7 ms
>
> *
>
> GOO v5: planning ~1.0 ms | execution ~19.5 ms
>
> *20-table join*
>
> *
>
> DP: planning ~25.1 ms | execution ~28.2 ms
>
> *
>
> GEQO: planning ~27.5 ms | execution ~23.5 ms
>
> *
>
> GOO v5: planning ~1.5 ms | execution ~28.9 ms
>
> Across both join sizes, GOO v5 keeps planning time extremely
> low (roughly an order of magnitude lower than DP/GEQO) while
> execution times remain in a comparable range, with no obvious
> regressions in this synthetic workload. This appears consistent
> with the goal of reducing planning overhead for large join
> problems while preserving similar plan quality.
>
> These tests use controlled synthetic joins rather than JOB/TPC-
> H, so they mainly validate planning-time scaling and basic plan
> sanity. I plan to continue with more realistic workloads and
> strategy comparisons and will share further results if anything
> notable appears.
>
> Thanks for the continued work on this patch series.
>
> Regards,
> Lakshmi
>
>
> On Sat, Feb 14, 2026 at 11:09 AM Chengpeng Yan
> <[email protected] <mailto:[email protected]>> wrote:
>
>
> > 2026年2月13日 19:14,lakshmi <[email protected] <mailto:[email protected]>> 写道:
> >
> > HI all,
> > I tested the latest GOO patch (v4) on a fresh build from the current PostgreSQL master. The patch applied cleanly, the server built without issues, and regression tests passed except for the expected EXPLAIN output differences due to the new join ordering behavior.
> >
> > As a quick sanity check, I compared DP, GEQO, and GOO on a small multi-join query:
> >
> > DP planning: ~0.66 ms
> > GEQO planning: ~2.28 ms
> > GOO planning: ~0.38 ms
> > Execution times were similar across all three (~1.5–1.7 ms) with no correctness issues. Even on a small join, GEQO shows higher planning overhead, while GOO plans faster with comparable execution cost.
> > I then evaluated scaling using synthetic 15-table and 20-table joins with EXPLAIN (ANALYZE, TIMING OFF):
> > 15 tables
> > DP: ~22.9 ms | ~23.4 ms
> > GEQO: ~46.7 ms | ~20.5 ms
> > GOO: ~1.8 ms | ~22.4 ms
> >
> > 20 tables
> > DP: ~48.1 ms | ~30.5 ms
> > GEQO: ~51.0 ms | ~26.7 ms
> > GOO: ~3.2 ms | ~29.0 ms
> >
> > Planning time increases notably for DP and remains relatively high for GEQO, while GOO stays very low even at 20 joins, indicating substantially
> > reduced planning overhead. Execution costs remain broadly comparable, with no obvious regressions from GOO in this synthetic workload.
> >
> > Although this uses a controlled synthetic join graph rather than JOB/TPC-H, the scaling behavior appears consistent with GOO’s goal of significantly cheaper planning than DP/GEQO while preserving similar plan quality.
> >
> > I plan to continue testing with more realistic workloads and will share further results if anything notable appears.
> >
> > Thanks for the interesting work.
> >
> > Regards,
> > Lakshmi
>
> Hi,
>
> Thank you very much for testing v4 and sharing the results. I really
> appreciate the effort and the detailed feedback.
>
> I also agree with Tomas’s point that we need better benchmark
> context to
> evaluate plan quality, not only planning time.
>
> I’ve prepared a v5 refresh on top of v4, still split into two
> patches
> (v5-0001 and v5-0002).
> I also ran `make check-world` on current master with v5 applied,
> and it
> passes on my side.
>
> Compared with v4:
>
> [PATCH v5 1/2]
> - keeps the base GOO join-search path focused on a single greedy
> signal
> (cost);
> - fixes issues found during recent testing (mainly around candidate
> probing/cleanup and failure paths);
> - improves stability/determinism in candidate selection
> (including tie
> handling);
> - updates regression outputs accordingly.
>
> [PATCH v5 2/2]
> - extends `goo_greedy_strategy` and adds the `selectivity` heuristic
> suggested by Tomas;
> - improves combined mode so multiple greedy signals are
> evaluated in a
> common framework, and the final plan is selected by lowest estimated
> `total_cost`;
> - keeps strategy-layer changes isolated from the base path for
> easier
> comparison and review.
>
> My current next steps are:
>
> 1. Continue evaluating plan quality on more datasets/workloads. I’ve
> already collected several candidate tests: some are JOB-based
> variants, and others are synthetic workloads. Next, I plan to
> consolidate these into a unified test set (with reproducible
> setup/details), publish it, and run broader comparative evaluation.
>
> 2. Prototype a hybrid handoff approach: use greedy contraction
> first to
> reduce the join graph, then let DP optimize the reduced problem. The
> goal is a smoother transition around the threshold, avoiding abrupt
> plan-shape changes from a hard optimizer switch.
>
> 3. Explore more join-ordering improvements incrementally, including
> ideas from “Simplicity Done Right for Join Ordering” and related
> work.
>
> Thanks again for the careful testing and detailed feedback.
>
> --
> Best regards,
> Chengpeng Yan
>
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-02-16 06:27 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-16 06:59 ` Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-16 08:44 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
@ 2026-02-16 10:44 ` lakshmi <[email protected]>
0 siblings, 0 replies; 18+ messages in thread
From: lakshmi @ 2026-02-16 10:44 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; Pavel Stehule <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
Hi Tomas,
Thank you for the question.
The 15-table and 20-table results I shared were obtained using a synthetic
join workload designed to stress join-order planning and measure
planning-time scaling, rather than a JOB or TPC-H query.
Each query is essentially a left-deep chain of equality joins over simple
tables. For reference, the structure is equivalent to:
15-table join
SELECT count(*)
FROM t1
JOIN t2 ON t1.id = t2.id
JOIN t3 ON t2.id = t3.id
...
JOIN t15 ON t14.id = t15.id;
20-table join
SELECT count(*)
FROM t1
JOIN t2 ON t1.id = t2.id
JOIN t3 ON t2.id = t3.id
...
JOIN t20 ON t19.id = t20.id;
Regarding planner settings:
-geqo_threshold was set to:
a high value (e.g., 100) to force DP
a low value (e.g., 2) to allow GEQO/GOO
-enable_goo_join_search was toggled on/off depending on the comparison
being measured.
-Other planner parameters, including join_collapse_limit, were left at
their default values.
So these experiments mainly evaluate planning-time scaling and basic plan
sanity on a controlled join graph, rather than realistic workload plan
quality.
I’m currently preparing additional tests using selected JOB queries to
provide more meaningful plan-quality comparisons and will share those
results once available.
Regards
Lakshmi
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
@ 2026-05-03 12:46 ` Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
1 sibling, 1 reply; 18+ messages in thread
From: Chengpeng Yan @ 2026-05-03 12:46 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: lakshmi <[email protected]>; Pavel Stehule <[email protected]>; John Naylor <[email protected]>; [email protected] <[email protected]>
> On Feb 14, 2026, at 13:39, Chengpeng Yan <[email protected]> wrote:
>
> My current next steps are:
>
> 1. Continue evaluating plan quality on more datasets/workloads. I’ve
> already collected several candidate tests: some are JOB-based
> variants, and others are synthetic workloads. Next, I plan to
> consolidate these into a unified test set (with reproducible
> setup/details), publish it, and run broader comparative evaluation.
>
> 2. Prototype a hybrid handoff approach: use greedy contraction first to
> reduce the join graph, then let DP optimize the reduced problem. The
> goal is a smoother transition around the threshold, avoiding abrupt
> plan-shape changes from a hard optimizer switch.
>
> 3. Explore more join-ordering improvements incrementally, including
> ideas from “Simplicity Done Right for Join Ordering” and related
> work.
Hi,
I ran the current pure-GOO variants on JOB and JOB-Complex [1], 143
queries in total. JOB-Complex uses the same IMDB/JOB setting, but adds
harder predicates and more challenging join patterns. The tested
variants were:
DP / GEQO / GOO(cost) / GOO(result_size) / GOO(selectivity) / GOO(combined)
The benchmark uses the same setup as my previous JOB measurements. It
prepares the IMDB-backed workloads, applies the same Postgres GUC
settings used in the previous runs, runs one warmup pass and three
measured repetitions for each query/variant pair, and reports the median
of the measured repetitions. Each measured execution is collected with:
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY ON, FORMAT JSON, SETTINGS ON)
Planning time and execution time are parsed separately. Planning time is
still reported, but the main metric below is execution time, because it
better reflects the quality of the chosen plan. I will attach
v5-job-benchmark-result.xlsx with both planning-time and execution-time
results, and v5-job-benchmark-execution-result.pdf as the execution-time
PDF view. The color convention is the same as in the earlier tables.
Timeouts use a 10-minute statement_timeout.
The scripts, workload definitions, and reproduction details are in the
benchmark repository [2]. I used a repository rather than a single
script because the evaluation now includes multiple workloads, several
algorithm variants, and enough setup details that keeping the workload
definitions and run protocol together makes reproduction and review
easier.
As expected from the previous discussion, planning time is low for the
GOO variants. GOO(combined) is about 5.2x faster than GEQO in planning
time in this run.
The execution-time picture is mixed. GOO(combined) has clear positives,
but does not look robust enough to replace GEQO directly.
Compared with GEQO over the 143 comparable queries:
GOO(combined) faster by 20%-2x: 6 queries
GOO(combined) faster by 2x-10x: 8 queries
GOO(combined) faster by >10x: 7 queries
GOO(combined) within +/-20% of GEQO: 82 queries
GOO(combined) slower by 20%-2x: 10 queries
GOO(combined) slower by 2x-10x: 26 queries
GOO(combined) slower by >10x: 4 queries
timeouts: 0 for GEQO, 0 for GOO(combined)
On execution-time sum, GOO(combined) is better than GEQO:
GEQO: 430459 ms
GOO(combined): 292194 ms
That is about 1.47x faster than GEQO on the workload sum. However, in
the execution-time distribution, the result is not one-sided: there are
meaningful wins and meaningful regressions.
Some cases where GOO(combined) is much better than GEQO:
job_complex q15: 229 ms vs 114942 ms, about 503x faster
job_complex q18: 2456 ms vs 116014 ms, about 47x faster
job 26c: 859 ms vs 25735 ms, about 30x faster
job 26a: 359 ms vs 8107 ms, about 23x faster
job_complex q23: 1121 ms vs 15711 ms, about 14x faster
job_complex q05: 462 ms vs 5678 ms, about 12x faster
Some cases where GOO(combined) is much worse than GEQO:
job_complex q28: 56563 ms vs 272 ms, about 208x slower
job_complex q20: 1918 ms vs 88 ms, about 22x slower
job_complex q12: 70750 ms vs 3456 ms, about 20x slower
job 28b: 1390 ms vs 103 ms, about 14x slower
## Selectivity
One topic from the earlier discussion was whether selectivity should be
ignored. Based on this run, I do not think selectivity is good enough as
a primary greedy strategy.
GOO(selectivity) has low planning time, but its execution-time behavior
is poor:
execution-time sum: about 16.46x DP and 4.76x GEQO
timeouts: 6
cases >=2x GEQO: 110
Some examples:
job 24b: 87569 ms vs GEQO 13.7 ms, about 6374x slower
job 29a: 56080 ms vs GEQO 8.9 ms, about 6334x slower
job 31b: 207334 ms vs GEQO 165 ms, about 1259x slower
job_complex q03: 14345 ms vs GEQO 25 ms, about 574x slower
job_complex q04: 15026 ms vs GEQO 26 ms, about 589x slower
This does not mean that selectivity is meaningless, or that GOO itself
is useless. The more precise issue is that a greedy join search commits
much earlier than DP or GEQO. If an early estimate is wrong, that error
is more likely to be amplified into the final join order.
For selectivity specifically, the main problem seems to be that it is
scale-free. It optimizes the relative reduction of one join pair, not
the absolute size of the intermediate result. A join can have a very low
estimated selectivity and still produce a large intermediate if the
inputs are large. Result_size is also estimate-dependent, but it at
least includes the estimated output size; cost includes Postgres's
normal physical cost model. Selectivity throws away much of that scale
information, which makes it a brittle primary greedy key on these
workloads.
In GOO(combined), the selectivity candidate was never selected as the
final plan, so it did not improve the combined variant in this run.
There were a few positive standalone selectivity cases, but they were
rare compared with the tail risk.
## GOO(combined)
GOO(combined) is still the best pure-GOO variant in this run. It reduces
planning time and avoids several severe GEQO bad cases.
At the same time, the remaining regressions suggest three limitations:
1. A greedy path is more sensitive to estimation errors and local-choice
mistakes. A locally reasonable early join can be globally poor once
the rest of the join graph is considered.
2. Sometimes none of the current greedy variants produces a good
candidate. For example, in q20, 28b, 19a, and q03, all current GOO
candidates are clearly worse than GEQO. In that case, improving only
the final selector would not be enough.
3. Sometimes a good GOO candidate exists, but the final estimated-cost
selector does not choose it. For example, in q28, q12, and 15a,
result_size produced a much better plan, but the final cost-based
choice selected a worse one.
If we continue improving pure GOO, I think the two natural directions
are:
* increase the diversity of greedy strategies, so different queries
have a better chance of getting at least one good candidate;
* once such a good candidate exists, make the final selector more
likely to pick it. In several cases a faster GOO candidate existed,
but it had a higher estimated cost and was not selected.
The CIDR 2021 paper "Simplicity Done Right for Join Ordering" is
relevant to both ideas: its E-block suggests another
enumeration/candidate-generation direction, and its U-block gives an
upper-bound style signal that could be used as a risk-aware selector
input. I think that is a good fit in principle, but it would not be a
small extension to the current GOO patch. E-block would require a
different enumeration path, and U-block would require extra metadata
such as maximum value frequency for join attributes. For this patch
series, I would prefer to keep the scope focused on plan
search rather than also changing the estimation side.
There is also a usability issue. The current pure-GOO variants do have
some tuning surface: users can switch between cost and result_size, or
use a combined mode. That is easy to try, but not necessarily easy to
tune. It is often unclear which strategy should be preferred for a given
query, and in some tail cases neither strategy produces a good plan.
GOO(combined) with cost and result_size may already be a useful baseline
or candidate generator, but the regressions above do not support making
it the default GEQO replacement. One possible advantage over GEQO is
that GOO may work better with pg_plan_advice, but I need to understand
the details better before making a stronger claim.
So my current conclusion is not that GOO has no value. It is fast to
plan, and it can avoid some GEQO bad cases. The issue is that pure GOO
appears to have structural robustness limits in some tail cases.
Because of that, I plan to switch the next experiment to a direction
closer to the earlier community discussion about making the transition
from DP to heuristic search more gradual [3]. The idea is to give
planning a fixed effort budget, use exact DP while the budget allows it,
and complete the remaining search space with GOO when the DP quota is
not enough.
This seems smoother, easier to tune, and possibly easier to accept:
increasing the effort should naturally allow either deeper DP or more
greedy completions, instead of switching abruptly from DP to a pure
greedy or GEQO-style method. I already have a demo implementation under
test, and I will share the code and benchmark results separately once
the numbers are ready.
References:
[1] JOB-Complex: https://github.com/DataManagementLab/JOB-Complex
[2] Benchmark repository:
https://github.com/Reminiscent/join-order-benchmark
[3] Earlier discussion about replacing GEQO more gradually:
https://www.postgresql.org/message-id/flat/CAFBsxsE3Sb889r-Xvun%2BiAa5Onjy71m8nzp6JSH387JJF-YmrA%40m...
--
Best regards,
Chengpeng Yan
Attachments:
[application/pdf] v5-job-benchmark-execution-result.pdf (222.9K, 3-v5-job-benchmark-execution-result.pdf)
download
[application/vnd.openxmlformats-officedocument.spreadsheetml.sheet] v5-job-benchmark-result.xlsx (44.0K, 4-v5-job-benchmark-result.xlsx)
download
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
@ 2026-05-05 05:40 ` John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: John Naylor @ 2026-05-05 05:40 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: Tomas Vondra <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>
On Sun, May 3, 2026 at 7:46 PM Chengpeng Yan <[email protected]> wrote:
> > On Feb 14, 2026, at 13:39, Chengpeng Yan <[email protected]> wrote:
> > 1. Continue evaluating plan quality on more datasets/workloads. I’ve
> > already collected several candidate tests: some are JOB-based
> > variants, and others are synthetic workloads. Next, I plan to
> > consolidate these into a unified test set (with reproducible
> > setup/details), publish it, and run broader comparative evaluation.
> I ran the current pure-GOO variants on JOB and JOB-Complex [1], 143
> queries in total. JOB-Complex uses the same IMDB/JOB setting, but adds
> harder predicates and more challenging join patterns. The tested
Thanks for those results. As Tomas mentioned above, evaluation should
focus on cases where DP won't be used. Joins with a small number of
relations aren't going to tell us anything, especially since (I think)
GEGO will at times accidentally cover the entire seach space. Indeed,
there are quite a few queries where GOO is worse than GEQO by around
2-3x, but also small enough to be handled by DP anyway, so reporting
them is a distraction.
Less than half of the JOB-Complex queries have at least 12 "joins"
(BTW, is that number in the pdf actual joins or is it base
relations?), but the results (and summary results) include small joins
as well. Looking at the queries where one of GOO/GEQO is much better
than the other do seem to happen with large join problems.
> GOO(combined) with cost and result_size may already be a useful baseline
> or candidate generator, but the regressions above do not support making
> it the default GEQO replacement. One possible advantage over GEQO is
> that GOO may work better with pg_plan_advice, but I need to understand
> the details better before making a stronger claim.
Given that all heuristic join enumeration methods can produce
spectacularly bad plans, the ability to influence the plan is more
crucial with large join problems with small ones. Features should be
orthogonal in general, and in this case, integrating well with plan
advice seems like a strong deciding factor.
--
John Naylor
Amazon Web Services
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
@ 2026-05-11 14:33 ` Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: Chengpeng Yan @ 2026-05-11 14:33 UTC (permalink / raw)
To: John Naylor <[email protected]>; +Cc: Tomas Vondra <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
Hi,
> On May 5, 2026, at 13:40, John Naylor <[email protected]> wrote:
>
> Thanks for those results. As Tomas mentioned above, evaluation should
> focus on cases where DP won't be used. Joins with a small number of
> relations aren't going to tell us anything, especially since (I think)
> GEGO will at times accidentally cover the entire seach space. Indeed,
> there are quite a few queries where GOO is worse than GEQO by around
> 2-3x, but also small enough to be handled by DP anyway, so reporting
> them is a distraction.
>
> Less than half of the JOB-Complex queries have at least 12 "joins"
> (BTW, is that number in the pdf actual joins or is it base
> relations?), but the results (and summary results) include small joins
> as well. Looking at the queries where one of GOO/GEQO is much better
> than the other do seem to happen with large join problems.
Thanks for taking a look and pointing this out. You're right. Sorry
about that. Including the smaller join queries in the summary made the
result harder to interpret, because those are normally handled by
standard DP search and are not the main target of this patch.
For the comparison below, I only use queries involving at least 12 base
relations, i.e. the default GEQO threshold boundary. The number shown in
the PDF is the number of base relations in the query's flat FROM list,
not the number of join operators. In JOB + JOB-Complex, 33 of the 143
queries meet that threshold: 20 from JOB and 13 from JOB-Complex.
I reran the v4 patch on those 33 queries, with DP / GEQO /
GOO(combined). This version does not include selectivity in the combined
candidate set; GOO(combined) only chooses between the GOO(cost) and
GOO(result_size) plans. The protocol and GUC settings are the same as
before: one warmup pass, three measured repetitions, median reported,
10-minute statement_timeout, and:
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY ON, FORMAT JSON, SETTINGS ON)
The command checklist and run protocol are documented in the benchmark
repository [1][2]. I attached v4-job-benchmark-result-20260511.xlsx with
both planning and execution numbers, and
v4-job-benchmark-execution-result-20260511.pdf as a PDF view of the
execution-time sheet.
The detailed comparison below uses round1, which is the baseline
workbook attached here.
For planning time, GOO(combined) is still much cheaper than GEQO:
GEQO: 722 ms
GOO(combined): 108 ms
That is about 6.7x lower planning time for GOO(combined).
For execution time, compared with GEQO over the 33 comparable queries:
GOO(combined) faster by 20%-2x: 5 queries
GOO(combined) faster by 2x-10x: 4 queries
GOO(combined) faster by >10x: 4 queries
GOO(combined) within +/-20% of GEQO: 9 queries
GOO(combined) slower by 20%-2x: 5 queries
GOO(combined) slower by 2x-10x: 4 queries
GOO(combined) slower by >10x: 2 queries
timeouts: 0 for GEQO, 0 for GOO(combined)
On execution-time sum:
GEQO: 574582 ms
GOO(combined): 87706 ms
That is about 6.55x faster for GOO(combined) in this round, but the
distribution is not one-sided.
Some cases where GOO(combined) is much better than GEQO:
job_complex q19: 1125 ms vs 430902 ms, about 383x faster
job_complex q17: 975 ms vs 97105 ms, about 100x faster
job 26c: 880 ms vs 25859 ms, about 29x faster
job 29c: 31 ms vs 416 ms, about 14x faster
Some cases where GOO(combined) is much worse than GEQO:
job_complex q20: 1912 ms vs 78 ms, about 25x slower
job_complex q12: 71628 ms vs 3324 ms, about 22x slower
job 29a: 84 ms vs 9 ms, about 9.4x slower
job 30a: 1768 ms vs 423 ms, about 4.2x slower
So I would still describe the execution-time picture as mixed. Both
methods have bad cases.
>> GOO(combined) with cost and result_size may already be a useful baseline
>> or candidate generator, but the regressions above do not support making
>> it the default GEQO replacement. One possible advantage over GEQO is
>> that GOO may work better with pg_plan_advice, but I need to understand
>> the details better before making a stronger claim.
>
> Given that all heuristic join enumeration methods can produce
> spectacularly bad plans, the ability to influence the plan is more
> crucial with large join problems with small ones. Features should be
> orthogonal in general, and in this case, integrating well with plan
> advice seems like a strong deciding factor.
I agree. For large joins, cardinality estimation, the cost model, and
the join-order search all interact. Even a better search method can
choose a bad plan if the cardinality estimates or cost model give it
misleading input. So the ability to tune or influence the chosen plan
becomes especially important.
After reading more of the pg_plan_advice discussion and code, my current
understanding is that GOO should be easier than GEQO to integrate with
join order advice. Please correct me if this understanding is wrong. My
reading of the pg_plan_advice discussion is that advice can only
reliably nudge the planner towards plans it would have considered
anyway; with GEQO, not all join orders are explored, so join-order
advice can only constrain the options that the genetic search actually
considers [3][4].
GOO seems to have a useful property here: candidate generation is
explicit. At each contraction step it evaluates legal pair joins, so an
advice implementation could reject or prefer candidate pairs at that
point. This does not make impossible advice feasible, and it still would
need careful integration, but it seems less dependent on whether GEQO
happened to generate a compatible join sequence.
I also noticed a statistics-sensitivity issue while rerunning the
large-query subset. I ran five rounds over the same 33 queries and the
same DP / GEQO / GOO(combined) variants. Each round refreshed statistics
before executing the measured queries. I saved the statistics snapshot
with pg_dump --statistics-only and kept the per-round result
spreadsheets; these are in round_artifacts-20260511.zip, organized as
round1 through round5. The per-run measured repetitions are
comparatively stable; the large differences below correspond to changed
plans after refreshed statistics, not just measurement noise.
Execution-time sum:
| round | DP ms | GEQO ms | GOO(combined) ms |
| --- | ---: | ---: | ---: |
| round1 | 41091 | 574582 | 87706 |
| round2 | 41331 | 51205 | 85038 |
| round3 | 40493 | 95580 | 85561 |
| round4 | 41464 | 99228 | 83930 |
| round5 | 41611 | 126544 | 85578 |
GOO(combined) has the lower execution-time sum in four of the five
rounds. The exact bucket counts are not identical in every round, but
each round still has both large GOO(combined) wins and large
regressions. More importantly, GEQO varies much more across statistics
snapshots: its execution-time sum ranges from 51s to 575s, while
GOO(combined) stays in the 84s-88s range.
The large GEQO swings seem to follow this pattern: small changes in
sampled statistics change the estimated-cost ordering among close GEQO
join orders; some selected orders delay selective joins and produce much
larger intermediate results. For example, job_complex q19 ranges from
2.5s to 431s across the five rounds, job_complex q17 ranges from 0.96s
to 97s, job 26a ranges from 0.26s to 8.2s, and job 26c ranges from 0.65s
to 26s.
I do not think this means GOO(combined) is generally more stable. In
these runs, GEQO sometimes selected very different join orders after
refreshed statistics, while GOO(combined) changed less severely. One
possible reason is the different search shape: GOO(combined) compares a
small number of greedy completions, while GEQO searches among full join
sequences, where close estimated costs can lead to a very different
selected sequence. I would treat this as an observed behavior in this
benchmark rather than a general property of GOO.
My current summary is:
1. GOO(combined) still has much lower planning time than GEQO.
2. On execution time for large joins, both GEQO and GOO(combined) have
good and bad cases.
3. GOO may have a better path to work with plan advice, because the
search is deterministic and exposes explicit pair-join decision
points where advice could reject or prefer candidate pairs.
4. In this benchmark subset, GOO(combined) is more stable across
refreshed statistics, while GEQO is more sensitive to small
statistics changes.
As I mentioned in the previous message, the remaining GOO(combined)
regressions fall into two groups: sometimes neither cost nor result_size
finds a good greedy candidate, and sometimes a good greedy candidate
exists but the final estimated-cost selector chooses a slower plan.
So I am still looking at the two follow-up directions mentioned earlier,
both building on the current GOO work. One is to improve pure-GOO
candidate generation and the final selector, for example by adding more
diverse strategies and making the selector consider signals beyond final
estimated cost. The other is to use exact DP for a prefix of the problem
and fall back to GOO when the DP budget is not enough. The first
direction may need broader changes across planner or estimation-related
code, so I am currently focusing more on the second direction and will
share the code and numbers once I have a clearer result.
References:
[1] Benchmark reproduction notes:
https://github.com/Reminiscent/join-order-benchmark/blob/main/REPRODUCE.md
[2] Benchmark run protocol:
https://github.com/Reminiscent/join-order-benchmark/blob/main/BENCHMARK_RUNS.md
[3] pg_plan_advice / GEQO discussion:
https://www.postgresql.org/message-id/CA%2BTgmoYD55R8XqgbFbKhC4Uipu-gxA1msDOsUAZ9q38iCN8dQA%40mail.g...
[4] Same pg_plan_advice thread, GEQO note:
https://www.postgresql.org/message-id/CA%2BTgmoZpQDJOz_W34Wkp-JA%3DMQpzLeV6dsDGt%3D04U0AD6c65RA%40ma...
--
Best regards,
Chengpeng Yan
Attachments:
[application/pdf] v4-job-benchmark-result-20260511.pdf (23.2K, 3-v4-job-benchmark-result-20260511.pdf)
download
[application/vnd.openxmlformats-officedocument.spreadsheetml.sheet] v4-job-benchmark-result-20260511.xlsx (11.8K, 4-v4-job-benchmark-result-20260511.xlsx)
download
[application/zip] round_artifacts-20260511.zip (438.9K, 5-round_artifacts-20260511.zip)
download
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
@ 2026-05-12 11:51 ` John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: John Naylor @ 2026-05-12 11:51 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: Tomas Vondra <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On Mon, May 11, 2026 at 9:33 PM Chengpeng Yan <[email protected]> wrote:
> > On May 5, 2026, at 13:40, John Naylor <[email protected]> wrote:
> > Given that all heuristic join enumeration methods can produce
> > spectacularly bad plans, the ability to influence the plan is more
> > crucial with large join problems with small ones. Features should be
> > orthogonal in general, and in this case, integrating well with plan
> > advice seems like a strong deciding factor.
>
> I agree. For large joins, cardinality estimation, the cost model, and
> the join-order search all interact. Even a better search method can
> choose a bad plan if the cardinality estimates or cost model give it
> misleading input. So the ability to tune or influence the chosen plan
> becomes especially important.
>
> After reading more of the pg_plan_advice discussion and code, my current
> understanding is that GOO should be easier than GEQO to integrate with
> join order advice. Please correct me if this understanding is wrong. My
> reading of the pg_plan_advice discussion is that advice can only
> reliably nudge the planner towards plans it would have considered
> anyway; with GEQO, not all join orders are explored, so join-order
> advice can only constrain the options that the genetic search actually
> considers [3][4].
I believe that's the current thinking, but note this from the plan
advice README:
"XXX Need to investigate whether and how well supplying advice works with GEQO"
> GOO seems to have a useful property here: candidate generation is
> explicit. At each contraction step it evaluates legal pair joins, so an
> advice implementation could reject or prefer candidate pairs at that
> point. This does not make impossible advice feasible, and it still would
> need careful integration, but it seems less dependent on whether GEQO
> happened to generate a compatible join sequence.
That's a reasonable hypothesis. I'm not sure what you mean by "careful
integration". Would some aspects of the patch need to change?
> I also noticed a statistics-sensitivity issue while rerunning the
> large-query subset. I ran five rounds over the same 33 queries and the
> same DP / GEQO / GOO(combined) variants. Each round refreshed statistics
> before executing the measured queries. I saved the statistics snapshot
> with pg_dump --statistics-only and kept the per-round result
> spreadsheets; these are in round_artifacts-20260511.zip, organized as
> round1 through round5. The per-run measured repetitions are
> comparatively stable; the large differences below correspond to changed
> plans after refreshed statistics, not just measurement noise.
>
> Execution-time sum:
>
> | round | DP ms | GEQO ms | GOO(combined) ms |
> | --- | ---: | ---: | ---: |
> | round1 | 41091 | 574582 | 87706 |
> | round2 | 41331 | 51205 | 85038 |
> | round3 | 40493 | 95580 | 85561 |
> | round4 | 41464 | 99228 | 83930 |
> | round5 | 41611 | 126544 | 85578 |
>
> GOO(combined) has the lower execution-time sum in four of the five
> rounds. The exact bucket counts are not identical in every round, but
> each round still has both large GOO(combined) wins and large
> regressions. More importantly, GEQO varies much more across statistics
> snapshots: its execution-time sum ranges from 51s to 575s, while
> GOO(combined) stays in the 84s-88s range.
That's an interesting finding! We'll want to keep this behavior in
mind and see how reproducible it is. You mentioned you didn't want to
draw any general conclusions (reasonable), but a 10x variation just
from statistics does put a new perspective on the test results.
> As I mentioned in the previous message, the remaining GOO(combined)
> regressions fall into two groups: sometimes neither cost nor result_size
> finds a good greedy candidate, and sometimes a good greedy candidate
> exists but the final estimated-cost selector chooses a slower plan.
>
> So I am still looking at the two follow-up directions mentioned earlier,
> both building on the current GOO work. One is to improve pure-GOO
> candidate generation and the final selector, for example by adding more
> diverse strategies and making the selector consider signals beyond final
> estimated cost. The other is to use exact DP for a prefix of the problem
> and fall back to GOO when the DP budget is not enough. The first
> direction may need broader changes across planner or estimation-related
> code,
I agree we should try to avoid strategies that depend on changing other areas.
> so I am currently focusing more on the second direction and will
> share the code and numbers once I have a clearer result.
I've mentioned elsewhere that graceful degradation is a good property
(and a prerequisite for robustness), and cliff behavior is one
disadvantage of the GEQO search. However, it's not going to scale:
There will always be large join problems where a heuristic search will
find terrible plans, and that means severe regressions are possible,
as we've seen. Further, this direction opens up another large design
space that I fear will complicate testing and review, and it's already
hard enough.
So, the above doesn't seem like a logical next step from the data thus
far. (To be clear, I think it's good future work, but that presumes
getting to a first commit).
To my mind, we still need more data to inform future direction. The
most important things IMO are
- plan advice -- My current thinking is, the plan advice aspect is the
"wild card" that could immediately improve user experience with large
joins. Conversely, without the ability to fix regressions, any new
approach is risky. How well does plan advice work with GEQO? Does it
work with GOO already or would that need additional coding?
- more large join benchmarks (I see you have additional benchmarks in
your repository above)
--
John Naylor
Amazon Web Services
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
@ 2026-05-12 13:42 ` Tomas Vondra <[email protected]>
2026-05-13 04:34 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-13 06:06 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
0 siblings, 2 replies; 18+ messages in thread
From: Tomas Vondra @ 2026-05-12 13:42 UTC (permalink / raw)
To: John Naylor <[email protected]>; Chengpeng Yan <[email protected]>; +Cc: lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On 5/12/26 13:51, John Naylor wrote:
> On Mon, May 11, 2026 at 9:33 PM Chengpeng Yan <[email protected]> wrote:
>
>> ...
>> I also noticed a statistics-sensitivity issue while rerunning the
>> large-query subset. I ran five rounds over the same 33 queries and the
>> same DP / GEQO / GOO(combined) variants. Each round refreshed statistics
>> before executing the measured queries. I saved the statistics snapshot
>> with pg_dump --statistics-only and kept the per-round result
>> spreadsheets; these are in round_artifacts-20260511.zip, organized as
>> round1 through round5. The per-run measured repetitions are
>> comparatively stable; the large differences below correspond to changed
>> plans after refreshed statistics, not just measurement noise.
>>
>> Execution-time sum:
>>
>> | round | DP ms | GEQO ms | GOO(combined) ms |
>> | --- | ---: | ---: | ---: |
>> | round1 | 41091 | 574582 | 87706 |
>> | round2 | 41331 | 51205 | 85038 |
>> | round3 | 40493 | 95580 | 85561 |
>> | round4 | 41464 | 99228 | 83930 |
>> | round5 | 41611 | 126544 | 85578 |
>>
>> GOO(combined) has the lower execution-time sum in four of the five
>> rounds. The exact bucket counts are not identical in every round, but
>> each round still has both large GOO(combined) wins and large
>> regressions. More importantly, GEQO varies much more across statistics
>> snapshots: its execution-time sum ranges from 51s to 575s, while
>> GOO(combined) stays in the 84s-88s range.
>
> That's an interesting finding! We'll want to keep this behavior in
> mind and see how reproducible it is. You mentioned you didn't want to
> draw any general conclusions (reasonable), but a 10x variation just
> from statistics does put a new perspective on the test results.
>
After reading this, my thinking was "we should assume the estimates are
accurate" because with bogus estimates we can end up with arbitrarily
bad plans. Garbage in, garbage out.
But maybe it's not as black-and-white? I agree if one method is much
more robust (i.e. performs better with estimates that are close enough
to the actual values), then that seems like an important feature for
this type of heuristics.
I wonder how "bad" the estimates are in the presented example, both in
the good and bad runs.
>> As I mentioned in the previous message, the remaining GOO(combined)
>> regressions fall into two groups: sometimes neither cost nor result_size
>> finds a good greedy candidate, and sometimes a good greedy candidate
>> exists but the final estimated-cost selector chooses a slower plan.
>>
>> So I am still looking at the two follow-up directions mentioned earlier,
>> both building on the current GOO work. One is to improve pure-GOO
>> candidate generation and the final selector, for example by adding more
>> diverse strategies and making the selector consider signals beyond final
>> estimated cost. The other is to use exact DP for a prefix of the problem
>> and fall back to GOO when the DP budget is not enough. The first
>> direction may need broader changes across planner or estimation-related
>> code,
>
> I agree we should try to avoid strategies that depend on changing other areas.
>
IMHO it's futile to search for a perfect heuristics. It we had that, we
wouldn't need the DP mode at all. This can't be the criteria, because no
heuristics would pass that.
Regarding the two proposed directions, it's not very clear to me why the
first direction would require broader changes, particularly changes that
would be unnecessary for the second direction (DP for prefix).
To me it seems the second approach is actually an advanced version of
the first one. How to you pick the prefix to handle by DP if not by some
heuristics, requiring this new signals?
>> so I am currently focusing more on the second direction and will
>> share the code and numbers once I have a clearer result.
>
> I've mentioned elsewhere that graceful degradation is a good property
> (and a prerequisite for robustness), and cliff behavior is one
> disadvantage of the GEQO search. However, it's not going to scale:
> There will always be large join problems where a heuristic search will
> find terrible plans, and that means severe regressions are possible,
> as we've seen. Further, this direction opens up another large design
> space that I fear will complicate testing and review, and it's already
> hard enough.
>
TBH I don't quite understand what the proposed approach with "using
exact DP for a prefix of the problem" is meant to do. As of now we split
the join problem into smaller parts per join_collapse_limit (=8). If
these problems are "too large" for DP (geqo_threshold=12), we use the
heuristics (GEQO) mode. Or do I misremember this?
Maybe I'm just "too used" to this, but this seems reasonable to me. Try
searching for a "perfect" solution first (within reason), and only for
large problems fall back to something approximate. Sensible, no?
To got to the GEQO/GOO code, people have to adjust the limits, so that
(join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
so most join problems are smaller that geqo_threshold and so handled by
regular DP (for smaller subproblems).
But let's say someone adjusts the GUCs, gets to GOO, and it handles a
prefix of the problem using the DP approach. How is that different from
keeping the (join_collapse_limit < geqo_threshold) and not even getting
to GOO? Why not to just leave join_collapse_limit to a low value?
> So, the above doesn't seem like a logical next step from the data thus
> far. (To be clear, I think it's good future work, but that presumes
> getting to a first commit).
>
> To my mind, we still need more data to inform future direction. The
> most important things IMO are
> - plan advice -- My current thinking is, the plan advice aspect is the
> "wild card" that could immediately improve user experience with large
> joins. Conversely, without the ability to fix regressions, any new
> approach is risky. How well does plan advice work with GEQO? Does it
> work with GOO already or would that need additional coding?
> - more large join benchmarks (I see you have additional benchmarks in
> your repository above)
>
Right. When replacing one heuristics with a different one, there will
almost certainly be regressions. Each heuristics will explore a slightly
different subset of the solution space, and it's a matter of luck which
gets a substantially better solution.
I don't have a fully formed idea how to evaluate this, but I think the
only way to "prove" a the new heuristics is better is to test a lot of
complex join queries, and look at the overall statistics. See how long
the planning took, see how many queries got faster/slower, etc.
The JOB is certainly one option to do that, and it's valuable because
the queries are meant to be realistic / from actual application. But
there's not all that many of them.
I think it would be useful to write a script that generates joins of
arbitrary complexity (number of relations, how connected they are), and
see how it works on those. It could even generate data to get estimates
with adjustable inaccuracy.
regards
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
@ 2026-05-13 04:34 ` John Naylor <[email protected]>
2026-05-13 11:54 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
1 sibling, 1 reply; 18+ messages in thread
From: John Naylor @ 2026-05-13 04:34 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On Tue, May 12, 2026 at 8:42 PM Tomas Vondra <[email protected]> wrote:
>
> On 5/12/26 13:51, John Naylor wrote:
> > On Mon, May 11, 2026 at 9:33 PM Chengpeng Yan <[email protected]> wrote:
> >> GOO(combined) has the lower execution-time sum in four of the five
> >> rounds. The exact bucket counts are not identical in every round, but
> >> each round still has both large GOO(combined) wins and large
> >> regressions. More importantly, GEQO varies much more across statistics
> >> snapshots: its execution-time sum ranges from 51s to 575s, while
> >> GOO(combined) stays in the 84s-88s range.
> >
> > That's an interesting finding! We'll want to keep this behavior in
> > mind and see how reproducible it is. You mentioned you didn't want to
> > draw any general conclusions (reasonable), but a 10x variation just
> > from statistics does put a new perspective on the test results.
> >
>
> After reading this, my thinking was "we should assume the estimates are
> accurate" because with bogus estimates we can end up with arbitrarily
> bad plans. Garbage in, garbage out.
>
> But maybe it's not as black-and-white? I agree if one method is much
> more robust (i.e. performs better with estimates that are close enough
> to the actual values), then that seems like an important feature for
> this type of heuristics.
Exactly. (The operative word being "if")
At the very least, this variation can complicate getting reliable test results.
> I wonder how "bad" the estimates are in the presented example, both in
> the good and bad runs.
Yeah, one deliberate feature of JOB is that it has real-world data
distribution that confound DBMS's common assumptions about data
independance etc. I wonder if a synthetic benchmark with more
uniform/independent data would have less variation here.
> >> So I am still looking at the two follow-up directions mentioned earlier,
> >> both building on the current GOO work. One is to improve pure-GOO
> >> candidate generation and the final selector, for example by adding more
> >> diverse strategies and making the selector consider signals beyond final
> >> estimated cost. The other is to use exact DP for a prefix of the problem
> >> and fall back to GOO when the DP budget is not enough. The first
> >> direction may need broader changes across planner or estimation-related
> >> code,
> >
> > I agree we should try to avoid strategies that depend on changing other areas.
> >
>
> IMHO it's futile to search for a perfect heuristics. It we had that, we
> wouldn't need the DP mode at all. This can't be the criteria, because no
> heuristics would pass that.
Agreed.
> TBH I don't quite understand what the proposed approach with "using
> exact DP for a prefix of the problem" is meant to do. As of now we split
> the join problem into smaller parts per join_collapse_limit (=8). If
> these problems are "too large" for DP (geqo_threshold=12), we use the
> heuristics (GEQO) mode. Or do I misremember this?
>
> Maybe I'm just "too used" to this, but this seems reasonable to me. Try
> searching for a "perfect" solution first (within reason), and only for
> large problems fall back to something approximate. Sensible, no?
Yes, I'm very much in favor of that concept. My concern was, trying to
do this now may be trying to do too much at once.
> To got to the GEQO/GOO code, people have to adjust the limits, so that
> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
> so most join problems are smaller that geqo_threshold and so handled by
> regular DP (for smaller subproblems).
>
> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
> prefix of the problem using the DP approach. How is that different from
> keeping the (join_collapse_limit < geqo_threshold) and not even getting
> to GOO? Why not to just leave join_collapse_limit to a low value?
One flavor of the second idea above (found in the literature) is to
start with DP, and after some new GUC limit (under the hood: # of
times calling make_join_rel), pick one of the incomplete subproblems
and finish with a heuristic search. It switches strategy on the fly,
rather than choosing upfront. That's the difference from now, although
I'm not sure offhand how join_collase_limit chooses its subproblems
out of the bigger problem.
> Right. When replacing one heuristics with a different one, there will
> almost certainly be regressions. Each heuristics will explore a slightly
> different subset of the solution space, and it's a matter of luck which
> gets a substantially better solution.
>
> I don't have a fully formed idea how to evaluate this, but I think the
> only way to "prove" a the new heuristics is better is to test a lot of
> complex join queries, and look at the overall statistics. See how long
> the planning took, see how many queries got faster/slower, etc.
>
> The JOB is certainly one option to do that, and it's valuable because
> the queries are meant to be realistic / from actual application. But
> there's not all that many of them.
>
> I think it would be useful to write a script that generates joins of
> arbitrary complexity (number of relations, how connected they are), and
> see how it works on those. It could even generate data to get estimates
> with adjustable inaccuracy.
+1
With unnaturally uniform/independent data, I'm curious if the
stats-dependent variation seen above for GEQO would disappear/lessen.
(I'm not sure how hard adjustable inaccuracy would be)
--
John Naylor
Amazon Web Services
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-05-13 04:34 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
@ 2026-05-13 11:54 ` Tomas Vondra <[email protected]>
2026-05-14 05:59 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: Tomas Vondra @ 2026-05-13 11:54 UTC (permalink / raw)
To: John Naylor <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On 5/13/26 06:34, John Naylor wrote:
> On Tue, May 12, 2026 at 8:42 PM Tomas Vondra <[email protected]> wrote:
>>
>> On 5/12/26 13:51, John Naylor wrote:
>>> On Mon, May 11, 2026 at 9:33 PM Chengpeng Yan <[email protected]> wrote:
>
>>>> GOO(combined) has the lower execution-time sum in four of the five
>>>> rounds. The exact bucket counts are not identical in every round, but
>>>> each round still has both large GOO(combined) wins and large
>>>> regressions. More importantly, GEQO varies much more across statistics
>>>> snapshots: its execution-time sum ranges from 51s to 575s, while
>>>> GOO(combined) stays in the 84s-88s range.
>>>
>>> That's an interesting finding! We'll want to keep this behavior in
>>> mind and see how reproducible it is. You mentioned you didn't want to
>>> draw any general conclusions (reasonable), but a 10x variation just
>>> from statistics does put a new perspective on the test results.
>>>
>>
>> After reading this, my thinking was "we should assume the estimates are
>> accurate" because with bogus estimates we can end up with arbitrarily
>> bad plans. Garbage in, garbage out.
>>
>> But maybe it's not as black-and-white? I agree if one method is much
>> more robust (i.e. performs better with estimates that are close enough
>> to the actual values), then that seems like an important feature for
>> this type of heuristics.
>
> Exactly. (The operative word being "if")
>
> At the very least, this variation can complicate getting reliable test results.
>
>> I wonder how "bad" the estimates are in the presented example, both in
>> the good and bad runs.
>
> Yeah, one deliberate feature of JOB is that it has real-world data
> distribution that confound DBMS's common assumptions about data
> independance etc. I wonder if a synthetic benchmark with more
> uniform/independent data would have less variation here.
>
Yes, that's my understanding too. What does that mean for using JOB to
evaluate these patches? If the optimizer does substantial estimation
errors, what can we conclude from heuristics performing well or poorly
on those queries?
I initially though "garbage in, garbage out", but maybe if a heuristics
performs systemically better (on a large number of random test cases),
maybe it handles "risk" better?
>>>> So I am still looking at the two follow-up directions mentioned earlier,
>>>> both building on the current GOO work. One is to improve pure-GOO
>>>> candidate generation and the final selector, for example by adding more
>>>> diverse strategies and making the selector consider signals beyond final
>>>> estimated cost. The other is to use exact DP for a prefix of the problem
>>>> and fall back to GOO when the DP budget is not enough. The first
>>>> direction may need broader changes across planner or estimation-related
>>>> code,
>>>
>>> I agree we should try to avoid strategies that depend on changing other areas.
>>>
>>
>> IMHO it's futile to search for a perfect heuristics. It we had that, we
>> wouldn't need the DP mode at all. This can't be the criteria, because no
>> heuristics would pass that.
>
> Agreed.
>
>> TBH I don't quite understand what the proposed approach with "using
>> exact DP for a prefix of the problem" is meant to do. As of now we split
>> the join problem into smaller parts per join_collapse_limit (=8). If
>> these problems are "too large" for DP (geqo_threshold=12), we use the
>> heuristics (GEQO) mode. Or do I misremember this?
>>
>> Maybe I'm just "too used" to this, but this seems reasonable to me. Try
>> searching for a "perfect" solution first (within reason), and only for
>> large problems fall back to something approximate. Sensible, no?
>
> Yes, I'm very much in favor of that concept. My concern was, trying to
> do this now may be trying to do too much at once.
>
Probably. I understood the goal of GOO to be a drop-in replacement GEQO,
and my intuition is to keep the scope limited to that goal.
That does not mean GOO can't do more later, but I'd definitely structure
it with doing a "simplest possible" GEQO replacement first, and then
maybe do more smart stuff later.
>> To got to the GEQO/GOO code, people have to adjust the limits, so that
>> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
>> so most join problems are smaller that geqo_threshold and so handled by
>> regular DP (for smaller subproblems).
>>
>> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
>> prefix of the problem using the DP approach. How is that different from
>> keeping the (join_collapse_limit < geqo_threshold) and not even getting
>> to GOO? Why not to just leave join_collapse_limit to a low value?
>
> One flavor of the second idea above (found in the literature) is to
> start with DP, and after some new GUC limit (under the hood: # of
> times calling make_join_rel), pick one of the incomplete subproblems
> and finish with a heuristic search. It switches strategy on the fly,
> rather than choosing upfront. That's the difference from now, although
> I'm not sure offhand how join_collase_limit chooses its subproblems
> out of the bigger problem.
>
Not sure, for two reasons.
How often do we expect this to be used? AFAIK GEQO is not used very
often in practice, because geqo_threshold is so much higher than
join_collapse_limit. So even huge joins get divided into problems
handled by DP. We might invest a lot of time and complexity into
something that gets used extremely rarely ...
Also, couldn't this easily lead to unpredictable planning behavior? Say
the budget relies on counting "make_join_rel" calls, or something like
that. Then someone adds or removes an index on one of the tables, and we
suddenly start to consider fewer/more candidate orders. Or maybe someone
refactors the code a little bit, moving the calls. And suddenly, we hit
the threshold at a different place. But maybe it's OK for a heuristics?
>> Right. When replacing one heuristics with a different one, there will
>> almost certainly be regressions. Each heuristics will explore a slightly
>> different subset of the solution space, and it's a matter of luck which
>> gets a substantially better solution.
>>
>> I don't have a fully formed idea how to evaluate this, but I think the
>> only way to "prove" a the new heuristics is better is to test a lot of
>> complex join queries, and look at the overall statistics. See how long
>> the planning took, see how many queries got faster/slower, etc.
>>
>> The JOB is certainly one option to do that, and it's valuable because
>> the queries are meant to be realistic / from actual application. But
>> there's not all that many of them.
>>
>> I think it would be useful to write a script that generates joins of
>> arbitrary complexity (number of relations, how connected they are), and
>> see how it works on those. It could even generate data to get estimates
>> with adjustable inaccuracy.
>
> +1
>
> With unnaturally uniform/independent data, I'm curious if the
> stats-dependent variation seen above for GEQO would disappear/lessen.
> (I'm not sure how hard adjustable inaccuracy would be)
>
Yeah, that was my guess too.
regards
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-05-13 04:34 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-13 11:54 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
@ 2026-05-14 05:59 ` John Naylor <[email protected]>
2026-05-14 08:49 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
0 siblings, 1 reply; 18+ messages in thread
From: John Naylor @ 2026-05-14 05:59 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On Wed, May 13, 2026 at 6:54 PM Tomas Vondra <[email protected]> wrote:
>
> On 5/13/26 06:34, John Naylor wrote:
> > On Tue, May 12, 2026 at 8:42 PM Tomas Vondra <[email protected]> wrote:
> >> To got to the GEQO/GOO code, people have to adjust the limits, so that
> >> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
> >> so most join problems are smaller that geqo_threshold and so handled by
> >> regular DP (for smaller subproblems).
> >>
> >> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
> >> prefix of the problem using the DP approach. How is that different from
> >> keeping the (join_collapse_limit < geqo_threshold) and not even getting
> >> to GOO? Why not to just leave join_collapse_limit to a low value?
> >
> > One flavor of the second idea above (found in the literature) is to
> > start with DP, and after some new GUC limit (under the hood: # of
> > times calling make_join_rel), pick one of the incomplete subproblems
> > and finish with a heuristic search. It switches strategy on the fly,
> > rather than choosing upfront. That's the difference from now, although
> > I'm not sure offhand how join_collase_limit chooses its subproblems
> > out of the bigger problem.
> >
>
> Not sure, for two reasons.
>
> How often do we expect this to be used? AFAIK GEQO is not used very
> often in practice, because geqo_threshold is so much higher than
> join_collapse_limit. So even huge joins get divided into problems
> handled by DP. We might invest a lot of time and complexity into
> something that gets used extremely rarely ...
That's a risk to keep in mind.
In cases where planning time is a substantial part of the runtime
(like in your thread on OLTP star joins), a user could lower the
threshold for greedy search, since it's fast. That's not an option
currently because GEQO is slow.
With a dynamic strategy, I imagine join_collapse_limit would mean
something different (one technique in the literature is to refine the
greedy output on subproblems of size 4-7), or be replaced with a
different tuning knob.
Anyway, these questions reinforce that this is potential follow-up work.
> Also, couldn't this easily lead to unpredictable planning behavior? Say
> the budget relies on counting "make_join_rel" calls, or something like
> that. Then someone adds or removes an index on one of the tables, and we
> suddenly start to consider fewer/more candidate orders. Or maybe someone
> refactors the code a little bit, moving the calls. And suddenly, we hit
> the threshold at a different place. But maybe it's OK for a heuristics?
That's a good question.
--
John Naylor
Amazon Web Services
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-05-13 04:34 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-13 11:54 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-05-14 05:59 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
@ 2026-05-14 08:49 ` Tomas Vondra <[email protected]>
0 siblings, 0 replies; 18+ messages in thread
From: Tomas Vondra @ 2026-05-14 08:49 UTC (permalink / raw)
To: John Naylor <[email protected]>; +Cc: Chengpeng Yan <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
On 5/14/26 07:59, John Naylor wrote:
> On Wed, May 13, 2026 at 6:54 PM Tomas Vondra <[email protected]> wrote:
>>
>> On 5/13/26 06:34, John Naylor wrote:
>>> On Tue, May 12, 2026 at 8:42 PM Tomas Vondra <[email protected]> wrote:
>
>>>> To got to the GEQO/GOO code, people have to adjust the limits, so that
>>>> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
>>>> so most join problems are smaller that geqo_threshold and so handled by
>>>> regular DP (for smaller subproblems).
>>>>
>>>> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
>>>> prefix of the problem using the DP approach. How is that different from
>>>> keeping the (join_collapse_limit < geqo_threshold) and not even getting
>>>> to GOO? Why not to just leave join_collapse_limit to a low value?
>>>
>>> One flavor of the second idea above (found in the literature) is to
>>> start with DP, and after some new GUC limit (under the hood: # of
>>> times calling make_join_rel), pick one of the incomplete subproblems
>>> and finish with a heuristic search. It switches strategy on the fly,
>>> rather than choosing upfront. That's the difference from now, although
>>> I'm not sure offhand how join_collase_limit chooses its subproblems
>>> out of the bigger problem.
>>>
>>
>> Not sure, for two reasons.
>>
>> How often do we expect this to be used? AFAIK GEQO is not used very
>> often in practice, because geqo_threshold is so much higher than
>> join_collapse_limit. So even huge joins get divided into problems
>> handled by DP. We might invest a lot of time and complexity into
>> something that gets used extremely rarely ...
>
> That's a risk to keep in mind.
>
> In cases where planning time is a substantial part of the runtime
> (like in your thread on OLTP star joins), a user could lower the
> threshold for greedy search, since it's fast. That's not an option
> currently because GEQO is slow.
>
Perhaps. But there's also the discussion in [1], where Tom asked if
maybe it's time to update the join_collapse_limit/geqo_threshold
defaults, and Andrei responds join_collapse_limit=40/geqo_threshold=14
worked OK for the case he was investigating.
Those values are not meant to be the new defaults, but chances are it's
time to update the defaults in some way. And that may alter the chance
of actually triggering GEQO.
[1] https://www.postgresql.org/message-id/3525339.1770668216%40sss.pgh.pa.us
> With a dynamic strategy, I imagine join_collapse_limit would mean
> something different (one technique in the literature is to refine the
> greedy output on subproblems of size 4-7), or be replaced with a
> different tuning knob.
>
> Anyway, these questions reinforce that this is potential follow-up work.
>
Agreed.
regards
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
@ 2026-05-13 06:06 ` Chengpeng Yan <[email protected]>
2026-05-13 12:22 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
1 sibling, 1 reply; 18+ messages in thread
From: Chengpeng Yan @ 2026-05-13 06:06 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: John Naylor <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
Hi Tomas,
Thanks for the detailed comments. I do not think I can answer all of the
questions yet, but I would like to first align my current understanding
and reply to the parts where I have some thoughts. For the remaining
parts, I think I need more analysis/research before making stronger
claims.
> On May 12, 2026, at 21:42, Tomas Vondra <[email protected]> wrote:
>
> After reading this, my thinking was "we should assume the estimates are
> accurate" because with bogus estimates we can end up with arbitrarily
> bad plans. Garbage in, garbage out.
>
> But maybe it's not as black-and-white? I agree if one method is much
> more robust (i.e. performs better with estimates that are close enough
> to the actual values), then that seems like an important feature for
> this type of heuristics.
>
> I wonder how "bad" the estimates are in the presented example, both in
> the good and bad runs.
I agree with this point. My earlier benchmark was more of an end-to-end
test with current postgres estimates, including cases where the
estimates can be quite wrong. I used that setup because complex join
estimates seem very hard to make accurate in all cases, so these errors
are likely to remain part of the practical problem.
But for evaluating the join-order algorithm itself, I agree we also need
a setup where the estimates are closer to the actual values, for example
by first measuring actual row counts/cardinalities and then using those
values instead of normal planner estimates. That would answer a
different question: the current results show end-to-end behavior with
the current postgres estimator, while the additional setup would better
isolate the search algorithm.
If the estimates are very wrong, the selected plan may be dominated by
estimator error rather than by the search method. In that case, I am not
yet sure how much a win/loss between two heuristics proves about the
join search algorithm itself. On the other hand, if the estimates are
close enough and one method is still more robust, I agree that would be
an important property for this kind of heuristic.
For the presented examples, I need to look more carefully at the good
and bad runs. From a first look at some severe regressions, the issue
seems to be that some intermediate join row estimates are off by orders
of magnitude, but I do not want to claim that as the full answer before
doing a more systematic analysis.
> IMHO it's futile to search for a perfect heuristics. It we had that, we
> wouldn't need the DP mode at all. This can't be the criteria, because no
> heuristics would pass that.
Agreed. I did not mean that the goal should be a heuristic with no bad
cases. That would not be realistic.
What I had in mind was a weaker practical goal: assuming planning time
stays acceptable and plan advice can still be used to correct bad cases,
can the default setting reduce the probability of severe bad cases? For
example, using purely made-up numbers, if the new algorithm's default
setting has severe regressions in 10% of cases, can we reduce that to 5%
or lower? The point is to reduce the probability of bad cases, not to
eliminate them.
I am not fully sure yet whether this is the right goal/criterion,
though.
> Regarding the two proposed directions, it's not very clear to me why the
> first direction would require broader changes, particularly changes that
> would be unnecessary for the second direction (DP for prefix).
>
> To me it seems the second approach is actually an advanced version of
> the first one. How to you pick the prefix to handle by DP if not by some
> heuristics, requiring this new signals?
That's fair. The DP-prefix approach still needs a heuristic to choose
the frontier seed, so it does not avoid the ranking problem. What I
meant was mostly about scope, not a strict dependency between the two
directions. The current GOO variants mostly use signals already
available in postgres at that point, such as estimated cost and
estimated result size. If those signals are wrong, different variants
may still share similar failure patterns. To get a substantially
different signal, we might need something outside the current GOO patch.
For example, the CIDR 2021 paper "Simplicity Done Right for Join
Ordering" seems like one possible direction for the first approach. My
current reading is that it is not only changing a local choice inside
the current GOO rule. It also changes the way candidate joins are
formed/connected, and it may need extra statistics about join
attributes, such as maximum value frequency, to make the risk signal
work. That seems broader than only changing the GOO search rule.
That is why I originally looked at the DP-prefix direction first: not
because I think it is definitely the better direction, but because it
seemed more contained to the join search algorithm and could initially
reuse existing signals like cost/result size. Better signals could still
be used later for prefix/frontier selection if that direction turns out
to be useful. I am probably still missing some of the tradeoffs, and I
need to study the paper, the failure cases, and other people's views
more carefully.
After this discussion, it may make more sense to first make the pure GOO
baseline and the evaluation framework clearer, and then decide which
follow-up direction is worth pursuing.
> TBH I don't quite understand what the proposed approach with "using
> exact DP for a prefix of the problem" is meant to do. As of now we split
> the join problem into smaller parts per join_collapse_limit (=8). If
> these problems are "too large" for DP (geqo_threshold=12), we use the
> heuristics (GEQO) mode. Or do I misremember this?
>
> Maybe I'm just "too used" to this, but this seems reasonable to me. Try
> searching for a "perfect" solution first (within reason), and only for
> large problems fall back to something approximate. Sensible, no?
>
> To got to the GEQO/GOO code, people have to adjust the limits, so that
> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
> so most join problems are smaller that geqo_threshold and so handled by
> regular DP (for smaller subproblems).
>
> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
> prefix of the problem using the DP approach. How is that different from
> keeping the (join_collapse_limit < geqo_threshold) and not even getting
> to GOO? Why not to just leave join_collapse_limit to a low value?
This is a very good question. I need to better understand the current
implementation and think about this more carefully before giving an
answer.
> Right. When replacing one heuristics with a different one, there will
> almost certainly be regressions. Each heuristics will explore a slightly
> different subset of the solution space, and it's a matter of luck which
> gets a substantially better solution.
>
> I don't have a fully formed idea how to evaluate this, but I think the
> only way to "prove" a the new heuristics is better is to test a lot of
> complex join queries, and look at the overall statistics. See how long
> the planning took, see how many queries got faster/slower, etc.
Agreed. I think we need to test many more large join queries. As
mentioned above, I think it would be useful to keep the current-estimate
and accurate-estimate evaluations separate, because they answer related
but different questions.
> The JOB is certainly one option to do that, and it's valuable because
> the queries are meant to be realistic / from actual application. But
> there's not all that many of them.
Yes. In addition to JOB and JOB-Complex, the benchmark repository also
has an extended IMDB-backed workload, `imdb_ceb_3k`, with 842 queries
involving at least 12 base relations [1]. It uses the same IMDB
schema/data family, so it should be a useful larger test set after the
smaller JOB/JOB-Complex subset.
I have not run that full set for this patch yet, but I plan to use it
for the next round of evaluation.
> I think it would be useful to write a script that generates joins of
> arbitrary complexity (number of relations, how connected they are), and
> see how it works on those. It could even generate data to get estimates
> with adjustable inaccuracy.
That makes sense, but I need to think more about it. I am not sure yet
whether this would be easy to design in a useful way.
References:
[1] Benchmark workloads:
https://github.com/Reminiscent/join-order-benchmark/blob/main/WORKLOADS.md
--
Best regards,
Chengpeng Yan
^ permalink raw reply [nested|flat] 18+ messages in thread
* Re: Add a greedy join search algorithm to handle large join problems
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-14 05:39 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-03 12:46 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-11 14:33 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` Re: Add a greedy join search algorithm to handle large join problems John Naylor <[email protected]>
2026-05-12 13:42 ` Re: Add a greedy join search algorithm to handle large join problems Tomas Vondra <[email protected]>
2026-05-13 06:06 ` Re: Add a greedy join search algorithm to handle large join problems Chengpeng Yan <[email protected]>
@ 2026-05-13 12:22 ` Tomas Vondra <[email protected]>
0 siblings, 0 replies; 18+ messages in thread
From: Tomas Vondra @ 2026-05-13 12:22 UTC (permalink / raw)
To: Chengpeng Yan <[email protected]>; +Cc: John Naylor <[email protected]>; lakshmi <[email protected]>; Pavel Stehule <[email protected]>; [email protected] <[email protected]>; Robert Haas <[email protected]>
Hi!
On 5/13/26 08:06, Chengpeng Yan wrote:
> Hi Tomas,
>
> Thanks for the detailed comments. I do not think I can answer all of the
> questions yet, but I would like to first align my current understanding
> and reply to the parts where I have some thoughts. For the remaining
> parts, I think I need more analysis/research before making stronger
> claims.
>
That's fine. No one expect you to have all the answers right away. It's
a matter for future research. Also, it's just random ideas, and some of
them may be misguided - feel free to push back against that, please.
>> On May 12, 2026, at 21:42, Tomas Vondra <[email protected]> wrote:
>>
>> After reading this, my thinking was "we should assume the estimates are
>> accurate" because with bogus estimates we can end up with arbitrarily
>> bad plans. Garbage in, garbage out.
>>
>> But maybe it's not as black-and-white? I agree if one method is much
>> more robust (i.e. performs better with estimates that are close enough
>> to the actual values), then that seems like an important feature for
>> this type of heuristics.
>>
>> I wonder how "bad" the estimates are in the presented example, both in
>> the good and bad runs.
>
> I agree with this point. My earlier benchmark was more of an end-to-end
> test with current postgres estimates, including cases where the
> estimates can be quite wrong. I used that setup because complex join
> estimates seem very hard to make accurate in all cases, so these errors
> are likely to remain part of the practical problem.
>
> But for evaluating the join-order algorithm itself, I agree we also need
> a setup where the estimates are closer to the actual values, for example
> by first measuring actual row counts/cardinalities and then using those
> values instead of normal planner estimates. That would answer a
> different question: the current results show end-to-end behavior with
> the current postgres estimator, while the additional setup would better
> isolate the search algorithm.
>
> If the estimates are very wrong, the selected plan may be dominated by
> estimator error rather than by the search method. In that case, I am not
> yet sure how much a win/loss between two heuristics proves about the
> join search algorithm itself. On the other hand, if the estimates are
> close enough and one method is still more robust, I agree that would be
> an important property for this kind of heuristic.
>
> For the presented examples, I need to look more carefully at the good
> and bad runs. From a first look at some severe regressions, the issue
> seems to be that some intermediate join row estimates are off by orders
> of magnitude, but I do not want to claim that as the full answer before
> doing a more systematic analysis.
>
Agreed. Seeing results for joins with accurate estimates would be helpful.
I initially thought we should probably just ignore behavior on examples
with too inaccurate estimates (because the "exact" DP would likely make
bogus decisions too).
But maybe one of the heuristics is somehow more robust, and that seems
like a quality of it's own?
>> IMHO it's futile to search for a perfect heuristics. It we had that, we
>> wouldn't need the DP mode at all. This can't be the criteria, because no
>> heuristics would pass that.
>
> Agreed. I did not mean that the goal should be a heuristic with no bad
> cases. That would not be realistic.
>
> What I had in mind was a weaker practical goal: assuming planning time
> stays acceptable and plan advice can still be used to correct bad cases,
> can the default setting reduce the probability of severe bad cases? For
> example, using purely made-up numbers, if the new algorithm's default
> setting has severe regressions in 10% of cases, can we reduce that to 5%
> or lower? The point is to reduce the probability of bad cases, not to
> eliminate them.
>
> I am not fully sure yet whether this is the right goal/criterion,
> though.
>
That sounds like a sensible goal. It's more or less what I had in mind
when I proposed evaluating the heuristics on random joins generated by a
script. Seeing on what fraction of the cases is "A" better than "B",
vice versa, by how much, ...
>> Regarding the two proposed directions, it's not very clear to me why the
>> first direction would require broader changes, particularly changes that
>> would be unnecessary for the second direction (DP for prefix).
>>
>> To me it seems the second approach is actually an advanced version of
>> the first one. How to you pick the prefix to handle by DP if not by some
>> heuristics, requiring this new signals?
>
> That's fair. The DP-prefix approach still needs a heuristic to choose
> the frontier seed, so it does not avoid the ranking problem. What I
> meant was mostly about scope, not a strict dependency between the two
> directions. The current GOO variants mostly use signals already
> available in postgres at that point, such as estimated cost and
> estimated result size. If those signals are wrong, different variants
> may still share similar failure patterns. To get a substantially
> different signal, we might need something outside the current GOO patch.
>
> For example, the CIDR 2021 paper "Simplicity Done Right for Join
> Ordering" seems like one possible direction for the first approach. My
> current reading is that it is not only changing a local choice inside
> the current GOO rule. It also changes the way candidate joins are
> formed/connected, and it may need extra statistics about join
> attributes, such as maximum value frequency, to make the risk signal
> work. That seems broader than only changing the GOO search rule.
>
> That is why I originally looked at the DP-prefix direction first: not
> because I think it is definitely the better direction, but because it
> seemed more contained to the join search algorithm and could initially
> reuse existing signals like cost/result size. Better signals could still
> be used later for prefix/frontier selection if that direction turns out
> to be useful. I am probably still missing some of the tradeoffs, and I
> need to study the paper, the failure cases, and other people's views
> more carefully.
>
> After this discussion, it may make more sense to first make the pure GOO
> baseline and the evaluation framework clearer, and then decide which
> follow-up direction is worth pursuing.
>
+1 to do "simple GOO" first, followed by various improvements. It could
even be implemented in one patch series, but structured like that. It
makes review / discussions way simpler.
>> TBH I don't quite understand what the proposed approach with "using
>> exact DP for a prefix of the problem" is meant to do. As of now we split
>> the join problem into smaller parts per join_collapse_limit (=8). If
>> these problems are "too large" for DP (geqo_threshold=12), we use the
>> heuristics (GEQO) mode. Or do I misremember this?
>>
>> Maybe I'm just "too used" to this, but this seems reasonable to me. Try
>> searching for a "perfect" solution first (within reason), and only for
>> large problems fall back to something approximate. Sensible, no?
>>
>> To got to the GEQO/GOO code, people have to adjust the limits, so that
>> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that,
>> so most join problems are smaller that geqo_threshold and so handled by
>> regular DP (for smaller subproblems).
>>
>> But let's say someone adjusts the GUCs, gets to GOO, and it handles a
>> prefix of the problem using the DP approach. How is that different from
>> keeping the (join_collapse_limit < geqo_threshold) and not even getting
>> to GOO? Why not to just leave join_collapse_limit to a low value?
>
> This is a very good question. I need to better understand the current
> implementation and think about this more carefully before giving an
> answer.
>
>> Right. When replacing one heuristics with a different one, there will
>> almost certainly be regressions. Each heuristics will explore a slightly
>> different subset of the solution space, and it's a matter of luck which
>> gets a substantially better solution.
>>
>> I don't have a fully formed idea how to evaluate this, but I think the
>> only way to "prove" a the new heuristics is better is to test a lot of
>> complex join queries, and look at the overall statistics. See how long
>> the planning took, see how many queries got faster/slower, etc.
>
> Agreed. I think we need to test many more large join queries. As
> mentioned above, I think it would be useful to keep the current-estimate
> and accurate-estimate evaluations separate, because they answer related
> but different questions.
>
+1
>> The JOB is certainly one option to do that, and it's valuable because
>> the queries are meant to be realistic / from actual application. But
>> there's not all that many of them.
>
> Yes. In addition to JOB and JOB-Complex, the benchmark repository also
> has an extended IMDB-backed workload, `imdb_ceb_3k`, with 842 queries
> involving at least 12 base relations [1]. It uses the same IMDB
> schema/data family, so it should be a useful larger test set after the
> smaller JOB/JOB-Complex subset.
>
> I have not run that full set for this patch yet, but I plan to use it
> for the next round of evaluation.
>
Maybe it'd make sense to evaluate GEQO/GOO even with smaller joins?
I don't know if that tells us anything, or which geqo_threshold values
are interesting. Maybe not all the way back to 2, because as John wrote
earlier, for small problems GEQO may actually cover the whole space.
But maybe if we run the tests with geqo_threshold=2..12 for joins of
different sizes, it'd tell us a couple things:
a) At which value it starts producing plans different from DP?
b) How the planning time grows for DP/GEQO/GOO, and where the DP gets
too expensive.
c) How the execution time changes for DP/GEQO/GOO.
I honestly don't know what exactly it makes sense to test. In cases like
this I usually run a lot of tests (generated randomly) for different
combinations of parameters, and then "ask questions" on the data.
>> I think it would be useful to write a script that generates joins of
>> arbitrary complexity (number of relations, how connected they are), and
>> see how it works on those. It could even generate data to get estimates
>> with adjustable inaccuracy.
>
> That makes sense, but I need to think more about it. I am not sure yet
> whether this would be easy to design in a useful way.
>
Me neither. But it seems like an interesting thought experiment.
regards
--
Tomas Vondra
^ permalink raw reply [nested|flat] 18+ messages in thread
end of thread, other threads:[~2026-05-14 08:49 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-02-13 11:14 Re: Add a greedy join search algorithm to handle large join problems lakshmi <[email protected]>
2026-02-13 13:18 ` Tomas Vondra <[email protected]>
2026-02-14 05:39 ` Chengpeng Yan <[email protected]>
2026-02-16 06:27 ` lakshmi <[email protected]>
2026-02-16 06:59 ` lakshmi <[email protected]>
2026-02-16 08:44 ` Tomas Vondra <[email protected]>
2026-02-16 10:44 ` lakshmi <[email protected]>
2026-05-03 12:46 ` Chengpeng Yan <[email protected]>
2026-05-05 05:40 ` John Naylor <[email protected]>
2026-05-11 14:33 ` Chengpeng Yan <[email protected]>
2026-05-12 11:51 ` John Naylor <[email protected]>
2026-05-12 13:42 ` Tomas Vondra <[email protected]>
2026-05-13 04:34 ` John Naylor <[email protected]>
2026-05-13 11:54 ` Tomas Vondra <[email protected]>
2026-05-14 05:59 ` John Naylor <[email protected]>
2026-05-14 08:49 ` Tomas Vondra <[email protected]>
2026-05-13 06:06 ` Chengpeng Yan <[email protected]>
2026-05-13 12:22 ` Tomas Vondra <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox