Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wN8AZ-000bAO-08 for pgsql-hackers@arkaria.postgresql.org; Wed, 13 May 2026 11:54:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wN8AX-008YND-2V for pgsql-hackers@arkaria.postgresql.org; Wed, 13 May 2026 11:54:29 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wN8AX-008YN5-0v for pgsql-hackers@lists.postgresql.org; Wed, 13 May 2026 11:54:29 +0000 Received: from relay8-d.mail.gandi.net ([217.70.183.201]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wN8AU-00000000MSp-0uRZ for pgsql-hackers@lists.postgresql.org; Wed, 13 May 2026 11:54:28 +0000 Received: by mail.gandi.net (Postfix) with ESMTPSA id F41823EC30; Wed, 13 May 2026 11:54:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vondra.me; s=gm1; t=1778673262; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YUGMugQ8znWf87sYEaNoZt2rGOWNOnvPjZoJNUmN4hI=; b=U22TuOtoJ8fJ4mAsDfNdY4L1eeiSnyUiDShYAde/bCzWEHqMFJBqwfRhrijA8v02ZDGyEA NnIo7bN2Dd47vfIuPHMfOlT38+ir60gUnTRD25y8vWHTDwA5TXcIE5wA2RsGXXn5G8RnY3 RLgQgEzRjIWg0wDznXCT5oAoPtOfppvGWMHJaJw8Nf/mSXRh+hYxCPKqS/5BWFAgtSn2XA SAsoQ0iJi0I5m/xJKiBSe5AGfsX8C67jSZBTtok4smcydPew4cLK+V4BDJLMLUUixVdYzp E41LDJ8M+W/vjHF2dFnJYV9DWrBTnzxHcu77kxhaVv1niyv97j0R6UzwYKGPaw== Message-ID: <8e9509d3-b6a3-4604-bd26-8777386de216@vondra.me> Date: Wed, 13 May 2026 13:54:20 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Add a greedy join search algorithm to handle large join problems To: John Naylor Cc: Chengpeng Yan , lakshmi , Pavel Stehule , "pgsql-hackers@lists.postgresql.org" , Robert Haas References: <3FF63E99-AB4F-41A9-BC78-AAB28823FBD0@Outlook.com> <5abd6054-413c-4f48-9172-d8b31062b266@vondra.me> <938E2286-9B0D-4F8D-A916-8E0E35D55034@Outlook.com> <982de4a4-71b6-4d1d-afe2-35b1c5d43529@vondra.me> <8927A117-A7EA-41E8-94B3-0B4F7767DA8B@outlook.com> <4383D1E9-8F01-429E-9C18-1EFE12FF9196@outlook.com> Content-Language: en-US From: Tomas Vondra In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-GND-Sasl: tomas@vondra.me X-GND-Cause: dmFkZTGT1VET9YlKz9Pw+RBqZKjc8s6TilNponqymHjH7/ILX5uq8ZEcQwyAICsL9tpTG7VseccyS2oR7OZzRIiNO+LCZW7M2aQ9K7zNE9cs4r6TiM+eJ7dhhe0v93d/ZWzMvHHuY+4U2d21IRvkWthg21KV2SmjnioljJBHy/7gXY28nvIJwNbHoMMCW0nS9nHRlJF5uV8PjTXXJsEgj9l6Nb/bADpUr89NdTJfxqXK14UyleXjkSbuiYRFqE9PMcyMGO0gmBQXtpO5WzHp8XSjGiQX8l2h2QLlUYF4Ve/uR5XlxNH4oSGG3EKsEB9PdgfdqC9M24S1+9h18ICAyJ8khN3UnhXwr1SLtlWKZHkQOmfCcoX9xtI97VS73EesoB53b8qqHvTWYyowiYzvsD+ULU6f4+qIBkYagIdzXe4BY2wnzBSZ8GFJ0Kl2M6x5nc2e+1BtjyYiNyUOMrdtixpJ+PXqES3lBmvPKdJLWzSccnRrj5Lz731aPLcaW7TLD5KR0xTOP4OsRM4hYcW1Biy0mcTFEVV2RpuPh8zY72VVmt130Fqm41obnJSUZ25SMqewjYiOtRDGd1kKnLJjWH+92fOj1Fm74F92RwCdyAei0j3l1bgx6IMZvnFJRIxO0Ud2tx7+r9NbODukqbIeVyHF4qcs7U1T8Bf8UyH6W0KayjAZtg X-GND-State: clean X-GND-Score: -100 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 5/13/26 06:34, John Naylor wrote: > On Tue, May 12, 2026 at 8:42 PM Tomas Vondra wrote: >> >> On 5/12/26 13:51, John Naylor wrote: >>> On Mon, May 11, 2026 at 9:33 PM Chengpeng Yan wrote: > >>>> GOO(combined) has the lower execution-time sum in four of the five >>>> rounds. The exact bucket counts are not identical in every round, but >>>> each round still has both large GOO(combined) wins and large >>>> regressions. More importantly, GEQO varies much more across statistics >>>> snapshots: its execution-time sum ranges from 51s to 575s, while >>>> GOO(combined) stays in the 84s-88s range. >>> >>> That's an interesting finding! We'll want to keep this behavior in >>> mind and see how reproducible it is. You mentioned you didn't want to >>> draw any general conclusions (reasonable), but a 10x variation just >>> from statistics does put a new perspective on the test results. >>> >> >> After reading this, my thinking was "we should assume the estimates are >> accurate" because with bogus estimates we can end up with arbitrarily >> bad plans. Garbage in, garbage out. >> >> But maybe it's not as black-and-white? I agree if one method is much >> more robust (i.e. performs better with estimates that are close enough >> to the actual values), then that seems like an important feature for >> this type of heuristics. > > Exactly. (The operative word being "if") > > At the very least, this variation can complicate getting reliable test results. > >> I wonder how "bad" the estimates are in the presented example, both in >> the good and bad runs. > > Yeah, one deliberate feature of JOB is that it has real-world data > distribution that confound DBMS's common assumptions about data > independance etc. I wonder if a synthetic benchmark with more > uniform/independent data would have less variation here. > Yes, that's my understanding too. What does that mean for using JOB to evaluate these patches? If the optimizer does substantial estimation errors, what can we conclude from heuristics performing well or poorly on those queries? I initially though "garbage in, garbage out", but maybe if a heuristics performs systemically better (on a large number of random test cases), maybe it handles "risk" better? >>>> So I am still looking at the two follow-up directions mentioned earlier, >>>> both building on the current GOO work. One is to improve pure-GOO >>>> candidate generation and the final selector, for example by adding more >>>> diverse strategies and making the selector consider signals beyond final >>>> estimated cost. The other is to use exact DP for a prefix of the problem >>>> and fall back to GOO when the DP budget is not enough. The first >>>> direction may need broader changes across planner or estimation-related >>>> code, >>> >>> I agree we should try to avoid strategies that depend on changing other areas. >>> >> >> IMHO it's futile to search for a perfect heuristics. It we had that, we >> wouldn't need the DP mode at all. This can't be the criteria, because no >> heuristics would pass that. > > Agreed. > >> TBH I don't quite understand what the proposed approach with "using >> exact DP for a prefix of the problem" is meant to do. As of now we split >> the join problem into smaller parts per join_collapse_limit (=8). If >> these problems are "too large" for DP (geqo_threshold=12), we use the >> heuristics (GEQO) mode. Or do I misremember this? >> >> Maybe I'm just "too used" to this, but this seems reasonable to me. Try >> searching for a "perfect" solution first (within reason), and only for >> large problems fall back to something approximate. Sensible, no? > > Yes, I'm very much in favor of that concept. My concern was, trying to > do this now may be trying to do too much at once. > Probably. I understood the goal of GOO to be a drop-in replacement GEQO, and my intuition is to keep the scope limited to that goal. That does not mean GOO can't do more later, but I'd definitely structure it with doing a "simplest possible" GEQO replacement first, and then maybe do more smart stuff later. >> To got to the GEQO/GOO code, people have to adjust the limits, so that >> (join_collapse_limit >= geqo_threshold). AFAIK almost no one does that, >> so most join problems are smaller that geqo_threshold and so handled by >> regular DP (for smaller subproblems). >> >> But let's say someone adjusts the GUCs, gets to GOO, and it handles a >> prefix of the problem using the DP approach. How is that different from >> keeping the (join_collapse_limit < geqo_threshold) and not even getting >> to GOO? Why not to just leave join_collapse_limit to a low value? > > One flavor of the second idea above (found in the literature) is to > start with DP, and after some new GUC limit (under the hood: # of > times calling make_join_rel), pick one of the incomplete subproblems > and finish with a heuristic search. It switches strategy on the fly, > rather than choosing upfront. That's the difference from now, although > I'm not sure offhand how join_collase_limit chooses its subproblems > out of the bigger problem. > Not sure, for two reasons. How often do we expect this to be used? AFAIK GEQO is not used very often in practice, because geqo_threshold is so much higher than join_collapse_limit. So even huge joins get divided into problems handled by DP. We might invest a lot of time and complexity into something that gets used extremely rarely ... Also, couldn't this easily lead to unpredictable planning behavior? Say the budget relies on counting "make_join_rel" calls, or something like that. Then someone adds or removes an index on one of the tables, and we suddenly start to consider fewer/more candidate orders. Or maybe someone refactors the code a little bit, moving the calls. And suddenly, we hit the threshold at a different place. But maybe it's OK for a heuristics? >> Right. When replacing one heuristics with a different one, there will >> almost certainly be regressions. Each heuristics will explore a slightly >> different subset of the solution space, and it's a matter of luck which >> gets a substantially better solution. >> >> I don't have a fully formed idea how to evaluate this, but I think the >> only way to "prove" a the new heuristics is better is to test a lot of >> complex join queries, and look at the overall statistics. See how long >> the planning took, see how many queries got faster/slower, etc. >> >> The JOB is certainly one option to do that, and it's valuable because >> the queries are meant to be realistic / from actual application. But >> there's not all that many of them. >> >> I think it would be useful to write a script that generates joins of >> arbitrary complexity (number of relations, how connected they are), and >> see how it works on those. It could even generate data to get estimates >> with adjustable inaccuracy. > > +1 > > With unnaturally uniform/independent data, I'm curious if the > stats-dependent variation seen above for GEQO would disappear/lessen. > (I'm not sure how hard adjustable inaccuracy would be) > Yeah, that was my guess too. regards -- Tomas Vondra