Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wN1JD-000WE5-2f for pgsql-hackers@arkaria.postgresql.org; Wed, 13 May 2026 04:34:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wN1JB-007EB6-0H for pgsql-hackers@arkaria.postgresql.org; Wed, 13 May 2026 04:34:57 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wN1JA-007EAv-2X for pgsql-hackers@lists.postgresql.org; Wed, 13 May 2026 04:34:56 +0000 Received: from mail-qk1-x742.google.com ([2607:f8b0:4864:20::742]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wN1J8-00000000L3j-2usO for pgsql-hackers@lists.postgresql.org; Wed, 13 May 2026 04:34:56 +0000 Received: by mail-qk1-x742.google.com with SMTP id af79cd13be357-900fa9f178dso689537885a.1 for ; Tue, 12 May 2026 21:34:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1778646893; cv=none; d=google.com; s=arc-20240605; b=REcJLXvPPjjyCdSJZmsUw83GJ1dA9KAJF6i4F0eqgh9OFISlIeLIjcWfCiBcwB77QQ URd2f3+eualePz0oi+8FGHPw5y7K0gKsV2As/q4UI0K7zSuK/6a3LxDLvOuMhXGx1Llk IPl2mzNL/AQPTy/gDjy/7/E7gCoUhSgpwGmHLVvkN3W8eh896qupd2fPkPNPBtIeTgBf HlszI1k1zgUHx2wcKyvgLLzh7gQBxmzd4o3uxcB3YW8GOkE5fdOeriPHa5z6/INuJcLQ dic/RTb2FWXjm5I019RANvotIy8B22vHxRnkUOuBJ/j3OdLsiNFAa9XIa+gtsP96iu9A rRkA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=TH1cBtrj7ajYROw+8vrjI/ys4WEcm8VIptO9UO8Low8=; fh=6I9CdjdNTTKSuZJrDS3C1uKjGvjzACKe3VhZ4DWv9xY=; b=BOJuzXxP7dpQrOelKnszcDp8ZUnI1SPPnKL6zrt6v13gypCKrPe/d9w9FSCiz5S3KL gzvwrRU6jInXFCJWePddX9TbS1JYHJt4HZqlAn3svujKw3H2cGC6KZqmfOtEAVRDev0/ eWncFrCT/zoQBVX8cZwyA9/6gSbtwze4qbUM6vomhK3rvl2hOakXXuGLVmswfdV9As2f 67Gm+TUXKSTKjMA6KzruNf9Ns6PCqo1jjBZdH7dTqg3hB8YtKP62idGlcoDinkthlNkF VStiGrcfevHNbmxq4e+TqybFa7eFrMNL7fXjucS6ZEqq1S9NCmVlKxTcYIJgspyl2Tjg XqlQ==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1778646893; x=1779251693; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=TH1cBtrj7ajYROw+8vrjI/ys4WEcm8VIptO9UO8Low8=; b=d++x3oD3rEuVisxL2kFmr/3QpmOnD+5sUCs3RDeA8lJlzIKSmlRd2KWlmPfd5WYfkn rJ3rNPR0HMYRKWCEAIzUXV59tt0gtA2cxU4qf56ZHW4JAyOh1cSQzmRzwl3Lj0VLyBKp Mj7Q/KbgQFA2jzC6gKRkqMRGtI6XTlJoZ7VfudVwh013beHjAEPZXaR0GuZDIGcTrPRn 2++RVKUkXK8mjaUF81oETsBFR+0lFZD/f5MofP9qX/MafPiSlUUKhNEp0XpMAY5UQ83/ oEvLQ3Hu4CJSVA/lC2S/Snp62xYorEN2L8nJZdRMm6Tk1cjgXKkuBObFqfOWOHfqFXSC gVDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1778646893; x=1779251693; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=TH1cBtrj7ajYROw+8vrjI/ys4WEcm8VIptO9UO8Low8=; b=pTNqHdoxeDBqitGeTz8AlHIp7MO6DgaDBkgbPv3Mv5GWchR48Ru5BK1hVmVJVNLkAe wrJV+8ZtZjaW2CG25SYj1Y9hxVLeb6KX9/S2645ADCRPLws6H/AinMF2P3uuf2Q9BXcI dX5Oe0pu5CkIJPFMkT4qUNqzJ3VHJ0p4DPUyImMJIk1x7e/V/76dNTMJPziqqLjFky1w XxoReY3nfJcr93j7jkmGmfu0BoUWSrmF8uzMPAqpA31RyW/POeNp6ElxB7TKSpvqNfHt i7BkRIc9Sqh8vsYDnbQaXIC7Iln7Ay8jFEh1CnrqVKcXNC4wmbtUFCSKo0w32Bv9cFnq 7ImQ== X-Forwarded-Encrypted: i=1; AFNElJ8zj97Kf6vNxHDhjvvV1MIgNMRv5VHP/qoZvWiZoPmicO8QEHjbYnglK+ahQX6uu/laLBLXDl7j0zUZgW0R@lists.postgresql.org X-Gm-Message-State: AOJu0Yy5Kg8Zi333uRv68S1T3R5QVi9Mr7YrPakIso7hlleVmihiv2Ef BWFF/q3/LPC/xdLJx9tK8/imQ65GvgBEmw1Pj6DmREmf/mdF4/sJzBHp1aMt618AuYhjzalt+/j 3CsTpQ3JjnrAp9qVjb17FrNljS+qWNcE= X-Gm-Gg: Acq92OF/X8NA7TjhVWlcBbnSarWFZG2s6IC0P8eZuGRySCc92Ra+wi3zXiemwilPTyx 7wmsaY4cJMIELI4GMD7+mMaNb5OvwA4bKGrm2T0a+vwu+7pe3dRcmX/yeULLqlZputeIqLomkaZ VRUvk2OVpqa+qIhtTE+7Vz9QzG0k/l0BuCNaBUsUHSg3Pi+wuVAv9C+sUMUuPPJ2dDEO8tdVpYD tylYlKR4RBl3ktaaHLgzwV480KKxOdRQH0JhiVlWeNrpO5mt/nx4WDJkuU+Flmn4RuqiyJw3Khg LUakRy1Z6cih04yZD4rl7V4B1oaXBbUORLHEkLf9BhT/nB2OR9zX2cmvKMYa5w3snNwkVLNhdcz WjlmH5LqitkUklefn/yoD3BEySP0= X-Received: by 2002:a05:622a:4183:b0:50f:c36a:381a with SMTP id d75a77b69052e-5162f638c04mr21584081cf.55.1778646892810; Tue, 12 May 2026 21:34:52 -0700 (PDT) MIME-Version: 1.0 References: <3FF63E99-AB4F-41A9-BC78-AAB28823FBD0@Outlook.com> <6db6d2ec-7529-4add-9a95-178fc318311d@vondra.me> <313ACE5A-CBF1-43B3-9181-10D3E8ADF424@Outlook.com> <5abd6054-413c-4f48-9172-d8b31062b266@vondra.me> <938E2286-9B0D-4F8D-A916-8E0E35D55034@Outlook.com> <982de4a4-71b6-4d1d-afe2-35b1c5d43529@vondra.me> <8927A117-A7EA-41E8-94B3-0B4F7767DA8B@outlook.com> <4383D1E9-8F01-429E-9C18-1EFE12FF9196@outlook.com> In-Reply-To: From: John Naylor Date: Wed, 13 May 2026 11:34:41 +0700 X-Gm-Features: AVHnY4L_J5S1bR1_BU8_t1I2HQFAA2aOPrXarXNgrmsnt8z1Fc54jHlZ2yVFzHs Message-ID: Subject: Re: Add a greedy join search algorithm to handle large join problems To: Tomas Vondra Cc: Chengpeng Yan , lakshmi , Pavel Stehule , "pgsql-hackers@lists.postgresql.org" , Robert Haas Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, May 12, 2026 at 8:42=E2=80=AFPM Tomas Vondra wrot= e: > > On 5/12/26 13:51, John Naylor wrote: > > On Mon, May 11, 2026 at 9:33=E2=80=AFPM Chengpeng Yan wrote: > >> GOO(combined) has the lower execution-time sum in four of the five > >> rounds. The exact bucket counts are not identical in every round, but > >> each round still has both large GOO(combined) wins and large > >> regressions. More importantly, GEQO varies much more across statistics > >> snapshots: its execution-time sum ranges from 51s to 575s, while > >> GOO(combined) stays in the 84s-88s range. > > > > That's an interesting finding! We'll want to keep this behavior in > > mind and see how reproducible it is. You mentioned you didn't want to > > draw any general conclusions (reasonable), but a 10x variation just > > from statistics does put a new perspective on the test results. > > > > After reading this, my thinking was "we should assume the estimates are > accurate" because with bogus estimates we can end up with arbitrarily > bad plans. Garbage in, garbage out. > > But maybe it's not as black-and-white? I agree if one method is much > more robust (i.e. performs better with estimates that are close enough > to the actual values), then that seems like an important feature for > this type of heuristics. Exactly. (The operative word being "if") At the very least, this variation can complicate getting reliable test resu= lts. > I wonder how "bad" the estimates are in the presented example, both in > the good and bad runs. Yeah, one deliberate feature of JOB is that it has real-world data distribution that confound DBMS's common assumptions about data independance etc. I wonder if a synthetic benchmark with more uniform/independent data would have less variation here. > >> So I am still looking at the two follow-up directions mentioned earlie= r, > >> both building on the current GOO work. One is to improve pure-GOO > >> candidate generation and the final selector, for example by adding mor= e > >> diverse strategies and making the selector consider signals beyond fin= al > >> estimated cost. The other is to use exact DP for a prefix of the probl= em > >> and fall back to GOO when the DP budget is not enough. The first > >> direction may need broader changes across planner or estimation-relate= d > >> code, > > > > I agree we should try to avoid strategies that depend on changing other= areas. > > > > IMHO it's futile to search for a perfect heuristics. It we had that, we > wouldn't need the DP mode at all. This can't be the criteria, because no > heuristics would pass that. Agreed. > TBH I don't quite understand what the proposed approach with "using > exact DP for a prefix of the problem" is meant to do. As of now we split > the join problem into smaller parts per join_collapse_limit (=3D8). If > these problems are "too large" for DP (geqo_threshold=3D12), we use the > heuristics (GEQO) mode. Or do I misremember this? > > Maybe I'm just "too used" to this, but this seems reasonable to me. Try > searching for a "perfect" solution first (within reason), and only for > large problems fall back to something approximate. Sensible, no? Yes, I'm very much in favor of that concept. My concern was, trying to do this now may be trying to do too much at once. > To got to the GEQO/GOO code, people have to adjust the limits, so that > (join_collapse_limit >=3D geqo_threshold). AFAIK almost no one does that, > so most join problems are smaller that geqo_threshold and so handled by > regular DP (for smaller subproblems). > > But let's say someone adjusts the GUCs, gets to GOO, and it handles a > prefix of the problem using the DP approach. How is that different from > keeping the (join_collapse_limit < geqo_threshold) and not even getting > to GOO? Why not to just leave join_collapse_limit to a low value? One flavor of the second idea above (found in the literature) is to start with DP, and after some new GUC limit (under the hood: # of times calling make_join_rel), pick one of the incomplete subproblems and finish with a heuristic search. It switches strategy on the fly, rather than choosing upfront. That's the difference from now, although I'm not sure offhand how join_collase_limit chooses its subproblems out of the bigger problem. > Right. When replacing one heuristics with a different one, there will > almost certainly be regressions. Each heuristics will explore a slightly > different subset of the solution space, and it's a matter of luck which > gets a substantially better solution. > > I don't have a fully formed idea how to evaluate this, but I think the > only way to "prove" a the new heuristics is better is to test a lot of > complex join queries, and look at the overall statistics. See how long > the planning took, see how many queries got faster/slower, etc. > > The JOB is certainly one option to do that, and it's valuable because > the queries are meant to be realistic / from actual application. But > there's not all that many of them. > > I think it would be useful to write a script that generates joins of > arbitrary complexity (number of relations, how connected they are), and > see how it works on those. It could even generate data to get estimates > with adjustable inaccuracy. +1 With unnaturally uniform/independent data, I'm curious if the stats-dependent variation seen above for GEQO would disappear/lessen. (I'm not sure how hard adjustable inaccuracy would be) -- John Naylor Amazon Web Services