Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQO3W-006biZ-00 for pgsql-hackers@arkaria.postgresql.org; Tue, 02 Dec 2025 10:56:26 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vQO3V-007K3r-08 for pgsql-hackers@arkaria.postgresql.org; Tue, 02 Dec 2025 10:56:25 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQO3U-007K3j-22 for pgsql-hackers@lists.postgresql.org; Tue, 02 Dec 2025 10:56:25 +0000 Received: from relay6-d.mail.gandi.net ([2001:4b98:dc4:8::226]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vQO3R-002jQj-27 for pgsql-hackers@lists.postgresql.org; Tue, 02 Dec 2025 10:56:23 +0000 Received: by mail.gandi.net (Postfix) with ESMTPSA id 6889D43A09; Tue, 2 Dec 2025 10:56:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vondra.me; s=gm1; t=1764672974; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t+ycdbK+B+ZgkdxUVu1Z8RoWAgaehi7hU7pM7QJ1ZSA=; b=o9cwcKYIgJW+ttQB4b/woyoxVr2ZBEC0fj25rl8YVO2aX5XTY00TVB6ETLhoXqUSFbsCFv xLhLPBT5+/f63qyGrlpZixQLNYtw4yeyie+yqpkedADJY44MvWzbSezjlBG3nhfXpP8/HF flk6Ej0pruiRlIzMNmxdq3taT9oJe+K0V5YBFYq3i8UtseHwGiiMi+o0/UTymmLgNUBb+C rzOoT+0Bs+GO6zC+2K3X2bbOLTE+Z4HTstdM3FxMpvlQL3EUYRlv/y7VvsyqK4pfeZHSBf FORBlpDykFmtktQ7zU6hJfrfFTHSKg9MjxR/8U/TAmJsv3AseLa7Vt4zP9+eLA== Message-ID: <6db6d2ec-7529-4add-9a95-178fc318311d@vondra.me> Date: Tue, 2 Dec 2025 11:56:13 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Add a greedy join search algorithm to handle large join problems To: Chengpeng Yan , "pgsql-hackers@lists.postgresql.org" Cc: John Naylor References: <3FF63E99-AB4F-41A9-BC78-AAB28823FBD0@Outlook.com> Content-Language: en-US From: Tomas Vondra In-Reply-To: <3FF63E99-AB4F-41A9-BC78-AAB28823FBD0@Outlook.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-GND-State: clean X-GND-Score: -100 X-GND-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggddviedtgedtucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuifetpfffkfdpucggtfgfnhhsuhgsshgtrhhisggvnecuuegrihhlohhuthemuceftddunecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjughrpefkffggfgfuvfevfhfhjggtgfesthekredttddvjeenucfhrhhomhepvfhomhgrshcugghonhgurhgruceothhomhgrshesvhhonhgurhgrrdhmvgeqnecuggftrfgrthhtvghrnhepuedvvdeifefffeekudeggfdtieeglefggeduheffveeihefggfehgfdvudetffevnecukfhppeekiedrgeelrddvfedtrddvtdeinecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehinhgvthepkeeirdegledrvdeftddrvddtiedphhgvlhhopegluddtrddufeejrddtrddvngdpmhgrihhlfhhrohhmpehtohhmrghssehvohhnughrrgdrmhgvpdhnsggprhgtphhtthhopeefpdhrtghpthhtoheptghhvghnghhpvghnghgphigrnhesohhuthhlohhokhdrtghomhdprhgtphhtthhopehpghhsqhhlqdhhrggtkhgvrhhssehlihhsthhsrdhpohhsthhgrhgvshhqlhdrohhrghdprhgtphhtthhopehjohhhnhgtnhgrhihlohhrlhhssehgmhgrihhlrdgtohhm X-GND-Sasl: tomas@vondra.me List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 12/2/25 04:48, Chengpeng Yan wrote: > Hi hackers, > > This patch implements GOO (Greedy Operator Ordering), a greedy > join-order search method for large join problems, based on Fegaras (DEXA > ’98) [1]. The algorithm repeatedly selects, among all legal joins, the > join pair with the lowest estimated total cost, merges them, and > continues until a single join remains. Patch attached. > > To get an initial sense of performance, I reused the star join / > snowflake examples and the testing script from the thread in [2]. The > star-join GUC in that SQL workload was replaced with > `enable_goo_join_search`, so the same tests can run under DP (standard > dynamic programming) / GEQO(Genetic Query Optimizer) / GOO. For these > tests, geqo_threshold was set to 15 for DP, and to 5 for both GEQO and > GOO. Other planner settings, including join_collapse_limit, remained at > their defaults. > > On my local machine, a single-client pgbench run produces the following > throughput (tps): > >                     |    DP    |   GEQO   |    GOO > --------------------+----------+----------+----------- > starjoin    (inner) |  1762.52 |  192.13  |  6168.89 > starjoin    (outer) |  1683.92 |  173.90  |  5626.56 > snowflake   (inner) |  1829.04 |  133.40  |  3929.57 > snowflake   (outer) |  1397.93 |   99.65  |  3040.52 > Seems interesting, and also much more ambitious than what I intended to do in the starjoin thread (which is meant to be just a simplistic heuristics on top of the regular join order planning). I think a much broader evaluation will be needed, comparing not just the planning time, but also the quality of the final plan. Which for the starjoin tests does not really matter, as the plans are all equal in this regard. regards -- Tomas Vondra