Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFfQQ-005Pvw-2g for pgsql-hackers@arkaria.postgresql.org; Wed, 22 Apr 2026 21:48:03 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wFfQN-00FTQL-1k for pgsql-hackers@arkaria.postgresql.org; Wed, 22 Apr 2026 21:47:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wFfQN-00FTQC-0p for pgsql-hackers@lists.postgresql.org; Wed, 22 Apr 2026 21:47:59 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wFfQK-00000002Jb2-3zKM for pgsql-hackers@lists.postgresql.org; Wed, 22 Apr 2026 21:47:58 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 63MLlgqP1325979; Wed, 22 Apr 2026 17:47:42 -0400 From: Tom Lane To: Staroverov Ilja cc: "pgsql-hackers@lists.postgresql.org" Subject: Re: [RFC][PATCH] Order qual clauses by combined cost and selectivity In-reply-to: <41b98cf827294e60b726aedcca06f9d5@localhost.localdomain> References: <41b98cf827294e60b726aedcca06f9d5@localhost.localdomain> Comments: In-reply-to Staroverov Ilja message dated "Wed, 22 Apr 2026 05:50:11 -0000" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <1325977.1776894462.1@sss.pgh.pa.us> Date: Wed, 22 Apr 2026 17:47:42 -0400 Message-ID: <1325978.1776894462@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Staroverov Ilja writes: > The attached patch changes the ranking heuristic to use > cost / (1 - selectivity) > where selectivity is the fraction of rows that pass the clause. This (or some close relative) has been proposed before, but we have been hesitant to do it because our cost metrics for qual clauses are pretty nearly completely bogus: practically all the built-in functions are assigned cost 1, even though in reality they have a wide range of runtimes. Selectivity isn't enormously reliable either. We could easily be taking a qual order that the user has chosen carefully and stirring it around more or less at random. I'm suspicious of the particular form of this expression, too, because selectivities close to 1 will produce very substantial effects on the estimate even though there may not be that much difference in practice, and the selectivity difference may be mostly sampling error in the first place. I think you need a formula that's not very sensitive to small differences, but this will fail that test. We had a similar discussion about two years ago concerning a patch that (IIRC) tried to order sort columns according to the estimated cost of the comparison functions. That got reverted for a few reasons, but one of the big ones was that the cost comparisons were largely garbage-in-garbage-out. I think that a prerequisite for any work in this area is to try to assign more realistic procost estimates to at least a substantial fraction of the built-in pg_proc entries. That's going to be tedious and probably contentious, but it's hard to believe we can make much progress without better cost data. regards, tom lane