Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wACrE-002AGw-1n for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 20:17:08 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wACrD-002HQX-0M for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 20:17:07 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wACrC-002HQO-2f for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 20:17:07 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wACr6-00000001Av6-2rf6 for pgsql-hackers@postgresql.org; Tue, 07 Apr 2026 20:17:06 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 637KGs7g2351009; Tue, 7 Apr 2026 16:16:54 -0400 From: Tom Lane To: Chengpeng Yan cc: Andres Freund , PostgreSQL-development , David Rowley Subject: Re: Unfortunate pushing down of expressions below sort In-reply-to: <3F7BA12C-7158-4E78-A8F8-A17FE5F6CCF0@Outlook.com> References: <1668738.1770336789@sss.pgh.pa.us> <1674012.1770339751@sss.pgh.pa.us> <988146DB-C16A-487C-8C39-656C702EF4E2@Outlook.com> <3F7BA12C-7158-4E78-A8F8-A17FE5F6CCF0@Outlook.com> Comments: In-reply-to Chengpeng Yan message dated "Sat, 07 Feb 2026 12:46:15 +0000" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <2351007.1775593014.1@sss.pgh.pa.us> Date: Tue, 07 Apr 2026 16:16:54 -0400 Message-ID: <2351008.1775593014@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Chengpeng Yan writes: > Following up on the discussion below, I now have a patch. > The patch extends make_sort_input_target() with a conservative rule: > defer additional non-sort targetlist expressions past Sort only when > doing so does not require carrying any additional Vars/PlaceHolderVars > through Sort. This way, Sort input width never increases. I spent some time thinking about this. One thing I think we need to keep in mind is that if we don't postpone an expression past Sort, and the user doesn't like that, she can easily rewrite the query to force it; as indeed Andres demonstrated at the start of this thread. But overriding an unwanted planner decision to postpone is harder. I think you can do it with SELECT * FROM (SELECT x,y,f(z) FROM ... OFFSET 0) ORDER BY whatever; but if you forget the OFFSET-0 optimization fence you may find f(z) getting evaluated after the sort anyway. And the fence might foreclose some other optimization you did want. Also, make_sort_input_target() has gone basically unchanged since 2016, without that many complaints. So I think we need to be pretty conservative about adding postponement choices that aren't forced by semantic requirements. The rule stated above seems pretty conservative, but either it's not conservative enough or you didn't implement it right, because the regression test changes show the v2 patch is very willing to create Result nodes where there were none before, even when there's no LIMIT and thus no reason to think we can save any expression evaluations. That extra plan node has nonzero cost that I don't think you're accounting for. It'll still be a win if enough data volume is removed from the Sort step, but I don't see any consideration of how much we're actually saving before deciding to add the projection step. So I think we need some sort of gating rule, whereby we only postpone these expressions if (a) there was already a reason to add a projection or (b) we can make some cost-based or at least heuristic estimate that says we'll cut the sort data volume significantly. Maybe (b) needs to interact with the existing heuristic about postponing expensive expressions, not sure. Independently of that, I don't especially like the changes in make_sort_input_target(). They seem rather inelegant and expensive (and underdocumented), as well as duplicative of other work already being done in the function. It may be time to tackle the unfinished work mentioned in the existing comments about avoiding redundant cost/width calculations ... regards, tom lane