Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wADYS-002AwX-2U for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 21:01:48 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wADYR-002Rnh-0r for pgsql-hackers@arkaria.postgresql.org; Tue, 07 Apr 2026 21:01:47 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wADYQ-002RnZ-3B for pgsql-hackers@lists.postgresql.org; Tue, 07 Apr 2026 21:01:47 +0000 Received: from fhigh-b1-smtp.messagingengine.com ([202.12.124.152]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wADYP-000000016K1-263q for pgsql-hackers@postgresql.org; Tue, 07 Apr 2026 21:01:46 +0000 Received: from phl-compute-04.internal (phl-compute-04.internal [10.202.2.44]) by mailfhigh.stl.internal (Postfix) with ESMTP id 86F2F7A00A2; Tue, 7 Apr 2026 17:01:44 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-04.internal (MEProxy); Tue, 07 Apr 2026 17:01:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1775595704; x=1775682104; bh=mNUdlLK21l bWb46zA6rseEgLs5AC9ztlDYhoWGub+ws=; b=U5l6nAFevUrcGoyT7j4Jc4mKL5 rX+SXIxFT0ZC5Y5/8NsYpD1ywXCqOoakF5r6rM+xLff5VKaGet3enkJwgFmNqGmU OeTOd0BeCgaUVrMQ2Wh2yapG9BSKvH0eNGK3hY7rEhDObe8JXR66I0GuT1/dAo7G kYg+hDqYcbxj9p6N0dO9SR5ZWwQzEGAaklScZWfL9koBKajKypTmMD1CRYPnu0vR p+SfiAl9bq+Bfb3XPE+wXaEJ4uvn+sae1wEtIanleoq0cBvWf8x4r8uHHFYd2dC9 kHIRR0ioPSISRbWRnmdHduaDPogmH/oPQRooHedip6GTVnYSa8iVeWm49R/A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1775595704; x=1775682104; bh=mNUdlLK21lbWb46zA6rseEgLs5AC9ztlDYh oWGub+ws=; b=a4LudEozE0UzkArXe1tAvf4lRFAJ2qFkqa79XSIZgrUDTLtD5Q6 4k40BZpL7yLFotcdCpavv4/aMLDxquQFQSXkjtUTQG5uWQjZjUagBke9mU1Ypelo ezxM2HU3/nzOsrwx0vOvnf7XVlEsbEsqQBsjaIRuTHKmeEwtTUah+BbH58TRMIvg xNKhRljLa4Jyqj7t4qHhgQzr4pogHskL8Ax5O9CrTNuUYlPTB+c1r0gGvzjJ4Uy/ lOfgLTBnCY4/MRk+XCDvCK5XnloeuldQETzs04Whvp4usmWVQrbyXa/qCDSGd9+B v20PDuS7z30yl+a3BAprOKhS0jHTfyP/fBw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefhedrtddtgddvudeilecutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpuffrtefokffrpgfnqfghnecuuegr ihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjug hrpeffhffvvefukfhfgggtuggjsehttdfstddttddvnecuhfhrohhmpeetnhgurhgvshcu hfhrvghunhguuceorghnughrvghssegrnhgrrhgriigvlhdruggvqeenucggtffrrghtth gvrhhnpeeffffgledvffegtdevlefgtdeggffhvdekgfegteeiveejkeetudelveejhfeu geenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegrnh gurhgvshesrghnrghrrgiivghlrdguvgdpnhgspghrtghpthhtohepgedpmhhouggvpehs mhhtphhouhhtpdhrtghpthhtohepughgrhhofihlvgihmhhlsehgmhgrihhlrdgtohhmpd hrtghpthhtoheptghhvghnghhpvghnghgphigrnhesohhuthhlohhokhdrtghomhdprhgt phhtthhopehpghhsqhhlqdhhrggtkhgvrhhssehpohhsthhgrhgvshhqlhdrohhrghdprh gtphhtthhopehtghhlsehsshhsrdhpghhhrdhprgdruhhs X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 7 Apr 2026 17:01:43 -0400 (EDT) Date: Tue, 7 Apr 2026 17:01:43 -0400 From: Andres Freund To: Tom Lane Cc: Chengpeng Yan , PostgreSQL-development , David Rowley Subject: Re: Unfortunate pushing down of expressions below sort Message-ID: References: <1668738.1770336789@sss.pgh.pa.us> <1674012.1770339751@sss.pgh.pa.us> <988146DB-C16A-487C-8C39-656C702EF4E2@Outlook.com> <3F7BA12C-7158-4E78-A8F8-A17FE5F6CCF0@Outlook.com> <2351008.1775593014@sss.pgh.pa.us> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2351008.1775593014@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2026-04-07 16:16:54 -0400, Tom Lane wrote: > That extra plan node has nonzero cost that I don't think you're > accounting for. It'll still be a win if enough data volume is removed > from the Sort step, but I don't see any consideration of how much > we're actually saving before deciding to add the projection step. A different way (compared to the heuristics you were subsequently talking about) to deal with that could be to add projection support to sort's output, I guess. That would have a considerably smaller cost than a separate node. It'd add a tiny bit of cost (an if checking for projection) for the rather common of a sort without a projection. Although that'd not be too hard to get rid of by generating specialized ExecSort() variants for the different cases, like now done for ExecSeqScan, if it turned out to matter. However it'd probably would still not be guaranteed faster than evaluating below the sort, due to a) startup cost of the projection machinery b) potentially needing to deform during the expression's inputs during the projection in the sort (or above) I don't know if there are realistic cases where b) matters? You'd have to have nodes above the sort that would require the projection to happen at the sort (or an intermediary level) but then filter out most rows to avoid needing to deform anyway? In all the realistic cases I can think of the expression evaluation should then be pulled up above that filtering out of rows? I don't know if a) is really ever significant enough to matter compared to the cost of sorting. It sure shows up in a sequential scan, but that has an order of magnitude or three lower per tuple cost. Are there cases where something like Chengpeng's logic would still trigger a result node being injected, if sort had projection capability? > So I think we need some sort of gating rule, whereby we only postpone > these expressions if (a) there was already a reason to add a > projection or (b) we can make some cost-based or at least heuristic > estimate that says we'll cut the sort data volume significantly. This reminds me: The heuristics around the cost of expression evaluation seem like they could be improved a fair bit by taking into account the cost of having to deform the input columns. There's a huge difference between func1(col1), func2(col1) and func1(col1), func2(col2) and func1(col30) Unless func* are particularly expensive, the cost will be increasingly dominated by tuple deforming. I think this is one thing that sometimes contributes to us wrongly choosing sequential scans with a qual over an index scans that needs to evaluate far fewer rows, because we just take the operator costs into account, not the tuple deforming it requires. Greetings, Andres Freund