Date: Fri, 23 Jan 2026 11:33:26 -0500
From: Andres Freund
To: David Rowley
Cc: Chao Li, PostgreSQL Developers
Subject: Re: More speedups for tuple deformation

Hi,

On 2026-01-22 20:18:21 -0500, Andres Freund wrote:
> I haven't yet looked at the new version of the patch, but I ran your benchmark
> from upthread (fwiw, I removed the sleep 10 to reduce runtimes, the results
> seem stable enough anyway) on two intel machines, as you mentioned that you
> saw a lot variation in Azure.
>
> For both I disabled turbo boost, cpu idling and pinned the backend to a single
> CPU core.
>
> There's a bit of noise on "awork3" (basically an editor and an idle browser
> window), but everything is pinned to the other socket. "awork4" is entirely
> idle.
>
>
> Looks like overall the results are quite impressive! Some of the extra_cols=0
> runs saphire rapids are a bit slower, but the losses are much smaller than the
> gains in other cases.
>
>
> I think it'd be good to add a few test cases of "incremental deforming" to the
> benchmark. E.g. a qual that accesses column 10, but projection then deforms up
> to 20. I'm a bit worried that e.g. the repeated first_null_attr()
> computations could cause regressions.

The overhead of the aggregation etc. makes it harder to see efficiency changes
in deformation speed. I think it'd be worth replacing the SUM(a) with
WHERE a < 0 (filtering all rows), to reduce the cost of the executor dispatch.

Here's a profile of the SUM(a):

-   99.90%     0.00%  postgres  postgres  [.] standard_ExecutorRun
   - standard_ExecutorRun
      - 96.83% ExecAgg
         - 49.86% ExecInterpExpr
            - 28.30% slot_getsomeattrs_int
                 tts_buffer_heap_getsomeattrs
              0.67% tts_buffer_heap_getsomeattrs
            + 0.02% asm_sysvec_apic_timer_interrupt
         - 37.44% fetch_input_tuple
            - 31.42% ExecSeqScan
               + 20.58% heap_getnextslot
                 3.58% MemoryContextReset
                 0.52% heapgettup_pagemode
                 0.32% ExecStoreBufferHeapTuple
              0.99% heap_getnextslot
              0.79% MemoryContextReset
           2.81% int4_sum
           1.39% MemoryContextReset

Which takes ~93ms on average for the first generated bench.sql.

-   99.88%     0.00%  postgres  postgres  [.] standard_ExecutorRun
   - standard_ExecutorRun
      - 95.78% ExecSeqScanWithQual
         - 57.65% ExecInterpExpr
            - 29.08% slot_getsomeattrs_int
                 tts_buffer_heap_getsomeattrs
              0.49% tts_buffer_heap_getsomeattrs
         - 25.40% heap_getnextslot
            + 15.00% heapgettup_pagemode
            + 4.71% ExecStoreBufferHeapTuple
              0.05% UnlockBuffer
           1.80% MemoryContextReset
           0.77% int4lt
           0.52% heapgettup_pagemode
           0.47% ExecStoreBufferHeapTuple
           0.37% slot_getsomeattrs_int
        2.11% heap_getnextslot
        1.49% ExecInterpExpr
        0.50% MemoryContextReset

Same data, but with a WHERE a < 0, takes on average ~74ms.

I wonder if it's worth writing a C helper to test deformation in a bit more
targeted way.

Looking at the profile of ExecSeqScanWithQual() made me a bit sad: it turns out
that some of the generated code isn't great :(. I'll start a separate thread
about that.

Greetings,

Andres Freund
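
PS: As a rough illustration of the "incremental deforming" concern in the quoted text above, here is a toy Python model. It is only a sketch under stated assumptions and does not reflect PostgreSQL's actual code: ToySlot, getsomeattrs() and run() are hypothetical names, and first_null_attr() is given an assumed meaning (index of the first NULL attribute) purely for illustration. The model counts how often per-call setup work runs when a tuple is deformed in one step versus two.

```python
from dataclasses import dataclass, field


@dataclass
class ToySlot:
    """Toy stand-in for a tuple slot (NOT PostgreSQL's TupleTableSlot)."""
    raw: list                                    # stored attributes, None = NULL
    values: list = field(default_factory=list)   # attributes deformed so far
    setup_calls: int = 0                         # times per-call setup ran
    attrs_walked: int = 0                        # attributes actually deformed


def first_null_attr(raw):
    """Assumed semantics: index of the first NULL, or len(raw) if none."""
    for i, v in enumerate(raw):
        if v is None:
            return i
    return len(raw)


def getsomeattrs(slot, natts):
    """Deform attributes up to natts, resuming after the already-valid ones."""
    if natts <= len(slot.values):
        return                       # deformed far enough already
    slot.setup_calls += 1
    first_null_attr(slot.raw)        # per-call setup, redone on every call here
    for i in range(len(slot.values), natts):
        slot.values.append(slot.raw[i])
        slot.attrs_walked += 1


def run(steps, ncols=20):
    """Deform one toy tuple in the given steps; return (setup_calls, attrs_walked)."""
    slot = ToySlot(raw=list(range(ncols)))
    for natts in steps:
        getsomeattrs(slot, natts)
    return slot.setup_calls, slot.attrs_walked


one_shot = run([20])         # projection deforms straight to column 20
incremental = run([10, 20])  # qual touches column 10, projection then needs 20
print(one_shot, incremental)
```

In this model both variants walk the same 20 attributes, but the incremental one pays the per-call setup twice. Whether that extra setup is measurable in the real code is exactly what a benchmark case of this shape would have to show.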