Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vl8No-007Ze1-23 for pgsql-hackers@arkaria.postgresql.org; Wed, 28 Jan 2026 16:27:09 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vl8Nm-002Olx-2K for pgsql-hackers@arkaria.postgresql.org; Wed, 28 Jan 2026 16:27:07 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vl8Nl-002Olp-3A for pgsql-hackers@lists.postgresql.org; Wed, 28 Jan 2026 16:27:06 +0000 Received: from fout-b7-smtp.messagingengine.com ([202.12.124.150]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1vl8Nh-00000000u4Y-0ahY for pgsql-hackers@lists.postgresql.org; Wed, 28 Jan 2026 16:27:05 +0000 Received: from phl-compute-08.internal (phl-compute-08.internal [10.202.2.48]) by mailfout.stl.internal (Postfix) with ESMTP id F408E1D00176; Wed, 28 Jan 2026 11:26:51 -0500 (EST) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-08.internal (MEProxy); Wed, 28 Jan 2026 11:26:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1769617611; x=1769704011; bh=R4vQKiUF9d iJnNFLundZsOq1dl+SMb9PJGdb0w1MuKg=; b=ilKMrIzDMfwM70ad4PAfy7Q1pS E+fe1XJEEMYRUPdPnRBzyBF+BYl35vIKDIi31acbuBdkZ3095rtfnYT6f6Dg51+H GJqwxxDeUyJHQ7MQoQQMCt5pgCGf3M8kZdKMYEROtwbmgp2VN6NDja3gqOReG55N MdYRT9c9JpwalPd9oP/u5Q1T03RiG+5IUwRClea5qU9H/Nqc6yKBmxj20szStbz1 rG8mkEh3sgmWTj0btTsPNfhtkDeSE4kWTWszdYjviaFlZ1Deq2KRXDQzFo2QVmTp HdX4tYNPens11Hq8YrH6hBQTUwD/PqD63pGn0mhAv6aEEfBI88I+4pcrp57A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t= 1769617611; x=1769704011; bh=R4vQKiUF9diJnNFLundZsOq1dl+SMb9PJGd b0w1MuKg=; b=oIr7/n0J+X48VBE0I8wm9qhtiQ3G8L8Y70xIpn8Xzujtv0Y7ZEH ZLCEY5PPjWTjdvNlNve9cNf7ba/DG413wJHzypuMoRhp5c0/XkRFbL59Pf7bXszg 5Ua52tW+qgdbNyFzYGk6yVntaJdb/wDVh9DGX0fGn31Qqo+4AhGs0T+u+XOttyUM EHIbizTpO0SXhIWenfe83xecOhMTrAVETDIPO2CMiG0el22pTeWoAuydNzpN99wA DNxeUAC1T1PqAXKOcMI1U+3eqnVgJ67lKWWhv5KNUlPtsZSKcqVX/SnFlA+zFysO Q6/6nlIgL7nX+g5ii4yVV1wYapLOARKfNcw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgdduieefkedtucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehttdfstd dttddvnecuhfhrohhmpeetnhgurhgvshcuhfhrvghunhguuceorghnughrvghssegrnhgr rhgriigvlhdruggvqeenucggtffrrghtthgvrhhnpeeffffgledvffegtdevlefgtdeggf fhvdekgfegteeiveejkeetudelveejhfeugeenucevlhhushhtvghrufhiiigvpedtnecu rfgrrhgrmhepmhgrihhlfhhrohhmpegrnhgurhgvshesrghnrghrrgiivghlrdguvgdpnh gspghrtghpthhtohepfedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepughgrhho fihlvgihmhhlsehgmhgrihhlrdgtohhmpdhrtghpthhtoheplhhirdgvvhgrnhdrtghhrg hosehgmhgrihhlrdgtohhmpdhrtghpthhtohepphhgshhqlhdqhhgrtghkvghrsheslhhi shhtshdrphhoshhtghhrvghsqhhlrdhorhhg X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 28 Jan 2026 11:26:51 -0500 (EST) Date: Wed, 28 Jan 2026 11:26:51 -0500 From: Andres Freund To: David Rowley Cc: Chao Li , PostgreSQL Developers Subject: Re: More speedups for tuple deformation Message-ID: References: <9A17C43D-7A28-4885-8974-555A40C9523E@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2026-01-28 02:34:26 +1300, David Rowley wrote: > On Sat, 24 Jan 2026 at 05:33, Andres Freund wrote: > > I wonder if it's worth writing a C helper to test deformation in a bit more > > targeted way. > > Good idea. I've written a test module called "deform_bench". You can > do: "select deform_bench('tablename'::regclass, '{10,20}');" which > will deform up to attnum=10, then in a 2nd pass deform up to > attnum=20. This is in the 0003 patch. (Requires "ninja > install-test-files"). 0003 is intended for testing, not commit. Nice! I am trying very hard to restrain myself from playing with it right now, because I really need to get some other things done first... > /* > * slot_deform_heap_tuple > * Given a TupleTableSlot, extract data from the slot's physical tuple > @@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void > slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp, > int natts) > { > + CompactAttribute *cattr; > + TupleDesc tupleDesc = slot->tts_tupleDescriptor; > bool hasnulls = HeapTupleHasNulls(tuple); > + HeapTupleHeader tup = tuple->t_data; > + bits8 *bp; /* ptr to null bitmap in tuple */ > int attnum; > + int firstNonCacheOffsetAttr; > + int firstNullAttr; > + Datum *values; > + bool *isnull; > + char *tp; /* ptr to tuple data */ > uint32 off; /* offset in tuple data */ > - bool slow; /* can we use/set attcacheoff? */ > + > + /* Did someone forget to call TupleDescFinalize()? */ > + Assert(tupleDesc->firstNonCachedOffAttr >= 0); > > /* We can only fetch as many attributes as the tuple has. */ > - natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts); > + natts = Min(HeapTupleHeaderGetNatts(tup), natts); > + attnum = slot->tts_nvalid; > + firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts); FWIW, in a few experiments on my cascade lake systems, this branch (well, it ends up as a cmov) ends up causing a surprisingly large performance bottleneck. I don't really see a way around that, but I thought I'd mention it. On the topic of tupleDesc->firstNonCachedOffAttr - shouldn't that be an AttrNumber? Not that it'll make a difference perf or space wise, just for clarity. Hm, I guess natts isn't an AttrNumber either. Not sure why? > + if (hasnulls) > + { > + bp = tup->t_bits; > + firstNullAttr = first_null_attr(bp, natts); > + firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr); > + } > + else > + { > + bp = NULL; > + firstNullAttr = natts; > + } > + > + values = slot->tts_values; > + isnull = slot->tts_isnull; > + tp = (char *) tup + tup->t_hoff; Another stall I see is due to the t_hoff computation - which makes sense, it's in the tuple header and none of the deforming can happen without knowing the address. I think in the !hasnulls case, the only influence on it is MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)), so we could just hardcode that? Separately, sometimes - I haven't figured out when - gcc seems to think it's smart to actually compute the `tp + cattr->attcacheoff` below using tup and tup->t_hoff stored in registers (i.e. doing multiple adds). When the code is generated that way, I see substantially worse performance. Have you seen that? > + else if (attnum == 0) > { > /* Start from the first attribute */ > off = 0; > - slow = false; > } > else > { > /* Restore state from previous execution */ > off = *offp; > - slow = TTS_SLOW(slot); > } Do we actually need both of these branches? Shouldn't *offp be set to 0 in the attnum == 0 case? > - if (!slow) > + for (; attnum < firstNullAttr; attnum++) > { > [...] > + cattr = TupleDescCompactAttr(tupleDesc, attnum); > + > + /* align the offset for this attribute */ > + off = att_pointer_alignby(off, > + cattr->attalignby, > + cattr->attlen, > + tp + off); > + > + values[attnum] = fetchatt(cattr, tp + off); > + isnull[attnum] = false; > + > + /* move the offset beyond this attribute */ > + off = att_addlength_pointer(off, cattr->attlen, tp + off); > } A few thoughts / suggestions: 1) We should not update values[]/isnull[] between fetchatt() and att_addlength_pointer(). The compiler can't figure out that no fields in cattr or *(tp + off) are being affected by those stores. Changing this on master improves performance quite noticeably. I see a 13% improvement in a test with deforming 5 not-null byval columns. 2) I sometime see performance benefits due to moving the isnull[attnum] = false; to the beginning of the loop. Which makes some sense, starting the store earlier allows it to complete earlier, and it doesn't depend on fetching cattr, aligning the pointer, fetching the attribute and adjusting the offset. 3) I briefly experimented with this code, and I think we may be able to optimize the combination of att_pointer_alignby(), fetch_att() and att_addlength_pointer(). They all do quite related work, and for byvalue types, we know at compile time what the alignment requirement for each of the supported attlen is. > + /* > + * Now handle any remaining tuples, this time include NULL checks as we're > + * now at the first NULL attribute. > + */ > + for (; attnum < natts; attnum++) > { > - /* XXX is it worth adding a separate call when hasnulls is false? */ > - attnum = slot_deform_heap_tuple_internal(slot, > - tuple, > - attnum, > - natts, > - true, /* slow */ > - hasnulls, > - &off, > - &slow); > + if (att_isnull(attnum, bp)) > + { > + values[attnum] = (Datum) 0; > + isnull[attnum] = true; > + continue; > + } Have you experimented setting isnull[] in a dedicated loop if there are nulls and then in this loop just checking isnull[attnum]? Seems like that could perhaps be combined with the work in first_null_attr() and be more efficient than doing an att_isnull() separately for each column. Greetings, Andres Freund