Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vyueI-000X8X-1c for pgsql-hackers@arkaria.postgresql.org; Sat, 07 Mar 2026 16:37:06 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vyueF-008y2D-2Y for pgsql-hackers@arkaria.postgresql.org; Sat, 07 Mar 2026 16:37:04 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vyueF-008y25-1X for pgsql-hackers@lists.postgresql.org; Sat, 07 Mar 2026 16:37:04 +0000 Received: from mail-ej1-x636.google.com ([2a00:1450:4864:20::636]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vyueD-00000001MuJ-0PAO for pgsql-hackers@lists.postgresql.org; Sat, 07 Mar 2026 16:37:03 +0000 Received: by mail-ej1-x636.google.com with SMTP id a640c23a62f3a-b9382e59c0eso640963966b.0 for ; Sat, 07 Mar 2026 08:37:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1772901415; cv=none; d=google.com; s=arc-20240605; b=a/IXpSa4ZSb8qNQzTaHSyLQzKLpgH8ntvOFTYf3aLYFJwa0Z3fBMifRVutCT9HttdX nVLf8iksXxn39Ev5HKlANesg/+vHXI2SmcVHlYAuh6BmXChOYjCW8eed9BxP6TnsuH3k wblpslk1TygmYEAvTGjkYjoeV6/T8VWlZ9Ji9mtLCFBA5h5dzI5XcDgQOjBz4MqeW+M3 u5AY7CrZNnMxRDKTg5QLJoInzTS7kbWC8GWAtHIe8CS85qzaC9V5acCpCVA3rgbk6SoR pawxBRwoDKSPnRB4+6H+moHJr3j7VhA+DvOTFKxTJQ2PZJQOjfYNa/i2wI7MShIqnkPW sa3Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=MLj3fXdFzD5VRPDk7MApBOyUj5hvVE2BLyRdmCwJgKA=; fh=EbGs07U2RQscmIVJ5ce7NFCARjXjyQPsieFn97VGe0U=; b=aMTtrBNFiLa+inR+OcpjRQoJUgMCRIuL2t6NNpeXEJn74lDK4E/tHbcmf+OiLy6I4x /0JlnQNs0k+ZXavvSwet7U21tWtjS6pV7NhmtLIyzS6u5D3kCZfEG+HZyL37u/chHvV8 B1Jcca6JsVj6mCDHE5pOyB3tAeijWbMxgrtSFPcqt3vhRrphSMtw2FrSYJFCq19Q3KI5 Pd9kS44w8/oKx/Kl+Aq5Tzt9rdKg7D+M5f6B9KPIG96QzpCs83gUt9VuA0EeUA2y2Wub MaQ0WVpT6wTrbvjzVkyKEDVy1EAtK9rYG1WkM9LViiZJTM6fxz9B+yeRv80c5LxI0cVe RIog==; darn=lists.postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1772901415; x=1773506215; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=MLj3fXdFzD5VRPDk7MApBOyUj5hvVE2BLyRdmCwJgKA=; b=SdBxHx01g898bN0h/StSFgv2NjFphVQTheeBspIi6spRBr9XV/ZlXzMk7G8qMTNTvn muosRG82huqUaK+fj1cx+hifBa/BIEoii1xD8cclwkB4mD6/n/c8CtxOdjppp8EH3vvr 7O5HhnsU9FBLas14lPlYD8haKvl2i8KIK1Qw6Ibx330rnI0qnsO/r4IOLsdB4BoZznvy KK+gErFnb6jASmT5L9ujxFuEmYOLmfTt+vXttyCP173jdUqMrC7Hey9BXtFjgIpnkbps PdEAqNq+xuzPj2UvO0ytAKmqylL8gmPZKB0am+eT5A1GQ9/uoKqMxXCnILI7liGTk+RM hR3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772901415; x=1773506215; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=MLj3fXdFzD5VRPDk7MApBOyUj5hvVE2BLyRdmCwJgKA=; b=S+usONqtejUHjLfYhluV8kHINRo313kRwAi78Ust/mafS6llsFar4Iu0FzP++bluso 4VyQ7tnwyZFF9rgOYspvYxCjZwUERPpwfiINFiDvTlQ2X6czBvdOJWNx3GPKDJMf7NVS gZJisRIZd8eVsr9LNzzLtBpclyfWNa7Rt7BSHegNrx1cQ2Zvh8cDUDbum6xfP3aq4ZB4 UkcCusA3adSGcMJ5kfj5AMtCbe6qfo0lPyLzR7IRSjbn5bLpylsaYs5h5hCRbgCVX60o zX3ZNwzkEhimcI9aI0x03bTeVmErqrExjRfMEqJ6LoSleDaRC59FoPHzgiEFchbDZ1zT iaoA== X-Forwarded-Encrypted: i=1; AJvYcCUqePHgRWW3bz+kyKa6aAHhethTrEMkMC+NuRch4mX7Iad9ri3pdZsFFnV3vXSi0qS7EOd64NnAXshlqwer@lists.postgresql.org X-Gm-Message-State: AOJu0YwoEMoKnY+bYS2pnDwh5fQWxz2OzFDKF+4htA8FnbbaDCK0Ddxy 7CtY05jsJtJTBPP1KYpVDUfh9+12TA+NpWvjMi4uGtV1nGxPP5Xw0PkkcjEUS6a+QIJwB4UKc0M 1lIbfiAjyC3zs7Gi8auRcLRh4ScOJdRs= X-Gm-Gg: ATEYQzyvD3LElWXT1SylQ5ooC4H7GTV/Nj8bYZAXcrulM9wYClwWZQLNxzFWHhUPHYt cPAhzIVmVdwSbwsborgfwHpepUb6brQTW2PpqvsWMSuthz6FAuLOqHNhB4W1eGjxKBAefrlT+/S dCfdVwFQ0oLVBZPVKgkSKUe/0186c/fgV/yphLszR+2kSkEuTZj57V8OR/2mllULD8TSfMZ/V5N ueWoX5X9zkMKBl60l6DaAztgFayEuskBa6W2/8kbAtz0YxbDnOwmtkC0Duld3kk6Sjh1I4zgMhE v7iU/Vl5NgnZeER1pZlWl9GPnJrHFON7L7JS9VSlTMXvkiiLmMZ4DCa4s1HN6wwQn509WyGIFg= = X-Received: by 2002:a17:907:3f1a:b0:b8d:c595:83e with SMTP id a640c23a62f3a-b942dfa4856mr306860266b.50.1772901414656; Sat, 07 Mar 2026 08:36:54 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Junwang Zhao Date: Sun, 8 Mar 2026 00:36:42 +0800 X-Gm-Features: AaiRm52p_NDkwwfEkFCFXWXSCcUfuEMpOModUwS2ZktpeshShQGm9VOi29psaZ4 Message-ID: Subject: Re: More speedups for tuple deformation To: David Rowley Cc: Andres Freund , John Naylor , Chao Li , PostgreSQL Developers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi David, On Fri, Mar 6, 2026 at 12:10=E2=80=AFPM David Rowley = wrote: > > One of my goals for proactively populating > CompactAttribute.attcacheoff is to make it so we're able to support > deforming only a subset of columns. If we only need a small number of > columns from the tuple and all those columns have a known attcacheoff > and no NULLs come prior, then we can quite efficiently just go to > those cached offsets and fetch only the attributes that we need. To > do this, we'll need an extra array to store which attnums we're > interested in, rather than deforming all attrs up to the highest > attnum that we need, as we do today. I expect that looking at this > new array will slow things down a bit when we're accessing either most > or all columns in, say, a SELECT * query. So, IMO, it'd be bad to > *replace* the current deforming code with code which does this. > Instead, I propose we add an additional deform operator and have some > heuristic which decides which one is best to use. I expect > ExecPushExprSetupSteps() could make that choice fairly easily. Perhaps > something cheap like bms_num_members(scan_attrs) is less than half the > bms_prev_member(scan_attrs, -1) (the highest member). > > There's going to be many cases where the attcacheoff isn't known in > the attributes being selected. So that we still get some gains when > that's the case, I've coded it up so that we start walking the tuple > at the last attribute that has an attcacheoff. In many cases, that'll > mean we don't need to walk the entire tuple. Often, leading columns > are fixed-width, so this means that there's likely some benefit to > most cases. There might need to be a bit more education or > documentation about best column ordering practises. > > There are a few hurdles to make this work, and one is the physical > tlist optimization. If the planner replaces the targetlist with a > physical tlist, the executor is going to think we need all columns, > which would have it likely choose not to do the selective deforming. > To make this work, I've added some code in createplan.c to extract the > attnums we need from the qual and tlist before the physical tlist is > installed. That's recorded in a Bitmapset and passed down to the > executor and to the code which sets up the ExprStates. Currently, > mostly to exercise this code as much as possible, I've coded it to > always do the selective deforming when the Bitmapset isn't empty. So > far, I've only done this for Seq Scan, but I expect all the scans that > deform tuples could use this. > > I've attached the code which does all this in the 0006 patch. > Ideally, I'd have had this at least to the current state about 2-3 > months ago, so I don't intend that 0006 is v19 material, but I wanted > to share to show where I intend this work to go. > > Performance: > > Using the t_1_40 table from the deform_test_setup.sh script I sent in > [1], running "select a from t_1_40 where a =3D 0;" ("a" is the 43rd > column in that table), on my Zen2 machine, I get the following from > perf top and pgbench: > > master: > 75.57% postgres [.] tts_buffer_heap_getsomeattrs > 4.70% postgres [.] ExecInterpExpr > 2.85% postgres [.] ExecSeqScanWithQualProject > 1.94% postgres [.] heapgettup_pagemode > 1.21% postgres [.] UnlockBuffer > 1.15% postgres [.] slot_getsomeattrs_int > > $ for i in {1..3}; do pgbench -n -f bench.sql -M prepared -T 10 > postgres | grep latency; done > latency average =3D 154.175 ms > latency average =3D 156.780 ms > latency average =3D 157.599 ms > > 0001-0005: > 64.24% postgres [.] tts_buffer_heap_getsomeattrs > 15.01% postgres [.] ExecInterpExpr > 3.22% postgres [.] ExecSeqScanWithQualProject > 3.01% postgres [.] heapgettup_pagemode > 1.57% postgres [.] ExecStoreBufferHeapTuple > 1.53% postgres [.] heap_prepare_pagescan > > $ for i in {1..3}; do pgbench -n -f bench.sql -M prepared -T 10 > postgres | grep latency; done > latency average =3D 130.981 ms > latency average =3D 134.700 ms > latency average =3D 134.898 ms > > 0001-0006: > 42.28% postgres [.] heapgettup_pagemode > 11.38% postgres [.] ExecInterpExpr > 7.13% postgres [.] ExecSeqScanWithQualProject > 5.92% postgres [.] tts_buffer_heap_selectattrs <-- it's down= here. > 5.69% postgres [.] ExecStoreBufferHeapTuple > 5.11% postgres [.] heap_getnextslot > 3.87% postgres [.] heap_prepare_pagescan > > $ for i in {1..3}; do pgbench -n -f bench.sql -M prepared -T 10 > postgres | grep latency; done > latency average =3D 71.689 ms > latency average =3D 75.638 ms > latency average =3D 75.149 ms > > Keep in mind that this is one of the best cases as t_1_40 has no NULLs > and only has fixed-width columns. The only slightly better case would > be to add more columns and fetch only the final one. 40 doesn't seem > excessively unrealistic, to get an idea of the gains that someone > *could* see. > > You can see that perf top report that tts_buffer_heap_getsomeattrs > dropped from taking 75.57% down to 64.24% with 0001-0005. Adding 0006 > sees that replaced with tts_buffer_heap_selectattrs which takes less > than 6% of the CPU time. It also highlights the next most interesting > thing we should probably make faster, heapgettup_pagemode(). > > I've attached v12 of the patch. There are a few changes in 0001-0005 > that should help make things a bit faster than v11. I've also attached > the new selective deforming code in 0006. There's no JIT support for > 0006 yet, I don't need to be told about that :-) > > I'm planning on starting to go through 0002-0005 in much more detail > from mid next week with my committer hat on. If anyone wants to relook > at any of the 0002-0005 patches, there's still time. I'm also happy to I have some comments on v12-0004. 1. + off +=3D cattr->attlen; + firstNonCachedOffsetAttr =3D i + 1; + } + + tupdesc->firstNonCachedOffsetAttr =3D firstNonCachedOffsetAttr; + tupdesc->firstNonGuaranteedAttr =3D firstNonGuaranteedAttr; +} The firstNonCachedOffsetAttr seems to be the first variable width attribute, but it seems that the offset of this attribute can be cached, for example, in a table defined as (int, text), the offset of firstNonCachedOffsetAttr should be 4, is that correct? If TupleDescFinalize records the offset firstNonCachedOffsetAttr, it might save one iterator of the deforming loop. For example, add something like the following after the above mentioned code. if (firstNonCachedOffsetAttr < tupdesc->natts) { cattr =3D TupleDescCompactAttr(tupdesc, firstNonCachedOffsetAttr); cattr->attcacheoff =3D off; } 2. in slot_deform_heap_tuple, there are multiple statements setting firstNonCacheOffsetAttr, + firstNonCacheOffsetAttr =3D tupleDesc->firstNonCachedOffsetAttr; + /* We can only use any cached offsets until the first NULL attr */ + firstNonCacheOffsetAttr =3D Min(firstNonCacheOffsetAttr, + firstNullAttr); + /* We can only fetch as many attributes as the tuple has. */ + firstNonCacheOffsetAttr =3D Min(firstNonCacheOffsetAttr, natts); Based on the logic, it seems the second one could be moved to the third position, and the third one could then be safely removed? > receive feedback on 0006, but I will address concerns with that at a > lower priority. One thing that's still left todo in the 0004 patch is > enable the TTS_FLAG_OBEYS_NOT_NULL_CONSTRAINTS optimisation for a few > other scan types. > > Thanks for reading > > David > > [1] https://postgr.es/m/CAApHDvo1i-ycAcWnK3L7ZASTuM8mW46kvRqMaUHD46HSuJmx= 7A@mail.gmail.com --=20 Regards Junwang Zhao