Re: More speedups for tuple deformation

From: David Rowley @ 2026-01-18 22:13 UTC
  To: PostgreSQL Developers <[email protected]>

On Fri, 2 Jan 2026 at 18:58, David Rowley <[email protected]> wrote:
> Please find attached an updated set of patches. A rebase was needed,
> plus 0003 had a problem with an Assert not handling the bitmap being a
> NULL pointer.

Attached is another rebase, which also adds TupleDescFinalize() calls
that were missing at some newly added call sites.

I've also attached another round of benchmarks after dipping into some
Azure machines to cover my lack of any Intel benchmark results. I
think these are somewhat noisy as I opted for low core-count instances
which will have L3 shared with workloads running for other people.
This is most evident in Xeon_E5-2673 with gcc where the patched run
was nearly twice as fast as unpatched for test 2 on 20 extra columns.
If you look at the raw results from that, you can see the times are
quite unstable between the 3 runs of each test, which makes me believe
that the machine was busy with other work when that test ran on
master. The AMD3990x and M2 machines are all sitting next to me and
were otherwise idle, so they should be much more stable.

Quite a few machines have a small regression for the 0 extra column
tests. There is a small amount of extra work being done in the
deforming function to check if the attnum < the first attribute
without an attcacheoff. This mostly only affects the tests that don't
do any deforming with a cached attcacheoff, e.g. due to NULLs or
varlena types. The only way I've thought of to possibly reduce that
is to invent a new TupleTableSlotOps and pick the one that applies
when creating the TupleTableSlot. This doesn't appeal to me very much
as it requires modifying many callsites. But I do wonder if we should
try to come up with something here as technically we could use this to
eliminate alignment padding out of some MinimalTuples in some cases
where these were not directly derived from pre-formed HeapTuples. That
could allow a more compact tuple representation for sorting and
hashing, allowing us to do more with less memory in some cases.
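
As an illustration of the idea discussed above (not code from the patch),
here is a minimal sketch of precalculating offsets for the leading
fixed-width columns once, and of the per-tuple check that decides how many
attributes can be deformed from those cached offsets. The names DemoAttr,
demo_align, demo_finalize and demo_cached_prefix are hypothetical, and the
rule "stop at the first non-fixed-width attribute" is an assumption; the
real TupleDescFinalize() may differ.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct DemoAttr
{
    int attlen;       /* > 0 for fixed width, -1 for varlena */
    int attalignby;   /* required alignment in bytes, a power of two */
    int attcacheoff;  /* precalculated offset, or -1 when unknown */
} DemoAttr;

/* Round off up to the next multiple of alignby. */
static int
demo_align(int off, int alignby)
{
    assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
    return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Set attcacheoff for the leading run of fixed-width attributes and
 * return the index of the first attribute left without a cached offset
 * (natts when every attribute got one).
 */
static int
demo_finalize(DemoAttr *attrs, int natts)
{
    int off = 0;
    int i;

    for (i = 0; i < natts; i++)
        attrs[i].attcacheoff = -1;

    for (i = 0; i < natts; i++)
    {
        if (attrs[i].attlen <= 0)
            break;              /* offsets past here depend on the data */
        off = demo_align(off, attrs[i].attalignby);
        attrs[i].attcacheoff = off;
        off += attrs[i].attlen;
    }
    return i;
}

/*
 * Per tuple: how many leading attributes can use the cached offsets.
 * A NULL stores no data, so cached offsets are only valid before it.
 */
static int
demo_cached_prefix(int first_noncached, int first_null, int natts)
{
    int n = first_noncached < natts ? first_noncached : natts;

    return n < first_null ? n : first_null;
}
```

With columns (int4, int8, int2, text, int4) this caches offsets 0, 8 and 16
and reports attribute 3 as the first one without a cached offset. The
comparison in demo_cached_prefix() corresponds to the small extra per-tuple
cost mentioned above, which is paid even when the result is 0.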

The benchmark results also indicated that there wasn't much advantage
to the 0002+0003 patches, so I've removed those from the set. That
reduces some complexity around the benchmarks. I did still keep the
OPTIMIZE_BYVAL loop as separate results. It's not quite clear what's
best there as machines seem to vary on which they prefer.

The benchmark results are attached in the bz2 file, both as a
spreadsheet and as a pg_dump of the raw results.

David
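
The patch below calls first_null_attr(bp, natts) in several places, but its
definition is not part of the excerpt quoted here. The following is only a
guess at the kind of scan such a helper might perform, written as a
standalone sketch: demo_att_isnull and demo_first_null_attr are
hypothetical names, and the only thing taken from PostgreSQL is the bitmap
convention that a clear bit marks a NULL attribute.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when 0-based attribute att is NULL; a clear bit means NULL. */
static int
demo_att_isnull(int att, const uint8_t *bits)
{
    return !(bits[att >> 3] & (1 << (att & 0x07)));
}

/*
 * Return the index of the first NULL attribute among the first natts
 * attributes, or natts when none of them is NULL.
 */
static int
demo_first_null_attr(const uint8_t *bits, int natts)
{
    int nbytes = natts >> 3;    /* bytes fully covered by natts */
    int i;

    assert(natts >= 0);

    /* Skip whole bytes in which every attribute is non-NULL. */
    for (i = 0; i < nbytes; i++)
    {
        if (bits[i] != 0xFF)
            break;
    }

    /* Finish bit by bit, starting at the first byte that may hold a NULL. */
    for (i <<= 3; i < natts; i++)
    {
        if (demo_att_isnull(i, bits))
            return i;
    }

    return natts;
}
```

Returning natts when no NULL is found matches how the patch uses the
result as a loop bound: attributes before it can be deformed without
per-attribute NULL checks, and only the remaining ones need att_isnull().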

From 3676468860a43bcd7a288f218d4bf32efef9ac29 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v3] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines which
shrinks down the code a little, which can make it run more quickly.
This also makes a dedicated deformer loop to deform the portion of the
tuple which has a known offset, which makes deforming much faster when
a leading set of the table's columns are non-NULL values and fixed-width
types.
---
 contrib/dblink/dblink.c                       |   3 +
 contrib/pg_buffercache/pg_buffercache_pages.c |   2 +
 contrib/pg_visibility/pg_visibility.c         |   2 +
 src/backend/access/brin/brin_tuple.c          |   1 +
 src/backend/access/common/heaptuple.c         | 317 ++++++----------
 src/backend/access/common/indextuple.c        | 355 +++++++-----------
 src/backend/access/common/tupdesc.c           |  62 +++
 src/backend/access/gin/ginutil.c              |   1 +
 src/backend/access/gist/gistscan.c            |   1 +
 src/backend/access/spgist/spgutils.c          |   4 +-
 src/backend/access/transam/twophase.c         |   1 +
 src/backend/access/transam/xlogfuncs.c        |   1 +
 src/backend/backup/basebackup_copy.c          |   3 +
 src/backend/catalog/index.c                   |   2 +
 src/backend/catalog/pg_publication.c          |   1 +
 src/backend/catalog/toasting.c                |   6 +
 src/backend/commands/explain.c                |   1 +
 src/backend/commands/functioncmds.c           |   1 +
 src/backend/commands/sequence.c               |   1 +
 src/backend/commands/tablecmds.c              |   4 +
 src/backend/commands/wait.c                   |   1 +
 src/backend/executor/execSRF.c                |   2 +
 src/backend/executor/execTuples.c             | 302 +++++++--------
 src/backend/executor/nodeFunctionscan.c       |   2 +
 src/backend/jit/llvm/llvmjit_deform.c         |   4 -
 src/backend/parser/parse_relation.c           |   4 +-
 src/backend/parser/parse_target.c             |   2 +
 .../libpqwalreceiver/libpqwalreceiver.c       |   1 +
 src/backend/replication/walsender.c           |   5 +
 src/backend/utils/adt/acl.c                   |   1 +
 src/backend/utils/adt/genfile.c               |   1 +
 src/backend/utils/adt/lockfuncs.c             |   1 +
 src/backend/utils/adt/orderedsetaggs.c        |   1 +
 src/backend/utils/adt/pgstatfuncs.c           |   5 +
 src/backend/utils/adt/tsvector_op.c           |   1 +
 src/backend/utils/cache/relcache.c            |  20 +-
 src/backend/utils/fmgr/funcapi.c              |   6 +
 src/backend/utils/misc/guc_funcs.c            |   5 +
 src/include/access/htup_details.h             |  19 +-
 src/include/access/itup.h                     |  20 +-
 src/include/access/tupdesc.h                  |  14 +
 src/include/access/tupmacs.h                  |  57 +++
 src/include/executor/tuptable.h               |   9 +-
 src/pl/plpgsql/src/pl_comp.c                  |   2 +
 .../test_custom_fixed_stats.c                 |   1 +
 .../modules/test_predtest/test_predtest.c     |   1 +
 46 files changed, 623 insertions(+), 633 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..ed0267d2183 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1045,6 +1046,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
+			TupleDescFinalize(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
 			tupstore = tuplestore_begin_heap(true, false, work_mem);
@@ -1534,6 +1536,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
 		if ((results != NULL) && (indnkeyatts > 0))
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..262d5d1bc69 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,101 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
 	 */
-
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr >= 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1354,7 +1249,8 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
@@ -1364,60 +1260,77 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
-	tp = (char *) tup + tup->t_hoff;
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
+	tp = (char *) tup + tup->t_hoff;
 	off = 0;
 
-	for (attnum = 0; attnum < natts; attnum++)
+	for (attnum = 0; attnum < cacheoffattrs; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		Assert(cattr->attcacheoff >= 0);
+
+		values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+								   cattr->attlen);
+		isnull[attnum] = false;
+		off = cattr->attcacheoff + cattr->attlen;
+	}
 
-		if (hasnulls && att_isnull(attnum, bp))
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
+		else
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
+			/* not varlena, so safe to use att_nominal_alignby */
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
+
+		if (att_isnull(attnum, bp))
 		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
 		else
 		{
 			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = off;
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
-
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..647248ded2c 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,126 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	attnum--;
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
 	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-	attnum--;
-
-	if (IndexTupleHasNulls(tup))
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
 	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
-
-		/* XXX "knows" t_bits are just after fixed tuple header! */
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr >= 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			Assert(hasnulls);
 
-			off = att_nominal_alignby(off, att->attalignby);
+			if (att_isnull(i, bp))
+				continue;
 
-			att->attcacheoff = off;
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -481,62 +390,76 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							char *tp, bits8 *bp, int hasnulls)
 {
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (attnum < cacheoffattrs)
+	{
+		CompactAttribute *cattr;
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..727475b6fb0 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -463,6 +474,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -495,6 +509,52 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute()
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = -1;
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr = tupdesc->natts;
+#endif
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+#ifdef OPTIMIZE_BYVAL
+		if (!cattr->attbyval)
+			firstByRefAttr = Min(firstByRefAttr, i);
+#endif
+
+		/*
+		 * We can't cache the offset for the first varlena attr as the
+		 * alignment for those depends on 1 vs 4 byte headers, however we
+		 * possibly could cache the first attlen == -2 attr.  Worthwhile?
+		 */
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+#ifdef OPTIMIZE_BYVAL
+	tupdesc->firstByRefAttr = firstByRefAttr;
+#endif
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
@@ -1082,6 +1142,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
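To make the TupleDescFinalize() offset precomputation above easier to follow outside the tree, here is a minimal standalone sketch. ToyAttr, toy_align() and toy_finalize() are hypothetical names, not part of the patch. The sketch assumes power-of-two alignments, and unlike the patch it returns 0 rather than -1 when the very first attribute is not fixed width, and it resets the remaining offsets to -1 itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CompactAttribute; only the fields used here. */
typedef struct ToyAttr
{
	int			attlen;			/* > 0 fixed width, -1 varlena, -2 cstring */
	int			attalignby;		/* alignment in bytes, assumed power of two */
	int			attcacheoff;	/* cached byte offset, or -1 if unknown */
} ToyAttr;

/* Round "off" up to the next multiple of "alignby" (a power of two). */
static int
toy_align(int off, int alignby)
{
	assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Cache offsets for the leading run of fixed-width attributes.  Returns the
 * index of the first attribute that has no cached offset (natts if every
 * attribute is fixed width).  This is a toy: the patch itself starts its
 * counter at -1 and does not reset the trailing offsets here.
 */
static int
toy_finalize(ToyAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		/* Stop at the first variable-width attribute. */
		if (attrs[i].attlen <= 0)
			break;

		off = toy_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	/* Everything from here on has no fixed offset. */
	for (int j = i; j < natts; j++)
		attrs[j].attcacheoff = -1;

	return i;
}
```

For a layout like (int4, bool, int8, text) this caches offsets for the first three attributes and stops at the varlena one, which is the boundary the deforming code later clamps with Min() against natts and the first NULL attribute.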
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,11 +335,9 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 97f1e778488..9e55d9bfb80 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -341,5 +341,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..d47a75f3ae0 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,164 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr;
+#endif
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+#ifdef OPTIMIZE_BYVAL
+	firstByRefAttr = Min(firstNonCacheOffsetAttr, tupleDesc->firstByRefAttr);
+#endif
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
+
+#ifdef OPTIMIZE_BYVAL
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Many tuples have leading byval attributes, so try to process as many of
+	 * those as possible with a special loop that can't handle byref types.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstByRefAttr)
+	{
+		/* Use do/while as we already know we need to loop at least once. */
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			/*
+			 * Hard code byval == true to allow the compiler to remove the
+			 * byval check when inlining fetch_att().
+			 */
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, true, cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < firstByRefAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+#endif
+
+	/*
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
+	 */
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks
+	 * as we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1175,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2173,6 +2143,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2179,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
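As a rough illustration of how the rewritten slot_deform_heap_tuple() above splits its work, the sketch below counts how many attributes each of the three loops would handle when deforming starts at attnum 0. All toy_* names and ToyPhases are hypothetical. toy_first_null_attr() is a naive stand-in for the patch's first_null_attr(), which is assumed here to return natts when no NULL is found. The bitmap convention (a clear bit means NULL) matches att_isnull().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same convention as att_isnull(): a clear bit means the attribute is NULL. */
static bool
toy_att_isnull(int attnum, const uint8_t *bits)
{
	return !(bits[attnum >> 3] & (1 << (attnum & 0x07)));
}

/* Naive stand-in for first_null_attr(); returns natts if nothing is NULL. */
static int
toy_first_null_attr(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (toy_att_isnull(i, bits))
			return i;
	}
	return natts;
}

/* Number of attributes handled by each of the three loops. */
typedef struct ToyPhases
{
	int			cached;			/* fetched straight from attcacheoff */
	int			walked;			/* offset walked, no NULL checks needed */
	int			checked;		/* offset walked with per-attribute NULL check */
} ToyPhases;

/*
 * "bits" is the NULL bitmap, or NULL when the tuple has no NULLs.  "ncached"
 * plays the role of TupleDesc's firstNonCachedOffAttr.
 */
static ToyPhases
toy_split(const uint8_t *bits, int natts, int ncached)
{
	ToyPhases	p;
	int			firstnull = bits ? toy_first_null_attr(bits, natts) : natts;
	int			cacheend = ncached < natts ? ncached : natts;

	/* Cached offsets are only valid before the first NULL. */
	if (cacheend > firstnull)
		cacheend = firstnull;
	if (cacheend < 0)
		cacheend = 0;

	p.cached = cacheend;
	p.walked = firstnull - cacheend;
	p.checked = natts - firstnull;
	return p;
}
```

With five attributes, two cached offsets and the first NULL at attribute 3, the split is 2 / 1 / 2. With no NULL bitmap at all the last loop does nothing, which is the case the patch avoids paying NULL checks for.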
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..3674a908e75 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -747,14 +747,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..4b2d60fe3c2 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1074,6 +1074,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
+	TupleDescFinalize(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
 	if (PQntuples(pgres) == 0)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -729,6 +721,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1983,8 +1977,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3681,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4441,8 +4436,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6262,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b566a8ef600 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -939,6 +942,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
+
 		funcctx->attinmeta = attinmeta;
 
 		/* collect the variables, in sorted order */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..129176f64ae 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * TupleDescFinalize() must be called on a TupleDesc once it has been
+ * populated, and before it is used for any purpose.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,12 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of first att without an
+										 * attcacheoff */
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr; /* index of the first attr with !attbyval, or
+								 * natts if none. */
+#endif
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -205,6 +217,8 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..8c70dad6c62 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,62 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum * 8;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/* Create a mask with all bits beyond natts's bit set to off */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte * 8;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0
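
For readers following the tupmacs.h hunk above, here is a minimal standalone sketch of the first_null_attr() idea: scan the NULL bitmap a byte at a time, where a clear bit means the attribute is NULL, and locate the lowest clear bit in the first byte that is not 0xFF. It is not the patch's code. The names first_null_attr_sketch(), first_null_attr_slow() and rightmost_one_pos() are hypothetical; the last one stands in for the pg_rightmost_one_pos[] lookup table so that the example builds without PostgreSQL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's pg_rightmost_one_pos[] table. */
static int
rightmost_one_pos(uint8_t byte)
{
	int			pos = 0;

	assert(byte != 0);
	while ((byte & 1) == 0)
	{
		byte >>= 1;
		pos++;
	}
	return pos;
}

/* Slow reference version: test one bit at a time (clear bit == NULL). */
static int
first_null_attr_slow(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 7))))
			return i;
	}
	return natts;
}

/* Byte-at-a-time version; returns natts when no NULL is found. */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	int			rem = natts & 7;

	/* Whole bytes: any byte other than 0xFF contains at least one NULL. */
	for (int bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			return bytenum * 8 + rightmost_one_pos((uint8_t) ~bits[bytenum]);
	}

	/* Partial trailing byte: ignore the bits at and beyond natts. */
	if (rem != 0)
	{
		uint8_t		mask = (uint8_t) ((1 << rem) - 1);
		uint8_t		byte = (uint8_t) (~bits[lastByte] & mask);

		if (byte != 0)
			return lastByte * 8 + rightmost_one_pos(byte);
	}

	return natts;
}
```

With the bitmap bytes {0xFF, 0x0B}, attributes 0 to 9 are non-NULL and attribute 10 is NULL, so a call with natts = 12 finds attribute 10, while a call with natts = 10 reports no NULLs by returning 10. In this sketch the trailing byte is only read when natts is not a multiple of 8; that is a choice made for the standalone version.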



Attachments:

  [text/plain] v3-0001-Precalculate-CompactAttribute-s-attcacheoff.patch (73.2K, 2-v3-0001-Precalculate-CompactAttribute-s-attcacheoff.patch)
From 3676468860a43bcd7a288f218d4bf32efef9ac29 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v3] Precalculate CompactAttribute's attcacheoff

Precalculating attcacheoff allows code to be removed from the tuple
deform routines, which shrinks them a little and can make them run more
quickly.  This also adds a dedicated loop to deform the portion of the
tuple which has a known offset, which makes deforming much faster when
a leading set of the table's columns are non-NULL, fixed-width types.
---
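
To illustrate what precalculating attcacheoff could look like, here is a rough standalone sketch. It is an assumption, not the patch's TupleDescFinalize() (that function lives in tupdesc.c and may differ, for example in how it treats the first variable-width column). MiniAttr, align_up() and mini_finalize() are hypothetical names. The sketch caches an offset for each leading fixed-width attribute, stops conservatively at the first variable-width one, and returns the index of the first attribute without a cached offset, which is the role the tupdesc.h comment gives to firstNonCachedOffAttr.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for CompactAttribute. */
typedef struct MiniAttr
{
	int16_t		attlen;			/* > 0 fixed width; <= 0 variable width */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* cached offset, or -1 if unknown */
} MiniAttr;

/* Round off up to the next multiple of alignby. */
static int32_t
align_up(int32_t off, uint8_t alignby)
{
	return (off + (alignby - 1)) & ~((int32_t) alignby - 1);
}

/*
 * Cache offsets for the leading fixed-width attributes and mark the rest
 * as unknown.  Returns the index of the first attribute without a cached
 * offset, or natts if every attribute has one.
 */
static int
mini_finalize(MiniAttr *attrs, int natts)
{
	int32_t		off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		if (attrs[i].attlen <= 0)
			break;				/* offsets after this depend on the data */

		off = align_up(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	for (int j = i; j < natts; j++)
		attrs[j].attcacheoff = -1;

	return i;
}
```

For a descriptor laid out as int4, int8, int2, varlena, int4, this sketch caches offsets 0, 8 and 16 for the first three columns and returns 3. A deform loop can then read those three columns straight from their cached offsets whenever none of them is NULL.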
 contrib/dblink/dblink.c                       |   3 +
 contrib/pg_buffercache/pg_buffercache_pages.c |   2 +
 contrib/pg_visibility/pg_visibility.c         |   2 +
 src/backend/access/brin/brin_tuple.c          |   1 +
 src/backend/access/common/heaptuple.c         | 317 ++++++----------
 src/backend/access/common/indextuple.c        | 355 +++++++-----------
 src/backend/access/common/tupdesc.c           |  62 +++
 src/backend/access/gin/ginutil.c              |   1 +
 src/backend/access/gist/gistscan.c            |   1 +
 src/backend/access/spgist/spgutils.c          |   4 +-
 src/backend/access/transam/twophase.c         |   1 +
 src/backend/access/transam/xlogfuncs.c        |   1 +
 src/backend/backup/basebackup_copy.c          |   3 +
 src/backend/catalog/index.c                   |   2 +
 src/backend/catalog/pg_publication.c          |   1 +
 src/backend/catalog/toasting.c                |   6 +
 src/backend/commands/explain.c                |   1 +
 src/backend/commands/functioncmds.c           |   1 +
 src/backend/commands/sequence.c               |   1 +
 src/backend/commands/tablecmds.c              |   4 +
 src/backend/commands/wait.c                   |   1 +
 src/backend/executor/execSRF.c                |   2 +
 src/backend/executor/execTuples.c             | 302 +++++++--------
 src/backend/executor/nodeFunctionscan.c       |   2 +
 src/backend/jit/llvm/llvmjit_deform.c         |   4 -
 src/backend/parser/parse_relation.c           |   4 +-
 src/backend/parser/parse_target.c             |   2 +
 .../libpqwalreceiver/libpqwalreceiver.c       |   1 +
 src/backend/replication/walsender.c           |   5 +
 src/backend/utils/adt/acl.c                   |   1 +
 src/backend/utils/adt/genfile.c               |   1 +
 src/backend/utils/adt/lockfuncs.c             |   1 +
 src/backend/utils/adt/orderedsetaggs.c        |   1 +
 src/backend/utils/adt/pgstatfuncs.c           |   5 +
 src/backend/utils/adt/tsvector_op.c           |   1 +
 src/backend/utils/cache/relcache.c            |  20 +-
 src/backend/utils/fmgr/funcapi.c              |   6 +
 src/backend/utils/misc/guc_funcs.c            |   5 +
 src/include/access/htup_details.h             |  19 +-
 src/include/access/itup.h                     |  20 +-
 src/include/access/tupdesc.h                  |  14 +
 src/include/access/tupmacs.h                  |  57 +++
 src/include/executor/tuptable.h               |   9 +-
 src/pl/plpgsql/src/pl_comp.c                  |   2 +
 .../test_custom_fixed_stats.c                 |   1 +
 .../modules/test_predtest/test_predtest.c     |   1 +
 46 files changed, 623 insertions(+), 633 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..ed0267d2183 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1045,6 +1046,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
+			TupleDescFinalize(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
 			tupstore = tuplestore_begin_heap(true, false, work_mem);
@@ -1534,6 +1536,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
 		if ((results != NULL) && (indnkeyatts > 0))
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..262d5d1bc69 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,101 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable-width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
 	 */
-
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr >= 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1354,7 +1249,8 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
@@ -1364,60 +1260,77 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
-	tp = (char *) tup + tup->t_hoff;
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
+	tp = (char *) tup + tup->t_hoff;
 	off = 0;
 
-	for (attnum = 0; attnum < natts; attnum++)
+	for (attnum = 0; attnum < cacheoffattrs; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		Assert(cattr->attcacheoff >= 0);
+
+		values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+								   cattr->attlen);
+		isnull[attnum] = false;
+		off = cattr->attcacheoff + cattr->attlen;
+	}
 
-		if (hasnulls && att_isnull(attnum, bp))
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
+		else
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
+			/* not varlena, so safe to use att_nominal_alignby */
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
+
+		if (att_isnull(attnum, bp))
 		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
 		else
 		{
 			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = off;
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
-
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..647248ded2c 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,126 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	attnum--;
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
 	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-	attnum--;
-
-	if (IndexTupleHasNulls(tup))
+	/*
+	 * Find the first NULL column, or, if there's none, set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
 	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
-
-		/* XXX "knows" t_bits are just after fixed tuple header! */
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr >= 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			Assert(hasnulls);
 
-			off = att_nominal_alignby(off, att->attalignby);
+			if (att_isnull(i, bp))
+				continue;
 
-			att->attcacheoff = off;
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -481,62 +390,76 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							char *tp, bits8 *bp, int hasnulls)
 {
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (attnum < cacheoffattrs)
+	{
+		CompactAttribute *cattr;
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..727475b6fb0 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -463,6 +474,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -495,6 +509,52 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attributes arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute()
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = -1;
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr = tupdesc->natts;
+#endif
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+#ifdef OPTIMIZE_BYVAL
+		if (!cattr->attbyval)
+			firstByRefAttr = Min(firstByRefAttr, i);
+#endif
+
+		/*
+		 * We can't cache the offset for the first varlena attr as the
+		 * alignment for those depends on 1 vs 4 byte headers, however we
+		 * possibly could cache the first attlen == -2 attr.  Worthwhile?
+		 */
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+#ifdef OPTIMIZE_BYVAL
+	tupdesc->firstByRefAttr = firstByRefAttr;
+#endif
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
@@ -1082,6 +1142,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,11 +335,9 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 97f1e778488..9e55d9bfb80 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -341,5 +341,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..d47a75f3ae0 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,164 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr;
+#endif
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+#ifdef OPTIMIZE_BYVAL
+	firstByRefAttr = Min(firstNonCacheOffsetAttr, tupleDesc->firstByRefAttr);
+#endif
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
+
+#ifdef OPTIMIZE_BYVAL
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Many tuples have leading byval attributes, try and process as many of
+	 * those as possible with a special loop that can't handle byref types.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstByRefAttr)
+	{
+		/* Use do/while as we already know we need to loop at least once. */
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			/*
+			 * Hard code byval == true to allow the compiler to remove the
+			 * byval check when inlining fetch_att().
+			 */
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, true, cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < firstByRefAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+#endif
+
+	/*
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
+	 */
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks
+	 * as we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1175,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2173,6 +2143,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2179,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..3674a908e75 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -747,14 +747,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..4b2d60fe3c2 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1074,6 +1074,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
+	TupleDescFinalize(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
 	if (PQntuples(pgres) == 0)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -729,6 +721,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1983,8 +1977,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3681,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4441,8 +4436,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6262,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b566a8ef600 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -939,6 +942,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
+
 		funcctx->attinmeta = attinmeta;
 
 		/* collect the variables, in sorted order */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..129176f64ae 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,12 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of first att without an
+										 * attcacheoff */
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr; /* index of the first attr with !attbyval, or
+								 * natts if none. */
+#endif
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -205,6 +217,8 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..8c70dad6c62 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,62 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum * 8;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/* Create a mask with all bits beyond natts's bit set to off */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte * 8;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



  [application/x-compressed] deform_results2.tar.bz2 (416.1K, 3-deform_results2.tar.bz2)


* Re: More speedups for tuple deformation
@ 2026-01-19 05:47  Chao Li <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: Chao Li @ 2026-01-19 05:47 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: PostgreSQL Developers <[email protected]>



> On Jan 19, 2026, at 06:13, David Rowley <[email protected]> wrote:
> 
> On Fri, 2 Jan 2026 at 18:58, David Rowley <[email protected]> wrote:
>> Please find attached an updated set of patches. A rebase was needed,
>> plus 0003 had a problem with an Assert not handling the bitmap being a
>> NULL pointer.
> 
> Another rebase and updates to some newly created missing calls to
> TupleDescFinalize().
> 
> I've also attached another round of benchmarks after dipping into some
> Azure machines to cover my lack of any Intel benchmark results. I
> think these are somewhat noisy as I opted for low core-count instances
> which will have L3 shared with workloads running for other people.
> This is most evident in Xeon_E5-2673 with gcc where the patched run
> was nearly twice as fast as unpatched for test 2 on 20 extra columns.
> If you look at the raw results from that, you can see the times are
> quite unstable between the 3 runs of each test, which makes me believe
> that the machine was busy with other work when that test ran on
> master. The AMD3990x and M2 machines are all sitting next to me and
> were otherwise idle, so they should be much more stable.
> 
> Quite a few machines have a small regression for the 0 extra column
> tests. There is a small amount of extra work being done in the
> deforming function to check if the attnum < the first attribute
> without an attcacheoff. This mostly only affects the tests that don't
> do any deforming with a cached attcacheoff, e.g due to NULLs or
> varlena types. The only way I've thought about to possibly reduce that
> is to invent a new TupleTableSlotOps and pick the one that applies
> when creating the TupleTableSlot. This doesn't appeal to me very much
> as it requires modifying many callsites. But I do wonder if we should
> try to come up with something here as technically we could use this to
> eliminate alignment padding out of some MinimalTuples in some cases
> where these were not directly derived from pre-formed HeapTuples. That
> could allow a more compact tuple representation for sorting and
> hashing, allowing us to do more with less memory in some cases.
> 
> The benchmark results also indicated that there wasn't much advantage
> to the 0002+0003 patches, so I've removed those from the set. That
> reduces some complexity around the benchmarks. I did still keep the
> OPTIMIZE_BYVAL loop as separate results. It's not quite clear what's
> best there as machines seem to vary on which they prefer.
> 
> Benchmark results attached in the bz2 file both in spreadsheet form
> and the raw results pg_dumped.
> 
> David
> <v3-0001-Precalculate-CompactAttribute-s-attcacheoff.patch><deform_results2.tar.bz2>

Hi David,

I reviewed the patch and traced some basic workflows. I haven’t yet run a load test to compare performance with and without the patch; I will do that later if I get some bandwidth. Here are some review comments:

1 - tupmacs.h
```
+	/* Create a mask with all bits beyond natts's bit set to off */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
```

When I read the code, I got the impression that bits[lastByte] might overflow when natts % 8 == 0, so I traced the code. I then realized that this function is only called when a row has null values, so by the time we reach this point natts % 8 != 0; otherwise the function would have returned earlier from within the for loop.

So, to spare future readers the same confusion, can we add a brief comment explaining that no overflow can happen here?

2 - After this patch, nocachegetattr() and nocache_index_getattr() strictly rely on tupleDesc->firstNonCachedOffAttr to work:
```
	if (tupleDesc->firstNonCachedOffAttr >= 0)
	{
		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
	}
	else
	{
		startAttr = 0;
		off = 0;
	}
```

And tupleDesc->firstNonCachedOffAttr is only set by TupleDescFinalize(). Suppose some code fails to call TupleDescFinalize(). Looking at how a TupleDesc is created, for example in CreateTemplateTupleDesc():
```
	desc = (TupleDesc) palloc(offsetof(struct TupleDescData, compact_attrs) +
							  natts * sizeof(CompactAttribute) +
							  natts * sizeof(FormData_pg_attribute));

	/*
	 * Initialize other fields of the tupdesc.
	 */
	desc->natts = natts;
	desc->constr = NULL;
	desc->tdtypeid = RECORDOID;
	desc->tdtypmod = -1;
	desc->tdrefcount = -1;		/* assume not reference-counted */

	return desc;
```

It’s palloc, not palloc0, so desc->firstNonCachedOffAttr will initially hold a random value. Any code path that misses the TupleDescFinalize() call is then a bug.

From this perspective, I think we can set firstNonCachedOffAttr to -2 in CreateTemplateTupleDesc() as well as in the other functions that create a TupleDesc. Then, in nocachegetattr() and nocache_index_getattr(), we can Assert(desc->firstNonCachedOffAttr > -2).

3
```
+		firstNonCachedOffAttr = i + 1;
```

In TupleDescFinalize(), given firstNonCachedOffAttr = i + 1, firstNonCachedOffAttr will never be 0.

But in nocachegetattr(), it checks firstNonCachedOffAttr >= 0:
```
	if (tupleDesc->firstNonCachedOffAttr >= 0)
	{
		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
	}
```

This is somewhat inconsistent and could confuse readers of the code.

Going by the variable name “firstNonCachedOffAttr”, when no attribute has a cached offset it would be more natural for firstNonCachedOffAttr to be 0 rather than -1. From this perspective, TupleDescFinalize() can initialize desc->firstNonCachedOffAttr to 0. And for my comment 2, we can use -1 instead of -2, so that -1 indicates TupleDescFinalize() has not been called, 0 means no cached attributes, and >0 means some cached attributes.
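
The three-state convention proposed in comments 2 and 3 can be illustrated with a toy, standalone sketch. This is not the patch's code: ToyTupleDesc, toy_create(), toy_finalize(), toy_cached_prefix() and TOY_NOT_FINALIZED are hypothetical names, malloc() stands in for palloc(), and alignment is ignored entirely. The only assumption carried over from the patch is that cached offsets stop at the first attribute that is not fixed width.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_ATTS		8
#define TOY_NOT_FINALIZED	(-1)	/* hypothetical "not yet finalized" marker */

/* Toy stand-in for a tuple descriptor; not PostgreSQL's TupleDescData. */
typedef struct ToyTupleDesc
{
	int			natts;
	int			firstNonCachedOffAttr;	/* -1 unset, 0 none cached, >0 some */
	int			attlen[TOY_MAX_ATTS];	/* > 0 fixed width, -1 variable */
	int			attcacheoff[TOY_MAX_ATTS];
} ToyTupleDesc;

/* Allocate without zeroing (like palloc), so set the marker explicitly. */
static ToyTupleDesc *
toy_create(int natts, const int *attlens)
{
	ToyTupleDesc *desc;

	assert(natts >= 0 && natts <= TOY_MAX_ATTS);
	desc = malloc(sizeof(ToyTupleDesc));
	assert(desc != NULL);

	desc->natts = natts;
	desc->firstNonCachedOffAttr = TOY_NOT_FINALIZED;
	for (int i = 0; i < natts; i++)
	{
		desc->attlen[i] = attlens[i];
		desc->attcacheoff[i] = -1;
	}
	return desc;
}

/* Cache offsets for the leading run of fixed-width attributes. */
static void
toy_finalize(ToyTupleDesc *desc)
{
	int			off = 0;

	desc->firstNonCachedOffAttr = 0;	/* 0 means "nothing cached" */
	for (int i = 0; i < desc->natts; i++)
	{
		if (desc->attlen[i] <= 0)
			break;				/* offsets after this one are not fixed */
		desc->attcacheoff[i] = off;
		off += desc->attlen[i];
		desc->firstNonCachedOffAttr = i + 1;
	}
}

/* Readers assert that finalization happened before trusting the field. */
static int
toy_cached_prefix(const ToyTupleDesc *desc)
{
	assert(desc->firstNonCachedOffAttr >= 0);
	return desc->firstNonCachedOffAttr;
}
```

With this shape, a descriptor that was never passed to toy_finalize() trips the assertion in toy_cached_prefix() instead of silently using whatever the allocator left in the field.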

4
```
+		 * possibily could cache the first attlen == -2 attr.  Worthwhile?
```

Typo: possibily -> possibly

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/


* Re: More speedups for tuple deformation
@ 2026-01-20 00:11  David Rowley <[email protected]>
  parent: Chao Li <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: David Rowley @ 2026-01-20 00:11 UTC (permalink / raw)
  To: Chao Li <[email protected]>; +Cc: PostgreSQL Developers <[email protected]>

On Mon, 19 Jan 2026 at 18:48, Chao Li <[email protected]> wrote:
> I reviewed the patch and traced some basic workflows. I haven’t yet run a load test to compare performance with and without the patch; I will do that later if I get some bandwidth. Here are some review comments:
>
> 1 - tupmacs.h
> ```
> +       /* Create a mask with all bits beyond natts's bit set to off */
> +       mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
> +       byte = (~bits[lastByte]) & mask;
> ```
>
> When I read the code, I got an impression bits[lastByte] might overflow when natts % 8 == 0, so I traced the code, then I realized that, this function is only called when a row has null values, so that, when reaching here, natts % 8 != 0, otherwise it should return earlier within the for loop.

It certainly is possible to get to that part of the code when natts is
a multiple of 8 and the tuple contains NULLs after that (we may not be
deforming the entire tuple). The code you quoted that's setting "mask"
in that case will produce a zero mask, resulting in not finding any
NULLs. I don't quite see any risk of overflowing any of the types
here.  If natts is 16 then effectively the code does 0xFF & ((1 << 0)
- 1); so no overflow. Just left shift by 0 bits and bitwise AND with
zero, resulting in the mask becoming zero.
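
To make the arithmetic concrete, here is a minimal standalone sketch of just the
mask expression from the quoted snippet. The helper name last_byte_mask() is
hypothetical and is not part of the patch; uint8_t stands in for the uint8
typedef.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): build the mask used to
 * examine the final, possibly partial, byte of the NULL bitmap.  Only the
 * low (natts & 7) bits are kept, so the mask is zero when natts is a
 * multiple of 8.
 */
static inline uint8_t
last_byte_mask(int natts)
{
	assert(natts >= 0);

	/* (uint8_t) 1 is promoted to int before the shift, so nothing overflows */
	return (uint8_t) (0xFF & ((((uint8_t) 1) << (natts & 7)) - 1));
}
```

With this, last_byte_mask(16) and last_byte_mask(8) both give 0x00, while
last_byte_mask(3) gives 0x07 and last_byte_mask(15) gives 0x7F.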

How about if I write the comment as follows?

/*
* Create a mask with all bits beyond natts's bit set to off.  When
* natts & 7 == 0 the mask is zero: every byte that needs to be checked
* was already covered by the loop above, so the empty mask finds no NULLs
* and we end up returning natts.  This has been done to avoid having to
* write a special case to check if we've covered all bytes already.
*/
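
For anyone following along without the full patch open, the behaviour that
comment describes can be sketched as below. This is only an illustration and
not the function from tupmacs.h: the names first_null_attr_sketch() and
lowest_set_bit() are made up, and it assumes the usual bitmap convention where
a set bit means "not NULL". Like the quoted snippet, it reads
bits[natts >> 3] even when the mask is zero, so that byte must be readable.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit of a non-zero byte (portable loop). */
static int
lowest_set_bit(uint8_t byte)
{
	int			bit = 0;

	assert(byte != 0);
	while ((byte & 1) == 0)
	{
		byte >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Hypothetical sketch: return the 0-based index of the first NULL among the
 * first natts attributes, or natts if none of them is NULL.
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	uint8_t		mask;
	uint8_t		byte;

	/* Whole bytes: any cleared bit marks a NULL attribute. */
	for (int i = 0; i < lastByte; i++)
	{
		if (bits[i] != 0xFF)
			return (i << 3) + lowest_set_bit((uint8_t) ~bits[i]);
	}

	/* Final partial byte; the mask is zero when natts & 7 == 0. */
	mask = (uint8_t) (0xFF & ((((uint8_t) 1) << (natts & 7)) - 1));
	byte = (uint8_t) ((~bits[lastByte]) & mask);
	if (byte != 0)
		return (lastByte << 3) + lowest_set_bit(byte);

	/* No NULL found within the first natts attributes. */
	return natts;
}
```

With a bitmap of {0xFF, 0xFF, 0x00} and natts = 16, the loop sees two full
bytes, the mask is zero, and the sketch returns 16 even though later
attributes are NULL.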

> In TupleDescFinalize(), given firstNonCachedOffAttr = i + 1, firstNonCachedOffAttr will never be 0.
> But in nocachegetattr(), it checks firstNonCachedOffAttr >= 0:
> This is kinda inconsistent, and may potentially lead to some confusion to code readers.

I've changed things around so that firstNonCachedOffAttr == 0 means
there are no attributes with cached offsets, and -1 now means
uninitialised. I've added Asserts to check for firstNonCachedOffAttr >= 0,
with a comment directing anyone debugging a failure of one of those
Asserts to add the missing call to TupleDescFinalize().
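
As a rough illustration of that convention, the sketch below shows how the
three states are told apart when picking the attribute to start walking from.
SketchTupleDesc and sketch_start_attr() are hypothetical stand-ins, not the
real TupleDesc or any function in the patch; only the field name and the
Min(firstNonCachedOffAttr - 1, firstnullattr) logic come from the patch.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the relevant part of a tuple descriptor.
 *
 * firstNonCachedOffAttr:
 *	 -1  descriptor has not been finalized yet
 *	  0  no attribute has a cached offset
 *	 >0  attributes 0 .. firstNonCachedOffAttr - 1 have a cached offset
 */
typedef struct SketchTupleDesc
{
	int			natts;
	int			firstNonCachedOffAttr;
} SketchTupleDesc;

/* Pick the attribute from which offsets must be computed by hand. */
static int
sketch_start_attr(const SketchTupleDesc *desc, int firstnullattr)
{
	/* Did someone forget to finalize the descriptor? */
	assert(desc->firstNonCachedOffAttr >= 0);

	if (desc->firstNonCachedOffAttr > 0)
	{
		int			last_cached = desc->firstNonCachedOffAttr - 1;

		/* Cached offsets are only usable up to the first NULL. */
		return (last_cached < firstnullattr) ? last_cached : firstnullattr;
	}

	/* Nothing cached: start from the first attribute. */
	return 0;
}
```

A descriptor left at -1 trips the assert straight away, which is the point of
keeping "uninitialised" distinct from "nothing cached".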

Thanks for reviewing.

I've attached the v4 patch, which also fixes the LLVM compiler warning
that I introduced.

David

From 21cf76b97fc8b3a47ef71ff8921c4b370c0f714a Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines which
shrinks down the code a little, which can make it run more quickly.
This also makes a dedicated deformer loop to deform the portion of the
tuple which has a known offset, which makes deforming much faster when
a leading set of the table's columns are non-NULL values and fixed-width
types.
---
 contrib/dblink/dblink.c                       |   3 +
 contrib/pg_buffercache/pg_buffercache_pages.c |   2 +
 contrib/pg_visibility/pg_visibility.c         |   2 +
 src/backend/access/brin/brin_tuple.c          |   1 +
 src/backend/access/common/heaptuple.c         | 323 ++++++----------
 src/backend/access/common/indextuple.c        | 361 +++++++-----------
 src/backend/access/common/tupdesc.c           |  59 +++
 src/backend/access/gin/ginutil.c              |   1 +
 src/backend/access/gist/gistscan.c            |   1 +
 src/backend/access/spgist/spgutils.c          |   4 +-
 src/backend/access/transam/twophase.c         |   1 +
 src/backend/access/transam/xlogfuncs.c        |   1 +
 src/backend/backup/basebackup_copy.c          |   3 +
 src/backend/catalog/index.c                   |   2 +
 src/backend/catalog/pg_publication.c          |   1 +
 src/backend/catalog/toasting.c                |   6 +
 src/backend/commands/explain.c                |   1 +
 src/backend/commands/functioncmds.c           |   1 +
 src/backend/commands/sequence.c               |   1 +
 src/backend/commands/tablecmds.c              |   4 +
 src/backend/commands/wait.c                   |   1 +
 src/backend/executor/execSRF.c                |   2 +
 src/backend/executor/execTuples.c             | 305 +++++++--------
 src/backend/executor/nodeFunctionscan.c       |   2 +
 src/backend/jit/llvm/llvmjit_deform.c         |   6 -
 src/backend/parser/parse_relation.c           |   4 +-
 src/backend/parser/parse_target.c             |   2 +
 .../libpqwalreceiver/libpqwalreceiver.c       |   1 +
 src/backend/replication/walsender.c           |   5 +
 src/backend/utils/adt/acl.c                   |   1 +
 src/backend/utils/adt/genfile.c               |   1 +
 src/backend/utils/adt/lockfuncs.c             |   1 +
 src/backend/utils/adt/orderedsetaggs.c        |   1 +
 src/backend/utils/adt/pgstatfuncs.c           |   5 +
 src/backend/utils/adt/tsvector_op.c           |   1 +
 src/backend/utils/cache/relcache.c            |  20 +-
 src/backend/utils/fmgr/funcapi.c              |   6 +
 src/backend/utils/misc/guc_funcs.c            |   5 +
 src/include/access/htup_details.h             |  19 +-
 src/include/access/itup.h                     |  20 +-
 src/include/access/tupdesc.h                  |  14 +
 src/include/access/tupmacs.h                  |  64 ++++
 src/include/executor/tuptable.h               |   9 +-
 src/pl/plpgsql/src/pl_comp.c                  |   2 +
 .../test_custom_fixed_stats.c                 |   1 +
 .../modules/test_predtest/test_predtest.c     |   1 +
 46 files changed, 642 insertions(+), 635 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..ed0267d2183 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1045,6 +1046,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
+			TupleDescFinalize(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
 			tupstore = tuplestore_begin_heap(true, false, work_mem);
@@ -1534,6 +1536,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
 		if ((results != NULL) && (indnkeyatts > 0))
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..85b42f33315 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1354,70 +1252,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
-	tp = (char *) tup + tup->t_hoff;
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
+	tp = (char *) tup + tup->t_hoff;
 	off = 0;
 
-	for (attnum = 0; attnum < natts; attnum++)
+	for (attnum = 0; attnum < cacheoffattrs; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		Assert(cattr->attcacheoff >= 0);
+
+		values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+								   cattr->attlen);
+		isnull[attnum] = false;
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
+		else
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
+			/* not varlena, so safe to use att_nominal_alignby */
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
+		if (att_isnull(attnum, bp))
 		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
 		else
 		{
 			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = off;
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
-
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..7e1bde9703a 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			Assert(hasnulls);
 
-			off = att_nominal_alignby(off, att->attalignby);
+			if (att_isnull(i, bp))
+				continue;
 
-			att->attcacheoff = off;
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -481,62 +393,79 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							char *tp, bits8 *bp, int hasnulls)
 {
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (attnum < cacheoffattrs)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+		CompactAttribute *cattr;
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		Assert(hasnulls);
+
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
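Reviewer's sketch, not part of the patch: first_null_attr() is called in the hunk above but is defined elsewhere in the patch set. Assuming it returns the index of the first NULL attribute, or natts when there is none, and given that a cleared bit in the null bitmap marks a NULL (which is what att_isnull() tests), a minimal standalone stand-in could look like the following. The name sketch_first_null_attr and the byte-at-a-time scan are assumptions made for illustration; the real implementation may well differ.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t bits8;

/*
 * Hypothetical stand-in for the patch's first_null_attr().  Returns the
 * 0-based index of the first attribute whose null-bitmap bit is clear
 * (meaning the attribute is NULL), or natts if no attribute is NULL.
 */
static int
sketch_first_null_attr(const bits8 *bp, int natts)
{
	int			nbytes = (natts + 7) / 8;

	assert(natts >= 0);

	for (int bytenum = 0; bytenum < nbytes; bytenum++)
	{
		bits8		b = bp[bytenum];

		/* 0xFF: all eight attributes covered by this byte are non-NULL */
		if (b == 0xFF)
			continue;

		/* find the lowest clear bit in this byte */
		for (int bit = 0; bit < 8; bit++)
		{
			if ((b & (1 << bit)) == 0)
			{
				int			attnum = bytenum * 8 + bit;

				/* clear padding bits beyond natts are not NULLs */
				return attnum < natts ? attnum : natts;
			}
		}
	}

	return natts;
}
```

With a result like this in hand, the deforming code above only needs NULL checks from that attribute onwards, which is why the middle loop can omit att_isnull() entirely.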
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..4cf13b79805 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -238,6 +241,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +288,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +336,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +423,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +467,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -463,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -495,6 +512,46 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute()
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr = tupdesc->natts;
+#endif
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+#ifdef OPTIMIZE_BYVAL
+		if (!cattr->attbyval)
+			firstByRefAttr = Min(firstByRefAttr, i);
+#endif
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+#ifdef OPTIMIZE_BYVAL
+	tupdesc->firstByRefAttr = firstByRefAttr;
+#endif
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
@@ -1082,6 +1139,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
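Reviewer's sketch, not part of the patch: the offset pre-calculation done by TupleDescFinalize() can be tried out in isolation with a toy attribute struct. SketchAttr, sketch_align and sketch_finalize are hypothetical names; only the loop body follows the patch, and the OPTIMIZE_BYVAL bookkeeping is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for CompactAttribute; only the fields used here. */
typedef struct SketchAttr
{
	int16_t		attlen;			/* > 0 fixed width, -1 varlena, -2 cstring */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* cached offset, caller initializes to -1 */
} SketchAttr;

/* Round "off" up to the next multiple of "alignby" (a power of two). */
static int
sketch_align(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Follows the loop in TupleDescFinalize(): assign offsets to the leading
 * run of fixed-width attributes and return the index of the first
 * attribute that has no cached offset.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int			firstNonCachedOffAttr = 0;
	int			offp = 0;

	assert(natts >= 0);

	for (int i = 0; i < natts; i++)
	{
		/* stop at the first variable-width attribute */
		if (attrs[i].attlen <= 0)
			break;

		offp = sketch_align(offp, attrs[i].attalignby);
		attrs[i].attcacheoff = offp;

		offp += attrs[i].attlen;
		firstNonCachedOffAttr = i + 1;
	}

	return firstNonCachedOffAttr;
}
```

For a column list of int4, int8, int2, text, int4 this assigns offsets 0, 8 and 16 to the first three attributes and returns 3; the text column and everything after it keep attcacheoff at -1, so the deforming code has to walk the tuple for those.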
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,11 +335,9 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 97f1e778488..9e55d9bfb80 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -341,5 +341,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..9683acc8020 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,167 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr;
+#endif
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+#ifdef OPTIMIZE_BYVAL
+	firstByRefAttr = Min(firstNonCacheOffsetAttr, tupleDesc->firstByRefAttr);
+#endif
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
+
+#ifdef OPTIMIZE_BYVAL
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Many tuples have leading byval attributes.  Try to process as many of
+	 * those as possible with a special loop that can't handle byref types.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstByRefAttr)
+	{
+		/* Use do/while as we already know we need to loop at least once. */
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			/*
+			 * Hard code byval == true to allow the compiler to remove the
+			 * byval check when inlining fetch_att().
+			 */
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, true, cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < firstByRefAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+#endif
+
+	/*
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
+	 */
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop differs from the one after
+	 * it only in that it omits the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1178,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2173,6 +2146,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2182,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
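Reviewer's sketch, not part of the patch: the rewritten slot_deform_heap_tuple() effectively splits the requested attributes into three consecutive ranges, each handled by its own loop. The helper below only computes those boundaries, so it is a simplified model that ignores the OPTIMIZE_BYVAL loop and resuming from tts_nvalid. SketchDeformPlan and sketch_plan are hypothetical names.

```c
#include <assert.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

typedef struct SketchDeformPlan
{
	int			cachedEnd;		/* [0, cachedEnd): use attcacheoff directly */
	int			noNullEnd;		/* [cachedEnd, noNullEnd): walk offsets, no
								 * NULL checks */
	int			natts;			/* [noNullEnd, natts): walk offsets with NULL
								 * checks */
} SketchDeformPlan;

/*
 * firstNonCachedOffAttr comes from the finalized TupleDesc; firstNullAttr is
 * the first NULL attribute in the tuple, or the attribute count when the
 * tuple has no NULLs.
 */
static SketchDeformPlan
sketch_plan(int firstNonCachedOffAttr, int firstNullAttr,
			int tupleNatts, int requestedNatts)
{
	SketchDeformPlan plan;

	/* a negative value would mean TupleDescFinalize() was never called */
	assert(firstNonCachedOffAttr >= 0);

	/* we can only fetch as many attributes as the tuple has */
	plan.natts = SKETCH_MIN(tupleNatts, requestedNatts);
	plan.noNullEnd = SKETCH_MIN(firstNullAttr, plan.natts);
	plan.cachedEnd = SKETCH_MIN(firstNonCachedOffAttr, plan.noNullEnd);

	return plan;
}
```

A NULL in the second column, for instance, shrinks the cached range to one attribute no matter how many fixed-width columns the descriptor starts with, because offsets after a NULL no longer match the precomputed ones.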
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..4b2d60fe3c2 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1074,6 +1074,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
+	TupleDescFinalize(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
 	if (PQntuples(pgres) == 0)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -729,6 +721,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1983,8 +1977,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3681,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4441,8 +4436,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6262,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b566a8ef600 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -939,6 +942,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
+
 		funcctx->attinmeta = attinmeta;
 
 		/* collect the variables, in sorted order */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..3c44f2ac119 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, TupleDescFinalize() must be called on
+ * it before it is used for any purpose.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,12 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr; /* index of the first attr with !attbyval, or
+								 * natts if none. */
+#endif
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -205,6 +217,8 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..a26f18c39fe 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,69 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum << 3;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/*
+	 * Create a mask with all bits beyond natts's bit set to off.  When
+	 * natts & 7 == 0, all bytes that need to be checked were already
+	 * handled by the loop above.  In that case the mask computed below is
+	 * zero, so no bit can match and we end up returning natts.  This
+	 * avoids having to write a special case to check whether we've covered
+	 * all bytes already.
+	 */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte << 3;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0
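
For readers who want to try the NULL-bitmap scan from first_null_attr()
outside the tree, here is a minimal standalone sketch.  It assumes the usual
heap tuple convention that a set bit means "not NULL".  The names
first_null_attr_sketch() and lowest_set_bit() are made up for this
illustration; lowest_set_bit() stands in for the pg_rightmost_one_pos[]
lookup table.  Unlike the patch, the sketch returns early when the mask is
zero so that it never reads a byte beyond a standalone test array.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for pg_rightmost_one_pos[]: position of the lowest
 * set bit in a non-zero byte.
 */
int
lowest_set_bit(uint8_t b)
{
	int			pos = 0;

	assert(b != 0);
	while ((b & 1) == 0)
	{
		b >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Return the 0-based index of the first NULL attribute in a NULL bitmap
 * where a set bit means "not NULL", or natts if none of the first natts
 * attributes is NULL.
 */
int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	uint8_t		mask;
	uint8_t		byte;

	/* Whole bytes first: any byte other than 0xFF contains a NULL. */
	for (int bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			return (bytenum << 3) + lowest_set_bit((uint8_t) ~bits[bytenum]);
	}

	/* Keep only the bits below natts in the final, partial byte. */
	mask = (uint8_t) ((1u << (natts & 7)) - 1);

	/* Sketch-only guard: nothing left to check when natts is a multiple of 8. */
	if (mask == 0)
		return natts;

	byte = (uint8_t) (~bits[lastByte] & mask);
	if (byte != 0)
		return (lastByte << 3) + lowest_set_bit(byte);

	return natts;
}
```

For example, with a two-byte bitmap of 0xFF, 0xFB and natts = 16, the first
byte is skipped whole and bit 2 of the second byte identifies attribute 10 as
the first NULL.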



Attachments:

  [text/plain] v4-0001-Precalculate-CompactAttribute-s-attcacheoff.patch (74.9K, 2-v4-0001-Precalculate-CompactAttribute-s-attcacheoff.patch)
  download | inline diff:
From 21cf76b97fc8b3a47ef71ff8921c4b370c0f714a Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines, which
shrinks them a little and can make them run more quickly.  It also adds
a dedicated loop to deform the portion of the tuple that has known
offsets, which makes deforming much faster when the table's leading
columns are fixed-width types holding non-NULL values.
---
 contrib/dblink/dblink.c                       |   3 +
 contrib/pg_buffercache/pg_buffercache_pages.c |   2 +
 contrib/pg_visibility/pg_visibility.c         |   2 +
 src/backend/access/brin/brin_tuple.c          |   1 +
 src/backend/access/common/heaptuple.c         | 323 ++++++----------
 src/backend/access/common/indextuple.c        | 361 +++++++-----------
 src/backend/access/common/tupdesc.c           |  59 +++
 src/backend/access/gin/ginutil.c              |   1 +
 src/backend/access/gist/gistscan.c            |   1 +
 src/backend/access/spgist/spgutils.c          |   4 +-
 src/backend/access/transam/twophase.c         |   1 +
 src/backend/access/transam/xlogfuncs.c        |   1 +
 src/backend/backup/basebackup_copy.c          |   3 +
 src/backend/catalog/index.c                   |   2 +
 src/backend/catalog/pg_publication.c          |   1 +
 src/backend/catalog/toasting.c                |   6 +
 src/backend/commands/explain.c                |   1 +
 src/backend/commands/functioncmds.c           |   1 +
 src/backend/commands/sequence.c               |   1 +
 src/backend/commands/tablecmds.c              |   4 +
 src/backend/commands/wait.c                   |   1 +
 src/backend/executor/execSRF.c                |   2 +
 src/backend/executor/execTuples.c             | 305 +++++++--------
 src/backend/executor/nodeFunctionscan.c       |   2 +
 src/backend/jit/llvm/llvmjit_deform.c         |   6 -
 src/backend/parser/parse_relation.c           |   4 +-
 src/backend/parser/parse_target.c             |   2 +
 .../libpqwalreceiver/libpqwalreceiver.c       |   1 +
 src/backend/replication/walsender.c           |   5 +
 src/backend/utils/adt/acl.c                   |   1 +
 src/backend/utils/adt/genfile.c               |   1 +
 src/backend/utils/adt/lockfuncs.c             |   1 +
 src/backend/utils/adt/orderedsetaggs.c        |   1 +
 src/backend/utils/adt/pgstatfuncs.c           |   5 +
 src/backend/utils/adt/tsvector_op.c           |   1 +
 src/backend/utils/cache/relcache.c            |  20 +-
 src/backend/utils/fmgr/funcapi.c              |   6 +
 src/backend/utils/misc/guc_funcs.c            |   5 +
 src/include/access/htup_details.h             |  19 +-
 src/include/access/itup.h                     |  20 +-
 src/include/access/tupdesc.h                  |  14 +
 src/include/access/tupmacs.h                  |  64 ++++
 src/include/executor/tuptable.h               |   9 +-
 src/pl/plpgsql/src/pl_comp.c                  |   2 +
 .../test_custom_fixed_stats.c                 |   1 +
 .../modules/test_predtest/test_predtest.c     |   1 +
 46 files changed, 642 insertions(+), 635 deletions(-)
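
As a rough illustration of what "precalculating attcacheoff" means, the
sketch below walks a descriptor once and records an offset for each leading
fixed-width attribute, following the align-then-advance arithmetic that the
old deforming code did lazily.  This is not TupleDescFinalize() itself.
SketchAttr, align_up() and finalize_offsets_sketch() are hypothetical names,
and stopping at the first non-fixed-width attribute is a simplifying
assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for CompactAttribute. */
typedef struct SketchAttr
{
	int			attlen;			/* > 0 for fixed width, <= 0 otherwise */
	int			attalignby;		/* required alignment in bytes, power of 2 */
	int			attcacheoff;	/* cached offset, or -1 if unknown */
} SketchAttr;

/* Round off up to the next multiple of alignby (a power of two). */
int
align_up(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Fill in attcacheoff for the leading run of fixed-width attributes and
 * return the index of the first attribute left without a cached offset
 * (natts if every attribute got one).
 */
int
finalize_offsets_sketch(SketchAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	/* Start from "unknown" for every attribute. */
	for (i = 0; i < natts; i++)
		attrs[i].attcacheoff = -1;

	for (i = 0; i < natts; i++)
	{
		/* Sketch assumption: stop at the first non-fixed-width attribute. */
		if (attrs[i].attlen <= 0)
			break;

		off = align_up(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	return i;
}
```

With columns laid out as int4, int8, int2, text, int4, the sketch assigns
offsets 0, 8 and 16 to the first three columns and returns 3, the role that
firstNonCachedOffAttr plays in the patch.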

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..ed0267d2183 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1045,6 +1046,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
+			TupleDescFinalize(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
 			tupstore = tuplestore_begin_heap(true, false, work_mem);
@@ -1534,6 +1536,7 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
 		funcctx->attinmeta = attinmeta;
 
 		if ((results != NULL) && (indnkeyatts > 0))
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..85b42f33315 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where we
+ *		can't use the attcacheoff and the value is not null.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1354,70 +1252,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
-	tp = (char *) tup + tup->t_hoff;
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
+	tp = (char *) tup + tup->t_hoff;
 	off = 0;
 
-	for (attnum = 0; attnum < natts; attnum++)
+	for (attnum = 0; attnum < cacheoffattrs; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		Assert(cattr->attcacheoff >= 0);
+
+		values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+								   cattr->attlen);
+		isnull[attnum] = false;
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
+		else
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
+			/* not varlena, so safe to use att_nominal_alignby */
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
+
+		Assert(hasnulls);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
+		if (att_isnull(attnum, bp))
 		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		if (cattr->attlen == -1)
+			off = att_pointer_alignby(off, cattr->attalignby, -1,
+									  tp + off);
 		else
 		{
 			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = off;
+			off = att_nominal_alignby(off, cattr->attalignby);
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
-
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..7e1bde9703a 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			Assert(hasnulls);
 
-			off = att_nominal_alignby(off, att->attalignby);
+			if (att_isnull(i, bp))
+				continue;
 
-			att->attcacheoff = off;
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -481,62 +393,79 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							char *tp, bits8 *bp, int hasnulls)
 {
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (attnum < cacheoffattrs)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+		CompactAttribute *cattr;
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		CompactAttribute *cattr;
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		Assert(hasnulls);
+
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..4cf13b79805 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -238,6 +241,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +288,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +336,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +423,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +467,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -463,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -495,6 +512,46 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr = tupdesc->natts;
+#endif
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+#ifdef OPTIMIZE_BYVAL
+		if (!cattr->attbyval)
+			firstByRefAttr = Min(firstByRefAttr, i);
+#endif
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+#ifdef OPTIMIZE_BYVAL
+	tupdesc->firstByRefAttr = firstByRefAttr;
+#endif
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
@@ -1082,6 +1139,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
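To make the offset precomputation in TupleDescFinalize() concrete, here is a standalone sketch of the same loop outside the backend. DemoAttr, demo_alignby and demo_finalize are hypothetical stand-ins for CompactAttribute, att_nominal_alignby() and TupleDescFinalize(). Alignments are assumed to be powers of two, and the OPTIMIZE_BYVAL part is left out.

```c
/* Hypothetical stand-in for the CompactAttribute fields used here */
typedef struct DemoAttr
{
	int			attlen;			/* > 0 for fixed width, -1 for varlena */
	int			attalignby;		/* alignment in bytes, a power of two */
	int			attcacheoff;	/* precomputed offset, or -1 if unknown */
} DemoAttr;

/* Round off up to the next multiple of alignby (a power of two) */
static int
demo_alignby(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Set attcacheoff for each leading fixed-width attribute and return the
 * index of the first attribute that has no cached offset.
 */
static int
demo_finalize(DemoAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		/* Offsets are no longer fixed once a non-fixed-width type is seen */
		if (attrs[i].attlen <= 0)
			break;

		off = demo_alignby(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	return i;
}
```

For a descriptor shaped like (int4, int8, text, int4), this gives offsets 0 and 8 for the first two columns and returns 2. The trailing int4 gets no cached offset because it follows a variable-width column.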
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,11 +335,9 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 97f1e778488..9e55d9bfb80 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -341,5 +341,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..9683acc8020 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,167 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr;
+#endif
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+#ifdef OPTIMIZE_BYVAL
+	firstByRefAttr = Min(firstNonCacheOffsetAttr, tupleDesc->firstByRefAttr);
+#endif
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
+
+#ifdef OPTIMIZE_BYVAL
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Many tuples have leading byval attributes.  Try to process as many of
+	 * those as possible with a special loop that can't handle byref types.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstByRefAttr)
+	{
+		/* Use do/while as we already know we need to loop at least once. */
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			/*
+			 * Hard code byval == true to allow the compiler to remove the
+			 * byval check when inlining fetch_att().
+			 */
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, true, cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < firstByRefAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+#endif
+
+	/*
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
+	 */
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+			Assert(cattr->attcacheoff >= 0);
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+	 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time include NULL checks as we're
+	 * now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1178,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2173,6 +2146,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2182,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..4b2d60fe3c2 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1074,6 +1074,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
+	TupleDescFinalize(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
 	if (PQntuples(pgres) == 0)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -729,6 +721,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1983,8 +1977,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3681,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4441,8 +4436,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6262,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b566a8ef600 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -939,6 +942,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		 * C strings
 		 */
 		attinmeta = TupleDescGetAttInMetadata(tupdesc);
+		TupleDescFinalize(tupdesc);
+
 		funcctx->attinmeta = attinmeta;
 
 		/* collect the variables, in sorted order */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..3c44f2ac119 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,12 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
+#ifdef OPTIMIZE_BYVAL
+	int			firstByRefAttr; /* index of the first attr with !attbyval, or
+								 * natts if none. */
+#endif
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -205,6 +217,8 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
+
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..a26f18c39fe 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,69 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum << 3;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/*
+	 * Create a mask with all bits beyond natts's bit set to off.  The code
+	 * below will generate a zero mask when natts & 7 == 0.  When that
+	 * happens, all bytes that need to be checked were done so in the loop
+	 * above.  The code below will create an empty mask and end up returning
+	 * natts.  This has been done to avoid having to write a special case to
+	 * check if we've covered all bytes already.
+	 */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte << 3;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: More speedups for tuple deformation
@ 2026-01-20 04:32  Chao Li <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Chao Li @ 2026-01-20 04:32 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: PostgreSQL Developers <[email protected]>



> On Jan 20, 2026, at 08:11, David Rowley <[email protected]> wrote:
> 
> On Mon, 19 Jan 2026 at 18:48, Chao Li <[email protected]> wrote:
>> I reviewed the patch and traced some basic workflows. But I haven’t done a load test to compare performance differences with and without this patch, I will do that if I get some bandwidth later. Here comes some review comments:
>> 
>> 1 - tupmacs.h
>> ```
>> +       /* Create a mask with all bits beyond natts's bit set to off */
>> +       mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
>> +       byte = (~bits[lastByte]) & mask;
>> ```
>> 
>> When I read the code, I got an impression bits[lastByte] might overflow when natts % 8 == 0, so I traced the code, then I realized that, this function is only called when a row has null values, so that, when reaching here, natts % 8 != 0, otherwise it should return earlier within the for loop.
> 
> It certainly is possible to get to that part of the code when natts is
> a multiple of 8 and the tuple contains NULLs after that (we may not be
> deforming the entire tuple). The code you quoted that's setting "mask"
> in that case will produce a zero mask, resulting in not finding any
> NULLs. I don't quite see any risk of overflowing any of the types
> here.  If natts is 16 then effectively the code does 0xFF & ((1 << 0)
> - 1); so no overflow. Just left shift by 0 bits and bitwise AND with
> zero, resulting in the mask becoming zero.
> 
> How about if I write the comment as follows?
> 
> /*
> * Create a mask with all bits beyond natts's bit set to off.  The code
> * below will generate a zero mask when natts & 7 == 0.  When that happens
> * all bytes that need to be checked were done so in the loop above.  The
> * code below will create an empty mask and end up returning natts.  This
> * has been done to avoid having to write a special case to check if we've
> * covered all bytes already.
> */
> 

I’m sorry I didn’t express myself clearly; maybe I should have used “OOB” rather than “overflow”. My real concern is an out-of-bounds read of bits[lastByte] when natts & 7 == 0.

Say natts is 16: then bits is 2 bytes long and lastByte = 16 >> 3 = 2, so bits[2] is an OOB read.
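
To make the arithmetic concrete, here is a small standalone sketch. The helper names are made up for illustration and are not part of the patch, and it assumes the bitmap is sized for exactly natts attributes, i.e. (natts + 7) / 8 bytes:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes needed for a NULL bitmap covering natts attributes (assumed sizing). */
static int
null_bitmap_bytes(int natts)
{
	return (natts + 7) / 8;
}

/* Index of the byte that the trailing-bits code reads: bits[natts >> 3]. */
static int
trailing_byte_index(int natts)
{
	return natts >> 3;
}

/* The mask applied to that byte, computed the same way as in the patch. */
static uint8_t
trailing_mask(int natts)
{
	return (uint8_t) (0xFF & ((((uint8_t) 1) << (natts & 7)) - 1));
}

/* True when the trailing read stays inside the bitmap's own bytes. */
static bool
trailing_read_in_bounds(int natts)
{
	return trailing_byte_index(natts) < null_bitmap_bytes(natts);
}
```

For natts = 16 the mask is zero, so the result is unaffected, but the byte index equals the bitmap length. That is one past the last bitmap byte under the sizing assumption above.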

If first_null_attr() is only called when hasnulls == true, then it will never hit the OOB point, because it will return early from the “for” loop. That is the case in the current patch, so the OOB read should never happen.

However, I don’t see any comment saying something like “first_null_attr() should only be called when hasnulls is true”. If in the future someone calls first_null_attr() in a situation where hasnulls == false, then the OOB read will be triggered.

The comment you added explains that even if the OOB read happens, the final result is still correct no matter what value bits[lastByte] holds, because mask is 0. That is true, but the OOB read is still a concern. If the bits array happens to end exactly at the edge of a memory page, the OOB read of bits[lastByte] may trigger a segmentation fault, and valgrind may detect the OOB read and complain about it.

So, my original comment was that we should at least add something to the header comment to mention “first_null_attr() should only be called when hasnulls is true”. If we can add an Assert to ensure hasnulls is true, that would be even better.

But if we want first_null_attr() to be safe regardless of whether hasnulls is true or false, I think we should avoid the OOB read.
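
As a rough illustration of what avoiding it could look like, here is a standalone sketch, not a proposed replacement. The names first_null_attr_guarded and lowest_set_bit are hypothetical, and lowest_set_bit is only a stand-in for the pg_rightmost_one_pos[] lookup. It skips the trailing-byte read when natts & 7 == 0:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for pg_rightmost_one_pos[]: position of the lowest
 * set bit.  The caller must pass a non-zero byte.
 */
static int
lowest_set_bit(uint8_t byte)
{
	int			pos = 0;

	while ((byte & 1) == 0)
	{
		byte >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Return the 0-based index of the first NULL attribute, or natts if none.
 * A clear bit in the bitmap means the attribute is NULL.
 */
static int
first_null_attr_guarded(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	int			remBits = natts & 7;

	/* Whole bytes that lie entirely below natts */
	for (int bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			return (bytenum << 3) + lowest_set_bit((uint8_t) ~bits[bytenum]);
	}

	/* Only touch the trailing byte when some of its bits are below natts */
	if (remBits != 0)
	{
		uint8_t		mask = (uint8_t) ((1u << remBits) - 1);
		uint8_t		byte = (uint8_t) (~bits[lastByte] & mask);

		if (byte != 0)
			return (lastByte << 3) + lowest_set_bit(byte);
	}

	return natts;
}
```

The cost is one extra branch, which is exactly the special case the current code was written to avoid, so whether it is worth it would depend on the benchmarks.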

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/


^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: More speedups for tuple deformation
@ 2026-01-20 06:05  Chao Li <[email protected]>
  parent: Chao Li <[email protected]>
  0 siblings, 0 replies; 19+ messages in thread

From: Chao Li @ 2026-01-20 06:05 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: PostgreSQL Developers <[email protected]>



> On Jan 20, 2026, at 12:32, Chao Li <[email protected]> wrote:
> 
> 
> 
>> On Jan 20, 2026, at 08:11, David Rowley <[email protected]> wrote:
>> 
>> On Mon, 19 Jan 2026 at 18:48, Chao Li <[email protected]> wrote:
>>> I reviewed the patch and traced some basic workflows, but I haven’t done a load test to compare performance with and without this patch; I will do that if I get some bandwidth later. Here are some review comments:
>>> 
>>> 1 - tupmacs.h
>>> ```
>>> +       /* Create a mask with all bits beyond natts's bit set to off */
>>> +       mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
>>> +       byte = (~bits[lastByte]) & mask;
>>> ```
>>> 
>>> When I read the code, I got the impression that bits[lastByte] might overflow when natts % 8 == 0, so I traced the code. I then realized that this function is only called when a row has null values, so when reaching here natts % 8 != 0; otherwise it would have returned earlier within the for loop.
>> 
>> It certainly is possible to get to that part of the code when natts is
>> a multiple of 8 and the tuple contains NULLs after that (we may not be
>> deforming the entire tuple). The code you quoted that's setting "mask"
>> in that case will produce a zero mask, resulting in not finding any
>> NULLs. I don't quite see any risk of overflowing any of the types
>> here.  If natts is 16 then effectively the code does 0xFF & ((1 << 0)
>> - 1); so no overflow. Just left shift by 0 bits and bitwise AND with
>> zero, resulting in the mask becoming zero.
>> 
>> How about if I write the comment as follows?
>> 
>> /*
>> * Create a mask with all bits beyond natts's bit set to off.  The code
>> * below will generate a zero mask when natts & 7 == 0.  When that happens
>> * all bytes that need to be checked were done so in the loop above.  The
>> * code below will create an empty mask and end up returning natts.  This
>> * has been done to avoid having to write a special case to check if we've
>> * covered all bytes already.
>> */
>> 
> 
> I’m sorry I didn’t express myself clearly; I should have said “OOB” rather than “overflow”. My real concern is an out-of-bounds read of bits[lastByte] when natts & 7 == 0.
> 
> Say natts is 16: then bits is 2 bytes long and lastByte = 16 >> 3 = 2, so bits[2] is an OOB read.
> 
> If first_null_attr() is only called when hasnulls == true, then it will never hit the OOB point, because it will return early from the “for” loop. In the current patch that is the case, so the OOB read should never happen.
> 
> However, I don’t see any comment saying something like “first_null_attr() should only be called when hasnulls is true”. If someone in future calls first_null_attr() in a situation where hasnulls == false, the OOB read will be triggered.
> 
> The comment you added explains that even if the OOB read happens, the final result is still correct whatever value bits[lastByte] holds, because mask is 0. That is true, but the OOB read is still a concern. If the bits array happens to end exactly at the edge of a memory page, reading bits[lastByte] may trigger a segmentation fault, and valgrind may detect the OOB read and complain about it.
> 
> So my original comment was that we should at least add something to the header comment saying “first_null_attr() should only be called when hasnulls is true”. If we can add an Assert to ensure hasnulls is true, that would be even better.
> 
> But if we want first_null_attr() to be safe whether hasnulls is true or false, I think we should avoid the OOB read.
> 

I also noticed that, when running an arbitrary SQL statement, first_null_attr() may be called with natts = 0, so perhaps it could have a fast path that returns 0 directly when natts == 0.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/


* Re: More speedups for tuple deformation
@ 2026-01-20 18:38  Andres Freund <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Andres Freund @ 2026-01-20 18:38 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

On 2026-01-20 13:11:55 +1300, David Rowley wrote:
> I've attached the v4 patch, which also fixes the LLVM compiler warning
> that I introduced.

I wonder if it's possible to split the patch - it's big enough to be
nontrivial to review...  Perhaps the finalization could be introduced
separately from the patch actually making use of it?

I wonder if we should somehow change the API of tupledesc creation, to make
old code that doesn't have TupleDescFinalize() fail to compile, instead of
just warn...



> diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
> index dcba3fb5473..2fdf5a341f6 100644
> --- a/contrib/pg_buffercache/pg_buffercache_pages.c
> +++ b/contrib/pg_buffercache/pg_buffercache_pages.c
> @@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
>  			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
>  							   INT4OID, -1, 0);
>  
> +		TupleDescFinalize(tupledesc);
>  		fctx->tupdesc = BlessTupleDesc(tupledesc);
>  

I think it'd be worth adding an assertion to BlessTupleDesc that
TupleDescFinalize has been called; that'll lead to easier-to-understand
backtraces in a lot of cases. Particularly if you consider cases
where BlessTupleDesc() will create a tupdesc in shared memory, that could then
trigger an assertion failure in a parallel worker or such.
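
A toy model of that check, assuming the descriptor carries a field that stays negative until finalization (the patch's slot_deform_heap_tuple() asserts firstNonCachedOffAttr >= 0 in the same spirit). ToyTupleDesc and the toy_* functions are made up for illustration, and the -1 sentinel is an assumption; the real BlessTupleDesc() does much more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for TupleDesc; only the fields needed for the check. */
typedef struct ToyTupleDesc
{
	int			natts;
	int			firstNonCachedOffAttr;	/* assumed: -1 until finalized */
} ToyTupleDesc;

/* Start a descriptor in the "not finalized" state. */
static void
toy_init(ToyTupleDesc *desc, int natts)
{
	desc->natts = natts;
	desc->firstNonCachedOffAttr = -1;
}

/* Pretend every attribute offset is cacheable, so the field becomes natts. */
static void
toy_finalize(ToyTupleDesc *desc)
{
	desc->firstNonCachedOffAttr = desc->natts;
}

static bool
toy_is_finalized(const ToyTupleDesc *desc)
{
	return desc->firstNonCachedOffAttr >= 0;
}

/*
 * Fail here, close to the caller that forgot to finalize, instead of much
 * later in whichever process first deforms a tuple with this descriptor.
 */
static ToyTupleDesc *
toy_bless(ToyTupleDesc *desc)
{
	assert(toy_is_finalized(desc));
	return desc;
}
```

The point of asserting in the bless step is only about where the failure shows up: the backtrace then names the function that built the descriptor.
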
>  /*
>   * slot_deform_heap_tuple
>   *		Given a TupleTableSlot, extract data from the slot's physical tuple
> @@ -1122,78 +1010,167 @@ static pg_attribute_always_inline void
>  slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
>  					   int natts)
>  {
> +	CompactAttribute *cattr;
> +	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
>  	bool		hasnulls = HeapTupleHasNulls(tuple);
> +	HeapTupleHeader tup = tuple->t_data;
> +	bits8	   *bp;				/* ptr to null bitmap in tuple */
>  	int			attnum;
> +	int			firstNonCacheOffsetAttr;
> +
> +#ifdef OPTIMIZE_BYVAL
> +	int			firstByRefAttr;
> +#endif
> +	int			firstNullAttr;
> +	Datum	   *values;
> +	bool	   *isnull;
> +	char	   *tp;				/* ptr to tuple data */
>  	uint32		off;			/* offset in tuple data */
> -	bool		slow;			/* can we use/set attcacheoff? */
> +
> +	/* Did someone forget to call TupleDescFinalize()? */
> +	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
>  
>  	/* We can only fetch as many attributes as the tuple has. */
> -	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
> +	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
> +	attnum = slot->tts_nvalid;
> +	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
> +
> +	if (hasnulls)
> +	{
> +		bp = tup->t_bits;
> +		firstNullAttr = first_null_attr(bp, natts);
> +		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
> +	}
> +	else
> +	{
> +		bp = NULL;
> +		firstNullAttr = natts;
> +	}
> +
> +#ifdef OPTIMIZE_BYVAL
> +	firstByRefAttr = Min(firstNonCacheOffsetAttr, tupleDesc->firstByRefAttr);
> +#endif
> +	values = slot->tts_values;
> +	isnull = slot->tts_isnull;
> +	tp = (char *) tup + tup->t_hoff;
> +
> +#ifdef OPTIMIZE_BYVAL
>  
>  	/*
> -	 * Check whether the first call for this tuple, and initialize or restore
> -	 * loop state.
> +	 * Many tuples have leading byval attributes, try and process as many of
> +	 * those as possible with a special loop that can't handle byref types.
>  	 */
> -	attnum = slot->tts_nvalid;
> -	if (attnum == 0)
> +	if (attnum < firstByRefAttr)
> +	{
> +		/* Use do/while as we already know we need to loop at least once. */
> +		do
> +		{
> +			cattr = TupleDescCompactAttr(tupleDesc, attnum);
> +
> +			Assert(cattr->attcacheoff >= 0);
> +
> +			/*
> +			 * Hard code byval == true to allow the compiler to remove the
> +			 * byval check when inlining fetch_att().
> +			 */

Maybe add an assert for cattr->attbyval? Just to avoid a bad debugging
experience if somebody tries to extend this logic to
e.g. non-null-fixed-width-byref columns?

I also wonder if we could have assert-only crosschecking of the "real" offsets
against the cached ones?
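
As a rough sketch of what such a crosscheck might look like, using a toy attribute array rather than the real CompactAttribute (ToyAttr, toy_align() and toy_offsets_match() are hypothetical, and only fixed-width attributes are modelled):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy fixed-width attribute: length, alignment and cached offset in bytes. */
typedef struct ToyAttr
{
	int			attlen;
	int			attalign;		/* must be a power of two */
	int			attcacheoff;
} ToyAttr;

/* Round off up to the next multiple of align (a power of two). */
static int
toy_align(int off, int align)
{
	return (off + align - 1) & ~(align - 1);
}

/*
 * Recompute each attribute's offset by walking the attributes in order
 * and compare it with the cached value.  Returns false on any mismatch.
 */
static bool
toy_offsets_match(const ToyAttr *attrs, int natts)
{
	int			off = 0;

	for (int i = 0; i < natts; i++)
	{
		off = toy_align(off, attrs[i].attalign);
		if (attrs[i].attcacheoff != off)
			return false;
		off += attrs[i].attlen;
	}
	return true;
}
```

In an assert-enabled build the deforming code could wrap the check as assert(toy_offsets_match(attrs, natts)), so production builds would not pay for the extra walk.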

Greetings,

Andres Freund


* Re: More speedups for tuple deformation
@ 2026-01-21 05:00  David Rowley <[email protected]>
  parent: Andres Freund <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: David Rowley @ 2026-01-21 05:00 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Wed, 21 Jan 2026 at 07:38, Andres Freund <[email protected]> wrote:
> I wonder if it's possible to split the patch - it's big enough to be
> nontrivial to review...  Perhaps the finalization could be introduced
> separately from the patch actually making use of it?

Seems reasonable. I've done that in the attached 0001, which contains
a dummy macro for TupleDescFinalize() and all the required calls to
it.

> I wonder if we should somehow change the API of tupledesc creation, to make
> old code that doesn't have TupleDescFinalize() fail to compile, instead of
> just warn...

I don't have any ideas on how to do that. I could maybe imagine some
preprocessor magic if we always expected a CreateTupleDesc() and
TupleDescFinalize() in the same function, but the TupleDescFinalize()
may be required after any modification to the TupleDesc that could
invalidate the processing that's done within that function.

> Think it'd be worth adding an assertion to BlessTupleDesc that
> TupleDescFinalize has been called, I think that'll lead to easier to
> understand backtraces in a lot of cases. Particularly if you consider cases
> where BlessTupleDesc() will create a tupdesc in shared memory, that could then
> trigger an assertion failure in a parallel worker or such.

Modified.

> Maybe add an assert for cattr->attbyval? Just to avoid a bad debugging
> experience if somebody tries to extend this logic to
> e.g. non-null-fixed-width-byref columns?

I ended up removing the OPTIMIZE_BYVAL code in the attached. Over all
the machines I tested on, with the benchmark results I previously
shared, it seemed to cause a slowdown rather than a speedup. Perhaps
it can be refined and tried again later, but I've removed it for now
to reduce complexity.

> I also wonder if we could have assert-only crosschecking of the "real" offsets
> against the cached ones?

I've modified the code to do that. v5 patches attached.

Thanks for reviewing.

David

From e94ee4368acd1697fa0b08ae3b0dd1ccc51d18bf Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v5 1/2] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0


From 3229f3c90519f9ad441821c2a819429ad34f9011 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v5 2/2] Precalculate CompactAttribute's attcacheoff

Precalculating attcacheoff allows code to be removed from the tuple
deforming routines, which shrinks them a little and can make them run
more quickly.  This also adds a dedicated loop to deform the portion of
the tuple whose attributes have a known offset, which makes deforming
much faster when the leading columns of the table are non-NULL and of
fixed-width types.
---
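
Note for readers of this patch: the idea described in the commit message
can be sketched in isolation as below.  This is only an illustration
under stated assumptions, not the code in tupdesc.c.  SketchAttr,
sketch_finalize(), sketch_first_null() and sketch_cacheoff_attrs() are
hypothetical names invented for this note.  The sketch also assumes that
only a leading run of fixed-width attributes gets a cached offset, and
that a set bit in the NULL bitmap means "not NULL", as att_isnull()
treats it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct SketchAttr
{
	int16_t		attlen;			/* > 0 for fixed-width, -1 for varlena */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* precalculated offset, or -1 if unknown */
} SketchAttr;

/* Round "off" up to the next multiple of "alignby". */
static inline uint32_t
sketch_align(uint32_t off, uint32_t alignby)
{
	assert(alignby != 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assumed shape of a "finalize" step: set attcacheoff for the leading run
 * of fixed-width attributes, mark the rest as unknown, and return the
 * index of the first attribute without a cached offset.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	uint32_t	off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		if (attrs[i].attlen <= 0)
			break;				/* offsets after a varlena depend on the data */
		off = sketch_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = (int32_t) off;
		off += (uint32_t) attrs[i].attlen;
	}
	for (int j = i; j < natts; j++)
		attrs[j].attcacheoff = -1;
	return i;
}

/* Index of the first NULL attribute, or natts if there is none. */
static int
sketch_first_null(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 0x07))))
			return i;
	}
	return natts;
}

/*
 * Number of leading attributes that can be fetched purely from cached
 * offsets for one tuple: bounded by the descriptor, the tuple's attribute
 * count and the first NULL.  Pass bits == NULL for a tuple without NULLs.
 */
static int
sketch_cacheoff_attrs(int first_noncached, int natts, const uint8_t *bits)
{
	int			n = first_noncached < natts ? first_noncached : natts;

	if (bits != NULL)
	{
		int			firstnull = sketch_first_null(bits, natts);

		if (firstnull < n)
			n = firstnull;
	}
	return n;
}
```

With a descriptor of (int4, int2, int8, text, int4), the sketch caches
offsets 0, 4 and 8 and stops at the text column.  A tuple whose third
column is NULL can then only use the first two cached offsets; the
remaining columns have to be walked with alignment and length
calculations, which is what the later loops in the deforming functions
do.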
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 280 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  65 +++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 522 insertions(+), 640 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In assert-enabled builds, verify that attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column or, if there is none, set firstnullattr to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..89f18be5d82 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1151,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2205,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..fcaf6ad149f 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,70 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum << 3;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/*
+	 * Create a mask with all bits beyond natts's bit set to off.  This
+	 * assumes the code above will have found a 0-bit before we run off the
+	 * end of the bits array.  Tuples without any NULLs won't have a bitmask
+	 * to mark NULLs.
+	 */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte << 3;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
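
For readers following the first_null_attr() helper in the tupmacs.h hunk above, here is a rough standalone sketch of the same idea: scan the null bitmap a byte at a time, skip bytes that are all ones (no NULLs), and locate the lowest zero bit in the first byte that has one. This is not the patch's code. The names sketch_first_null and lowest_set_bit are made up for illustration, and a simple shift loop stands in for the pg_rightmost_one_pos lookup table so that the example compiles without any PostgreSQL headers. Unlike the patch, the sketch skips reading the final byte when natts is a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: position of the lowest set bit; 'v' must be nonzero. */
static int
lowest_set_bit(uint8_t v)
{
    int pos = 0;

    assert(v != 0);
    while ((v & 1) == 0)
    {
        v >>= 1;
        pos++;
    }
    return pos;
}

/*
 * Hypothetical stand-in for first_null_attr().  A set bit means "not NULL",
 * as in the heap tuple null bitmap.  Returns the 0-based index of the first
 * 0 bit among the first 'natts' bits, or 'natts' if there is none.
 */
static int
sketch_first_null(const uint8_t *bits, int natts)
{
    int lastbyte = natts >> 3;  /* whole bytes fully covered by natts */
    int rem = natts & 7;        /* bits used in the trailing partial byte */

    assert(natts >= 0);

    /* Whole bytes: any byte other than 0xFF contains a NULL. */
    for (int bytenum = 0; bytenum < lastbyte; bytenum++)
    {
        if (bits[bytenum] != 0xFF)
            return (bytenum << 3) + lowest_set_bit((uint8_t) ~bits[bytenum]);
    }

    /* Partial byte: ignore bits at or beyond natts. */
    if (rem != 0)
    {
        uint8_t mask = (uint8_t) ((1u << rem) - 1);
        uint8_t inv = (uint8_t) (~bits[lastbyte] & mask);

        if (inv != 0)
            return (lastbyte << 3) + lowest_set_bit(inv);
    }

    return natts;
}
```

With a two-byte bitmap of 0xFF, 0xFB the third bit of the second byte is clear, so asking about 16 attributes yields 10, while asking about only the first 10 attributes yields 10 as well, this time meaning "no NULL found".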



Attachments:

  [text/plain] v5-0001-Add-empty-TupleDescFinalize-function.patch (29.0K, 2-v5-0001-Add-empty-TupleDescFinalize-function.patch)
From e94ee4368acd1697fa0b08ae3b0dd1ccc51d18bf Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v5 1/2] Add empty TupleDescFinalize() function

The function currently does nothing; a later commit in this series gives
it a real implementation.
---
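
Note for readers: the follow-up patch has TupleDescFinalize() precompute attcacheoff for the leading run of fixed-width attributes and record where that run ends. Below is a minimal standalone sketch of that calculation, under the assumption that alignment rounds the offset up to a power-of-two boundary. SketchAttr, align_up and sketch_finalize are hypothetical names used only for illustration and are not part of PostgreSQL.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for CompactAttribute. */
typedef struct SketchAttr
{
    int attlen;       /* > 0 for fixed width, <= 0 for variable width */
    int attalignby;   /* required alignment in bytes, a power of two */
    int attcacheoff;  /* cached offset, or -1 when not known */
} SketchAttr;

/* Round 'off' up to the next multiple of 'alignby' (a power of two). */
static int
align_up(int off, int alignby)
{
    assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
    return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assign a cached offset to each leading fixed-width attribute and stop at
 * the first variable-width one.  Returns the index of the first attribute
 * left without a cached offset (natts if every attribute got one).
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
    int off = 0;
    int first_noncached = 0;

    for (int i = 0; i < natts; i++)
    {
        if (attrs[i].attlen <= 0)
            break;              /* offsets after this depend on tuple data */

        off = align_up(off, attrs[i].attalignby);
        attrs[i].attcacheoff = off;
        off += attrs[i].attlen;
        first_noncached = i + 1;
    }

    return first_noncached;
}
```

For a descriptor shaped like (int4, int8, int2, text, int4) this assigns offsets 0, 8 and 16 to the first three attributes and returns 3. The deforming code can then read those attributes straight from their cached offsets, provided no NULL precedes them in the tuple.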
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index dcba3fb5473..2fdf5a341f6 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 1ab09655a70..269b081bac0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -452,6 +452,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -497,6 +498,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -599,6 +601,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1016,6 +1019,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1370,6 +1374,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



  [text/plain] v5-0002-Precalculate-CompactAttribute-s-attcacheoff.patch (49.1K, 3-v5-0002-Precalculate-CompactAttribute-s-attcacheoff.patch)
From 3229f3c90519f9ad441821c2a819429ad34f9011 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v5 2/2] Precalculate CompactAttribute's attcacheoff

This allows the code that cached attribute offsets on the fly to be
removed from the tuple deform routines.  That shrinks those routines a
little, which can make them run more quickly.  This also adds a
dedicated deformer loop for the portion of the tuple whose attributes
have a known offset, which makes deforming much faster when a leading
set of the table's columns are non-NULL and of fixed-width types.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 280 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  65 +++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 522 insertions(+), 640 deletions(-)
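
As a rough illustration of the idea in the commit message (not the code in
this patch), the sketch below precalculates offsets for the leading
fixed-width columns of a toy descriptor. ToyAttr, ToyDesc, toy_align(),
toy_finalize() and toy_cached_offset() are all hypothetical names made up
for this example. The field names attlen, attalignby, attcacheoff and
firstNonCachedOffAttr are borrowed from the diff, but the semantics shown
here are a simplifying assumption: the walk stops at the first
non-fixed-width column and never caches an offset for it.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct ToyAttr
{
	int			attlen;			/* > 0 for fixed width, -1 for varlena */
	int			attalignby;		/* alignment in bytes, a power of two */
	int			attcacheoff;	/* precalculated offset, or -1 if unknown */
} ToyAttr;

/* Hypothetical, simplified stand-in for a tuple descriptor. */
typedef struct ToyDesc
{
	int			natts;
	int			firstNonCachedOffAttr;	/* -1 until toy_finalize() runs */
	ToyAttr		attrs[8];
} ToyDesc;

/* Round off up to the next multiple of alignby (a power of two). */
static int
toy_align(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assumed behaviour: walk the leading fixed-width columns once, storing
 * each one's aligned offset, and remember the index of the first column
 * that has no precalculated offset.
 */
static void
toy_finalize(ToyDesc *desc)
{
	int			off = 0;
	int			i;

	for (i = 0; i < desc->natts; i++)
	{
		ToyAttr    *att = &desc->attrs[i];

		if (att->attlen <= 0)
			break;				/* offsets beyond here depend on the tuple */

		off = toy_align(off, att->attalignby);
		att->attcacheoff = off;
		off += att->attlen;
	}

	desc->firstNonCachedOffAttr = i;

	/* Everything from the first variable-width column on is unknown. */
	for (; i < desc->natts; i++)
		desc->attrs[i].attcacheoff = -1;
}

/*
 * Return the precalculated offset for a zero-based attnum, or -1 when the
 * caller would have to walk the tuple instead.
 */
static int
toy_cached_offset(const ToyDesc *desc, int attnum)
{
	/* Mirrors the "forgot to finalize?" Assert added by the patch. */
	assert(desc->firstNonCachedOffAttr >= 0);

	if (attnum < desc->firstNonCachedOffAttr)
		return desc->attrs[attnum].attcacheoff;
	return -1;
}
```

With columns shaped like (int4, int2, int8, text, int4), this toy version
stores offsets 0, 4 and 8 for the first three columns and sets
firstNonCachedOffAttr to 3. A lookup for either of the last two columns
returns -1, which corresponds to the case where a deformer has to walk the
tuple data.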

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none, set the first NULL to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..89f18be5d82 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1151,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2205,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index e6df8264750..fcaf6ad149f 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,70 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	uint8		mask;
+	int			res = natts;
+	uint8		byte;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (int bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		if (bits[bytenum] != 0xFF)
+		{
+			byte = ~bits[bytenum];
+			res = bytenum << 3;
+			res += pg_rightmost_one_pos[byte];
+
+			Assert(res == firstnull_check);
+			return res;
+		}
+	}
+
+	/*
+	 * Create a mask with all bits beyond natts's bit set to off.  This
+	 * assumes the code above will have found a 0-bit before we run off the
+	 * end of the bits array.  Tuples without any NULLs won't have a bitmask
+	 * to mark NULLs.
+	 */
+	mask = 0xFF & ((((uint8) 1) << (natts & 7)) - 1);
+	byte = (~bits[lastByte]) & mask;
+
+	if (byte != 0)
+	{
+		res = lastByte << 3;
+		res += pg_rightmost_one_pos[byte];
+	}
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0




* Re: More speedups for tuple deformation
@ 2026-01-23 01:18  Andres Freund <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: Andres Freund @ 2026-01-23 01:18 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

I haven't yet looked at the new version of the patch, but I ran your benchmark
from upthread (fwiw, I removed the sleep 10 to reduce runtimes; the results
seem stable enough anyway) on two Intel machines, as you mentioned that you
saw a lot of variation in Azure.

For both I disabled turbo boost and CPU idling, and pinned the backend to a
single CPU core.

There's a bit of noise on "awork3" (basically an editor and an idle browser
window), but everything is pinned to the other socket. "awork4" is entirely
idle.


Looks like overall the results are quite impressive!  Some of the extra_cols=0
runs on Sapphire Rapids are a bit slower, but the losses are much smaller than
the gains in other cases.


I think it'd be good to add a few test cases of "incremental deforming" to the
benchmark. E.g. a qual that accesses column 10, but projection then deforms up
to 20.  I'm a bit worried that e.g. the repeated first_null_attr()
computations could cause regressions.
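
To make the concern about repeated first_null_attr() calls concrete, here is a
minimal standalone sketch of the kind of bitmap scan involved. It is not the
patch's code: the names first_null_attr_sketch and rightmost_one_pos are made
up for illustration, a plain loop stands in for the pg_rightmost_one_pos
lookup table, and unlike the patch it does not read the trailing byte at all
when natts is a multiple of 8. As in the heap tuple NULL bitmap, a set bit
means the attribute is not NULL.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest set bit; the caller must pass a non-zero byte. */
static int
rightmost_one_pos(uint8_t byte)
{
	int			pos = 0;

	while ((byte & 1) == 0)
	{
		byte >>= 1;
		pos++;
	}
	return pos;
}

/*
 * Return the 0-based index of the first NULL attribute, or natts if none.
 * A set bit in 'bits' means "not NULL", a clear bit means NULL.
 * (Hypothetical helper, written for illustration only.)
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	int			res = natts;

	/* Whole bytes: any byte other than 0xFF contains at least one NULL. */
	for (int bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
		{
			uint8_t		inverted = (uint8_t) ~bits[bytenum];

			res = (bytenum << 3) + rightmost_one_pos(inverted);
			break;
		}
	}

	/* Trailing partial byte: ignore bits at or beyond natts. */
	if (res == natts && (natts & 7) != 0)
	{
		uint8_t		mask = (uint8_t) ((1u << (natts & 7)) - 1);
		uint8_t		inverted = (uint8_t) (~bits[lastByte] & mask);

		if (inverted != 0)
			res = (lastByte << 3) + rightmost_one_pos(inverted);
	}

#ifndef NDEBUG
	{
		/* Cross-check against the obvious bit-at-a-time loop. */
		int			slow = natts;

		for (int i = 0; i < natts; i++)
		{
			if ((bits[i >> 3] & (1 << (i & 7))) == 0)
			{
				slow = i;
				break;
			}
		}
		assert(res == slow);
	}
#endif

	return res;
}
```

For example, with the two bitmap bytes 0xFF 0xFB and natts = 16, the sketch
returns 10. The scan costs roughly one byte comparison per 8 attributes, which
is the work that would be redone if it ran again on each incremental deform
call for the same tuple.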


Greetings,

Andres Freund


Attachments:

  [text/csv] deform_bench.csv (20.9K, 2-deform_bench.csv)
  download | inline:
awork3-cascade-lake,master,gcc,1,0,1,102.650
awork3-cascade-lake,master,gcc,1,0,2,102.049
awork3-cascade-lake,master,gcc,1,0,3,102.209
awork3-cascade-lake,master,gcc,1,10,1,148.652
awork3-cascade-lake,master,gcc,1,10,2,148.552
awork3-cascade-lake,master,gcc,1,10,3,148.710
awork3-cascade-lake,master,gcc,1,20,1,197.771
awork3-cascade-lake,master,gcc,1,20,2,197.502
awork3-cascade-lake,master,gcc,1,20,3,197.700
awork3-cascade-lake,master,gcc,1,30,1,256.406
awork3-cascade-lake,master,gcc,1,30,2,256.548
awork3-cascade-lake,master,gcc,1,30,3,257.026
awork3-cascade-lake,master,gcc,1,40,1,303.904
awork3-cascade-lake,master,gcc,1,40,2,303.547
awork3-cascade-lake,master,gcc,1,40,3,304.060
awork3-cascade-lake,master,gcc,2,0,1,103.550
awork3-cascade-lake,master,gcc,2,0,2,104.276
awork3-cascade-lake,master,gcc,2,0,3,103.948
awork3-cascade-lake,master,gcc,2,10,1,163.866
awork3-cascade-lake,master,gcc,2,10,2,163.719
awork3-cascade-lake,master,gcc,2,10,3,163.776
awork3-cascade-lake,master,gcc,2,20,1,208.907
awork3-cascade-lake,master,gcc,2,20,2,208.849
awork3-cascade-lake,master,gcc,2,20,3,208.797
awork3-cascade-lake,master,gcc,2,30,1,263.781
awork3-cascade-lake,master,gcc,2,30,2,263.719
awork3-cascade-lake,master,gcc,2,30,3,263.518
awork3-cascade-lake,master,gcc,2,40,1,323.905
awork3-cascade-lake,master,gcc,2,40,2,325.130
awork3-cascade-lake,master,gcc,2,40,3,324.237
awork3-cascade-lake,master,gcc,3,0,1,102.511
awork3-cascade-lake,master,gcc,3,0,2,102.537
awork3-cascade-lake,master,gcc,3,0,3,102.270
awork3-cascade-lake,master,gcc,3,10,1,169.438
awork3-cascade-lake,master,gcc,3,10,2,169.384
awork3-cascade-lake,master,gcc,3,10,3,169.396
awork3-cascade-lake,master,gcc,3,20,1,236.111
awork3-cascade-lake,master,gcc,3,20,2,236.142
awork3-cascade-lake,master,gcc,3,20,3,235.937
awork3-cascade-lake,master,gcc,3,30,1,314.902
awork3-cascade-lake,master,gcc,3,30,2,314.745
awork3-cascade-lake,master,gcc,3,30,3,314.748
awork3-cascade-lake,master,gcc,3,40,1,385.589
awork3-cascade-lake,master,gcc,3,40,2,383.495
awork3-cascade-lake,master,gcc,3,40,3,383.660
awork3-cascade-lake,master,gcc,4,0,1,102.488
awork3-cascade-lake,master,gcc,4,0,2,102.428
awork3-cascade-lake,master,gcc,4,0,3,102.356
awork3-cascade-lake,master,gcc,4,10,1,169.802
awork3-cascade-lake,master,gcc,4,10,2,169.551
awork3-cascade-lake,master,gcc,4,10,3,169.612
awork3-cascade-lake,master,gcc,4,20,1,236.433
awork3-cascade-lake,master,gcc,4,20,2,236.109
awork3-cascade-lake,master,gcc,4,20,3,235.870
awork3-cascade-lake,master,gcc,4,30,1,314.830
awork3-cascade-lake,master,gcc,4,30,2,315.097
awork3-cascade-lake,master,gcc,4,30,3,314.643
awork3-cascade-lake,master,gcc,4,40,1,383.686
awork3-cascade-lake,master,gcc,4,40,2,383.826
awork3-cascade-lake,master,gcc,4,40,3,383.218
awork3-cascade-lake,master,gcc,5,0,1,109.376
awork3-cascade-lake,master,gcc,5,0,2,109.395
awork3-cascade-lake,master,gcc,5,0,3,109.258
awork3-cascade-lake,master,gcc,5,10,1,193.453
awork3-cascade-lake,master,gcc,5,10,2,193.474
awork3-cascade-lake,master,gcc,5,10,3,193.560
awork3-cascade-lake,master,gcc,5,20,1,278.782
awork3-cascade-lake,master,gcc,5,20,2,278.905
awork3-cascade-lake,master,gcc,5,20,3,278.757
awork3-cascade-lake,master,gcc,5,30,1,375.193
awork3-cascade-lake,master,gcc,5,30,2,375.108
awork3-cascade-lake,master,gcc,5,30,3,375.723
awork3-cascade-lake,master,gcc,5,40,1,456.812
awork3-cascade-lake,master,gcc,5,40,2,457.320
awork3-cascade-lake,master,gcc,5,40,3,457.187
awork3-cascade-lake,master,gcc,6,0,1,109.912
awork3-cascade-lake,master,gcc,6,0,2,109.890
awork3-cascade-lake,master,gcc,6,0,3,110.041
awork3-cascade-lake,master,gcc,6,10,1,177.925
awork3-cascade-lake,master,gcc,6,10,2,177.814
awork3-cascade-lake,master,gcc,6,10,3,177.833
awork3-cascade-lake,master,gcc,6,20,1,244.359
awork3-cascade-lake,master,gcc,6,20,2,243.971
awork3-cascade-lake,master,gcc,6,20,3,244.341
awork3-cascade-lake,master,gcc,6,30,1,321.747
awork3-cascade-lake,master,gcc,6,30,2,322.574
awork3-cascade-lake,master,gcc,6,30,3,321.898
awork3-cascade-lake,master,gcc,6,40,1,388.915
awork3-cascade-lake,master,gcc,6,40,2,390.149
awork3-cascade-lake,master,gcc,6,40,3,388.845
awork3-cascade-lake,master,gcc,7,0,1,100.884
awork3-cascade-lake,master,gcc,7,0,2,101.023
awork3-cascade-lake,master,gcc,7,0,3,100.913
awork3-cascade-lake,master,gcc,7,10,1,168.613
awork3-cascade-lake,master,gcc,7,10,2,168.679
awork3-cascade-lake,master,gcc,7,10,3,168.570
awork3-cascade-lake,master,gcc,7,20,1,237.352
awork3-cascade-lake,master,gcc,7,20,2,237.407
awork3-cascade-lake,master,gcc,7,20,3,237.578
awork3-cascade-lake,master,gcc,7,30,1,313.746
awork3-cascade-lake,master,gcc,7,30,2,314.113
awork3-cascade-lake,master,gcc,7,30,3,314.235
awork3-cascade-lake,master,gcc,7,40,1,380.827
awork3-cascade-lake,master,gcc,7,40,2,380.847
awork3-cascade-lake,master,gcc,7,40,3,380.915
awork3-cascade-lake,master,gcc,8,0,1,100.806
awork3-cascade-lake,master,gcc,8,0,2,100.807
awork3-cascade-lake,master,gcc,8,0,3,100.882
awork3-cascade-lake,master,gcc,8,10,1,168.653
awork3-cascade-lake,master,gcc,8,10,2,168.618
awork3-cascade-lake,master,gcc,8,10,3,168.632
awork3-cascade-lake,master,gcc,8,20,1,237.841
awork3-cascade-lake,master,gcc,8,20,2,237.253
awork3-cascade-lake,master,gcc,8,20,3,237.121
awork3-cascade-lake,master,gcc,8,30,1,314.998
awork3-cascade-lake,master,gcc,8,30,2,314.320
awork3-cascade-lake,master,gcc,8,30,3,314.284
awork3-cascade-lake,master,gcc,8,40,1,380.858
awork3-cascade-lake,master,gcc,8,40,2,381.088
awork3-cascade-lake,master,gcc,8,40,3,380.795
awork3-cascade-lake,v5,gcc,1,0,1,93.341
awork3-cascade-lake,v5,gcc,1,0,2,93.282
awork3-cascade-lake,v5,gcc,1,0,3,93.519
awork3-cascade-lake,v5,gcc,1,10,1,113.467
awork3-cascade-lake,v5,gcc,1,10,2,113.401
awork3-cascade-lake,v5,gcc,1,10,3,113.363
awork3-cascade-lake,v5,gcc,1,20,1,137.208
awork3-cascade-lake,v5,gcc,1,20,2,137.146
awork3-cascade-lake,v5,gcc,1,20,3,137.200
awork3-cascade-lake,v5,gcc,1,30,1,164.974
awork3-cascade-lake,v5,gcc,1,30,2,164.901
awork3-cascade-lake,v5,gcc,1,30,3,164.954
awork3-cascade-lake,v5,gcc,1,40,1,187.637
awork3-cascade-lake,v5,gcc,1,40,2,187.698
awork3-cascade-lake,v5,gcc,1,40,3,187.692
awork3-cascade-lake,v5,gcc,2,0,1,100.984
awork3-cascade-lake,v5,gcc,2,0,2,101.437
awork3-cascade-lake,v5,gcc,2,0,3,101.098
awork3-cascade-lake,v5,gcc,2,10,1,152.717
awork3-cascade-lake,v5,gcc,2,10,2,152.662
awork3-cascade-lake,v5,gcc,2,10,3,152.645
awork3-cascade-lake,v5,gcc,2,20,1,205.803
awork3-cascade-lake,v5,gcc,2,20,2,205.956
awork3-cascade-lake,v5,gcc,2,20,3,205.722
awork3-cascade-lake,v5,gcc,2,30,1,274.158
awork3-cascade-lake,v5,gcc,2,30,2,274.026
awork3-cascade-lake,v5,gcc,2,30,3,273.824
awork3-cascade-lake,v5,gcc,2,40,1,326.507
awork3-cascade-lake,v5,gcc,2,40,2,324.092
awork3-cascade-lake,v5,gcc,2,40,3,324.032
awork3-cascade-lake,v5,gcc,3,0,1,102.849
awork3-cascade-lake,v5,gcc,3,0,2,102.841
awork3-cascade-lake,v5,gcc,3,0,3,102.660
awork3-cascade-lake,v5,gcc,3,10,1,169.825
awork3-cascade-lake,v5,gcc,3,10,2,169.792
awork3-cascade-lake,v5,gcc,3,10,3,169.859
awork3-cascade-lake,v5,gcc,3,20,1,233.059
awork3-cascade-lake,v5,gcc,3,20,2,233.005
awork3-cascade-lake,v5,gcc,3,20,3,233.260
awork3-cascade-lake,v5,gcc,3,30,1,309.879
awork3-cascade-lake,v5,gcc,3,30,2,309.918
awork3-cascade-lake,v5,gcc,3,30,3,310.019
awork3-cascade-lake,v5,gcc,3,40,1,372.825
awork3-cascade-lake,v5,gcc,3,40,2,373.586
awork3-cascade-lake,v5,gcc,3,40,3,372.628
awork3-cascade-lake,v5,gcc,4,0,1,102.875
awork3-cascade-lake,v5,gcc,4,0,2,102.962
awork3-cascade-lake,v5,gcc,4,0,3,102.864
awork3-cascade-lake,v5,gcc,4,10,1,169.866
awork3-cascade-lake,v5,gcc,4,10,2,170.291
awork3-cascade-lake,v5,gcc,4,10,3,169.958
awork3-cascade-lake,v5,gcc,4,20,1,233.179
awork3-cascade-lake,v5,gcc,4,20,2,234.101
awork3-cascade-lake,v5,gcc,4,20,3,232.956
awork3-cascade-lake,v5,gcc,4,30,1,311.703
awork3-cascade-lake,v5,gcc,4,30,2,310.171
awork3-cascade-lake,v5,gcc,4,30,3,310.225
awork3-cascade-lake,v5,gcc,4,40,1,373.302
awork3-cascade-lake,v5,gcc,4,40,2,372.687
awork3-cascade-lake,v5,gcc,4,40,3,372.817
awork3-cascade-lake,v5,gcc,5,0,1,93.675
awork3-cascade-lake,v5,gcc,5,0,2,93.167
awork3-cascade-lake,v5,gcc,5,0,3,93.179
awork3-cascade-lake,v5,gcc,5,10,1,124.256
awork3-cascade-lake,v5,gcc,5,10,2,124.155
awork3-cascade-lake,v5,gcc,5,10,3,124.269
awork3-cascade-lake,v5,gcc,5,20,1,147.759
awork3-cascade-lake,v5,gcc,5,20,2,147.719
awork3-cascade-lake,v5,gcc,5,20,3,147.850
awork3-cascade-lake,v5,gcc,5,30,1,175.232
awork3-cascade-lake,v5,gcc,5,30,2,175.089
awork3-cascade-lake,v5,gcc,5,30,3,175.593
awork3-cascade-lake,v5,gcc,5,40,1,198.088
awork3-cascade-lake,v5,gcc,5,40,2,198.110
awork3-cascade-lake,v5,gcc,5,40,3,198.086
awork3-cascade-lake,v5,gcc,6,0,1,103.652
awork3-cascade-lake,v5,gcc,6,0,2,103.412
awork3-cascade-lake,v5,gcc,6,0,3,103.690
awork3-cascade-lake,v5,gcc,6,10,1,160.879
awork3-cascade-lake,v5,gcc,6,10,2,161.585
awork3-cascade-lake,v5,gcc,6,10,3,160.882
awork3-cascade-lake,v5,gcc,6,20,1,213.919
awork3-cascade-lake,v5,gcc,6,20,2,214.943
awork3-cascade-lake,v5,gcc,6,20,3,213.981
awork3-cascade-lake,v5,gcc,6,30,1,282.677
awork3-cascade-lake,v5,gcc,6,30,2,282.337
awork3-cascade-lake,v5,gcc,6,30,3,282.354
awork3-cascade-lake,v5,gcc,6,40,1,334.232
awork3-cascade-lake,v5,gcc,6,40,2,333.409
awork3-cascade-lake,v5,gcc,6,40,3,333.569
awork3-cascade-lake,v5,gcc,7,0,1,101.528
awork3-cascade-lake,v5,gcc,7,0,2,101.499
awork3-cascade-lake,v5,gcc,7,0,3,101.428
awork3-cascade-lake,v5,gcc,7,10,1,169.475
awork3-cascade-lake,v5,gcc,7,10,2,168.877
awork3-cascade-lake,v5,gcc,7,10,3,168.998
awork3-cascade-lake,v5,gcc,7,20,1,233.678
awork3-cascade-lake,v5,gcc,7,20,2,233.665
awork3-cascade-lake,v5,gcc,7,20,3,233.907
awork3-cascade-lake,v5,gcc,7,30,1,308.277
awork3-cascade-lake,v5,gcc,7,30,2,308.150
awork3-cascade-lake,v5,gcc,7,30,3,308.221
awork3-cascade-lake,v5,gcc,7,40,1,373.200
awork3-cascade-lake,v5,gcc,7,40,2,373.004
awork3-cascade-lake,v5,gcc,7,40,3,373.514
awork3-cascade-lake,v5,gcc,8,0,1,101.520
awork3-cascade-lake,v5,gcc,8,0,2,101.726
awork3-cascade-lake,v5,gcc,8,0,3,101.406
awork3-cascade-lake,v5,gcc,8,10,1,168.892
awork3-cascade-lake,v5,gcc,8,10,2,170.018
awork3-cascade-lake,v5,gcc,8,10,3,169.014
awork3-cascade-lake,v5,gcc,8,20,1,233.720
awork3-cascade-lake,v5,gcc,8,20,2,233.910
awork3-cascade-lake,v5,gcc,8,20,3,233.864
awork3-cascade-lake,v5,gcc,8,30,1,308.034
awork3-cascade-lake,v5,gcc,8,30,2,308.497
awork3-cascade-lake,v5,gcc,8,30,3,308.009
awork3-cascade-lake,v5,gcc,8,40,1,373.269
awork3-cascade-lake,v5,gcc,8,40,2,372.991
awork3-cascade-lake,v5,gcc,8,40,3,373.167
awork4-saphire-rapids,master,gcc,1,0,1,46.921
awork4-saphire-rapids,master,gcc,1,0,2,46.943
awork4-saphire-rapids,master,gcc,1,0,3,46.922
awork4-saphire-rapids,master,gcc,1,10,1,67.804
awork4-saphire-rapids,master,gcc,1,10,2,67.818
awork4-saphire-rapids,master,gcc,1,10,3,67.841
awork4-saphire-rapids,master,gcc,1,20,1,89.228
awork4-saphire-rapids,master,gcc,1,20,2,89.266
awork4-saphire-rapids,master,gcc,1,20,3,89.330
awork4-saphire-rapids,master,gcc,1,30,1,117.678
awork4-saphire-rapids,master,gcc,1,30,2,117.568
awork4-saphire-rapids,master,gcc,1,30,3,117.620
awork4-saphire-rapids,master,gcc,1,40,1,164.346
awork4-saphire-rapids,master,gcc,1,40,2,164.296
awork4-saphire-rapids,master,gcc,1,40,3,164.619
awork4-saphire-rapids,master,gcc,2,0,1,49.315
awork4-saphire-rapids,master,gcc,2,0,2,49.525
awork4-saphire-rapids,master,gcc,2,0,3,49.383
awork4-saphire-rapids,master,gcc,2,10,1,74.775
awork4-saphire-rapids,master,gcc,2,10,2,75.084
awork4-saphire-rapids,master,gcc,2,10,3,75.147
awork4-saphire-rapids,master,gcc,2,20,1,99.558
awork4-saphire-rapids,master,gcc,2,20,2,99.898
awork4-saphire-rapids,master,gcc,2,20,3,99.827
awork4-saphire-rapids,master,gcc,2,30,1,131.390
awork4-saphire-rapids,master,gcc,2,30,2,131.531
awork4-saphire-rapids,master,gcc,2,30,3,131.521
awork4-saphire-rapids,master,gcc,2,40,1,176.671
awork4-saphire-rapids,master,gcc,2,40,2,176.692
awork4-saphire-rapids,master,gcc,2,40,3,176.798
awork4-saphire-rapids,master,gcc,3,0,1,49.054
awork4-saphire-rapids,master,gcc,3,0,2,49.058
awork4-saphire-rapids,master,gcc,3,0,3,49.280
awork4-saphire-rapids,master,gcc,3,10,1,80.401
awork4-saphire-rapids,master,gcc,3,10,2,80.372
awork4-saphire-rapids,master,gcc,3,10,3,80.630
awork4-saphire-rapids,master,gcc,3,20,1,109.197
awork4-saphire-rapids,master,gcc,3,20,2,109.329
awork4-saphire-rapids,master,gcc,3,20,3,109.330
awork4-saphire-rapids,master,gcc,3,30,1,147.717
awork4-saphire-rapids,master,gcc,3,30,2,147.749
awork4-saphire-rapids,master,gcc,3,30,3,147.727
awork4-saphire-rapids,master,gcc,3,40,1,181.498
awork4-saphire-rapids,master,gcc,3,40,2,181.185
awork4-saphire-rapids,master,gcc,3,40,3,181.282
awork4-saphire-rapids,master,gcc,4,0,1,49.048
awork4-saphire-rapids,master,gcc,4,0,2,49.057
awork4-saphire-rapids,master,gcc,4,0,3,49.257
awork4-saphire-rapids,master,gcc,4,10,1,80.744
awork4-saphire-rapids,master,gcc,4,10,2,80.438
awork4-saphire-rapids,master,gcc,4,10,3,80.418
awork4-saphire-rapids,master,gcc,4,20,1,109.221
awork4-saphire-rapids,master,gcc,4,20,2,109.387
awork4-saphire-rapids,master,gcc,4,20,3,109.401
awork4-saphire-rapids,master,gcc,4,30,1,147.696
awork4-saphire-rapids,master,gcc,4,30,2,147.764
awork4-saphire-rapids,master,gcc,4,30,3,148.171
awork4-saphire-rapids,master,gcc,4,40,1,181.435
awork4-saphire-rapids,master,gcc,4,40,2,181.512
awork4-saphire-rapids,master,gcc,4,40,3,181.331
awork4-saphire-rapids,master,gcc,5,0,1,51.798
awork4-saphire-rapids,master,gcc,5,0,2,51.814
awork4-saphire-rapids,master,gcc,5,0,3,51.846
awork4-saphire-rapids,master,gcc,5,10,1,91.097
awork4-saphire-rapids,master,gcc,5,10,2,91.174
awork4-saphire-rapids,master,gcc,5,10,3,91.049
awork4-saphire-rapids,master,gcc,5,20,1,131.503
awork4-saphire-rapids,master,gcc,5,20,2,131.154
awork4-saphire-rapids,master,gcc,5,20,3,131.399
awork4-saphire-rapids,master,gcc,5,30,1,181.891
awork4-saphire-rapids,master,gcc,5,30,2,181.035
awork4-saphire-rapids,master,gcc,5,30,3,180.962
awork4-saphire-rapids,master,gcc,5,40,1,223.097
awork4-saphire-rapids,master,gcc,5,40,2,223.026
awork4-saphire-rapids,master,gcc,5,40,3,223.206
awork4-saphire-rapids,master,gcc,6,0,1,51.862
awork4-saphire-rapids,master,gcc,6,0,2,51.854
awork4-saphire-rapids,master,gcc,6,0,3,51.859
awork4-saphire-rapids,master,gcc,6,10,1,83.205
awork4-saphire-rapids,master,gcc,6,10,2,83.267
awork4-saphire-rapids,master,gcc,6,10,3,83.772
awork4-saphire-rapids,master,gcc,6,20,1,112.224
awork4-saphire-rapids,master,gcc,6,20,2,112.131
awork4-saphire-rapids,master,gcc,6,20,3,112.146
awork4-saphire-rapids,master,gcc,6,30,1,151.210
awork4-saphire-rapids,master,gcc,6,30,2,151.174
awork4-saphire-rapids,master,gcc,6,30,3,151.229
awork4-saphire-rapids,master,gcc,6,40,1,185.426
awork4-saphire-rapids,master,gcc,6,40,2,185.111
awork4-saphire-rapids,master,gcc,6,40,3,185.036
awork4-saphire-rapids,master,gcc,7,0,1,48.959
awork4-saphire-rapids,master,gcc,7,0,2,48.945
awork4-saphire-rapids,master,gcc,7,0,3,48.947
awork4-saphire-rapids,master,gcc,7,10,1,80.048
awork4-saphire-rapids,master,gcc,7,10,2,80.427
awork4-saphire-rapids,master,gcc,7,10,3,80.132
awork4-saphire-rapids,master,gcc,7,20,1,109.369
awork4-saphire-rapids,master,gcc,7,20,2,109.421
awork4-saphire-rapids,master,gcc,7,20,3,109.491
awork4-saphire-rapids,master,gcc,7,30,1,146.060
awork4-saphire-rapids,master,gcc,7,30,2,146.426
awork4-saphire-rapids,master,gcc,7,30,3,146.384
awork4-saphire-rapids,master,gcc,7,40,1,182.010
awork4-saphire-rapids,master,gcc,7,40,2,181.952
awork4-saphire-rapids,master,gcc,7,40,3,182.047
awork4-saphire-rapids,master,gcc,8,0,1,48.961
awork4-saphire-rapids,master,gcc,8,0,2,48.976
awork4-saphire-rapids,master,gcc,8,0,3,48.979
awork4-saphire-rapids,master,gcc,8,10,1,80.258
awork4-saphire-rapids,master,gcc,8,10,2,80.078
awork4-saphire-rapids,master,gcc,8,10,3,80.137
awork4-saphire-rapids,master,gcc,8,20,1,109.321
awork4-saphire-rapids,master,gcc,8,20,2,109.377
awork4-saphire-rapids,master,gcc,8,20,3,109.432
awork4-saphire-rapids,master,gcc,8,30,1,146.321
awork4-saphire-rapids,master,gcc,8,30,2,146.334
awork4-saphire-rapids,master,gcc,8,30,3,146.820
awork4-saphire-rapids,master,gcc,8,40,1,181.935
awork4-saphire-rapids,master,gcc,8,40,2,182.043
awork4-saphire-rapids,master,gcc,8,40,3,182.115
awork4-saphire-rapids,v5,gcc,1,0,1,47.179
awork4-saphire-rapids,v5,gcc,1,0,2,47.061
awork4-saphire-rapids,v5,gcc,1,0,3,47.032
awork4-saphire-rapids,v5,gcc,1,10,1,60.513
awork4-saphire-rapids,v5,gcc,1,10,2,60.584
awork4-saphire-rapids,v5,gcc,1,10,3,60.587
awork4-saphire-rapids,v5,gcc,1,20,1,76.657
awork4-saphire-rapids,v5,gcc,1,20,2,76.676
awork4-saphire-rapids,v5,gcc,1,20,3,76.689
awork4-saphire-rapids,v5,gcc,1,30,1,96.932
awork4-saphire-rapids,v5,gcc,1,30,2,96.574
awork4-saphire-rapids,v5,gcc,1,30,3,96.958
awork4-saphire-rapids,v5,gcc,1,40,1,118.203
awork4-saphire-rapids,v5,gcc,1,40,2,118.456
awork4-saphire-rapids,v5,gcc,1,40,3,118.500
awork4-saphire-rapids,v5,gcc,2,0,1,50.122
awork4-saphire-rapids,v5,gcc,2,0,2,50.104
awork4-saphire-rapids,v5,gcc,2,0,3,50.123
awork4-saphire-rapids,v5,gcc,2,10,1,73.909
awork4-saphire-rapids,v5,gcc,2,10,2,74.049
awork4-saphire-rapids,v5,gcc,2,10,3,74.066
awork4-saphire-rapids,v5,gcc,2,20,1,99.370
awork4-saphire-rapids,v5,gcc,2,20,2,99.465
awork4-saphire-rapids,v5,gcc,2,20,3,99.499
awork4-saphire-rapids,v5,gcc,2,30,1,129.085
awork4-saphire-rapids,v5,gcc,2,30,2,128.626
awork4-saphire-rapids,v5,gcc,2,30,3,129.107
awork4-saphire-rapids,v5,gcc,2,40,1,181.695
awork4-saphire-rapids,v5,gcc,2,40,2,181.273
awork4-saphire-rapids,v5,gcc,2,40,3,181.333
awork4-saphire-rapids,v5,gcc,3,0,1,52.065
awork4-saphire-rapids,v5,gcc,3,0,2,52.063
awork4-saphire-rapids,v5,gcc,3,0,3,52.061
awork4-saphire-rapids,v5,gcc,3,10,1,82.646
awork4-saphire-rapids,v5,gcc,3,10,2,82.692
awork4-saphire-rapids,v5,gcc,3,10,3,82.761
awork4-saphire-rapids,v5,gcc,3,20,1,111.720
awork4-saphire-rapids,v5,gcc,3,20,2,111.795
awork4-saphire-rapids,v5,gcc,3,20,3,111.933
awork4-saphire-rapids,v5,gcc,3,30,1,147.043
awork4-saphire-rapids,v5,gcc,3,30,2,147.037
awork4-saphire-rapids,v5,gcc,3,30,3,147.298
awork4-saphire-rapids,v5,gcc,3,40,1,178.232
awork4-saphire-rapids,v5,gcc,3,40,2,178.079
awork4-saphire-rapids,v5,gcc,3,40,3,178.608
awork4-saphire-rapids,v5,gcc,4,0,1,52.093
awork4-saphire-rapids,v5,gcc,4,0,2,52.072
awork4-saphire-rapids,v5,gcc,4,0,3,52.078
awork4-saphire-rapids,v5,gcc,4,10,1,82.694
awork4-saphire-rapids,v5,gcc,4,10,2,82.705
awork4-saphire-rapids,v5,gcc,4,10,3,82.701
awork4-saphire-rapids,v5,gcc,4,20,1,112.105
awork4-saphire-rapids,v5,gcc,4,20,2,112.012
awork4-saphire-rapids,v5,gcc,4,20,3,112.031
awork4-saphire-rapids,v5,gcc,4,30,1,146.771
awork4-saphire-rapids,v5,gcc,4,30,2,147.193
awork4-saphire-rapids,v5,gcc,4,30,3,147.198
awork4-saphire-rapids,v5,gcc,4,40,1,178.500
awork4-saphire-rapids,v5,gcc,4,40,2,178.633
awork4-saphire-rapids,v5,gcc,4,40,3,178.191
awork4-saphire-rapids,v5,gcc,5,0,1,47.865
awork4-saphire-rapids,v5,gcc,5,0,2,47.866
awork4-saphire-rapids,v5,gcc,5,0,3,47.871
awork4-saphire-rapids,v5,gcc,5,10,1,64.176
awork4-saphire-rapids,v5,gcc,5,10,2,64.261
awork4-saphire-rapids,v5,gcc,5,10,3,64.223
awork4-saphire-rapids,v5,gcc,5,20,1,79.541
awork4-saphire-rapids,v5,gcc,5,20,2,79.783
awork4-saphire-rapids,v5,gcc,5,20,3,79.678
awork4-saphire-rapids,v5,gcc,5,30,1,101.540
awork4-saphire-rapids,v5,gcc,5,30,2,101.579
awork4-saphire-rapids,v5,gcc,5,30,3,101.607
awork4-saphire-rapids,v5,gcc,5,40,1,121.498
awork4-saphire-rapids,v5,gcc,5,40,2,121.535
awork4-saphire-rapids,v5,gcc,5,40,3,121.511
awork4-saphire-rapids,v5,gcc,6,0,1,50.831
awork4-saphire-rapids,v5,gcc,6,0,2,50.821
awork4-saphire-rapids,v5,gcc,6,0,3,50.810
awork4-saphire-rapids,v5,gcc,6,10,1,77.795
awork4-saphire-rapids,v5,gcc,6,10,2,77.941
awork4-saphire-rapids,v5,gcc,6,10,3,77.849
awork4-saphire-rapids,v5,gcc,6,20,1,102.979
awork4-saphire-rapids,v5,gcc,6,20,2,103.029
awork4-saphire-rapids,v5,gcc,6,20,3,102.986
awork4-saphire-rapids,v5,gcc,6,30,1,136.245
awork4-saphire-rapids,v5,gcc,6,30,2,135.758
awork4-saphire-rapids,v5,gcc,6,30,3,136.043
awork4-saphire-rapids,v5,gcc,6,40,1,180.620
awork4-saphire-rapids,v5,gcc,6,40,2,180.158
awork4-saphire-rapids,v5,gcc,6,40,3,180.346
awork4-saphire-rapids,v5,gcc,7,0,1,51.187
awork4-saphire-rapids,v5,gcc,7,0,2,51.213
awork4-saphire-rapids,v5,gcc,7,0,3,51.237
awork4-saphire-rapids,v5,gcc,7,10,1,82.384
awork4-saphire-rapids,v5,gcc,7,10,2,82.492
awork4-saphire-rapids,v5,gcc,7,10,3,82.508
awork4-saphire-rapids,v5,gcc,7,20,1,111.640
awork4-saphire-rapids,v5,gcc,7,20,2,111.555
awork4-saphire-rapids,v5,gcc,7,20,3,111.754
awork4-saphire-rapids,v5,gcc,7,30,1,145.710
awork4-saphire-rapids,v5,gcc,7,30,2,146.150
awork4-saphire-rapids,v5,gcc,7,30,3,146.106
awork4-saphire-rapids,v5,gcc,7,40,1,181.022
awork4-saphire-rapids,v5,gcc,7,40,2,181.057
awork4-saphire-rapids,v5,gcc,7,40,3,180.671
awork4-saphire-rapids,v5,gcc,8,0,1,51.177
awork4-saphire-rapids,v5,gcc,8,0,2,51.184
awork4-saphire-rapids,v5,gcc,8,0,3,51.224
awork4-saphire-rapids,v5,gcc,8,10,1,82.421
awork4-saphire-rapids,v5,gcc,8,10,2,82.591
awork4-saphire-rapids,v5,gcc,8,10,3,82.508
awork4-saphire-rapids,v5,gcc,8,20,1,111.636
awork4-saphire-rapids,v5,gcc,8,20,2,111.926
awork4-saphire-rapids,v5,gcc,8,20,3,111.646
awork4-saphire-rapids,v5,gcc,8,30,1,145.822
awork4-saphire-rapids,v5,gcc,8,30,2,145.918
awork4-saphire-rapids,v5,gcc,8,30,3,145.951
awork4-saphire-rapids,v5,gcc,8,40,1,180.658
awork4-saphire-rapids,v5,gcc,8,40,2,180.770
awork4-saphire-rapids,v5,gcc,8,40,3,180.807


* Re: More speedups for tuple deformation
@ 2026-01-23 05:29  Chao Li <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 0 replies; 19+ messages in thread

From: Chao Li @ 2026-01-23 05:29 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; David Rowley <[email protected]>; +Cc: PostgreSQL Developers <[email protected]>



> On Jan 23, 2026, at 09:18, Andres Freund <[email protected]> wrote:
> 
> Hi,
> 
> I haven't yet looked at the new version of the patch, but I ran your benchmark
> from upthread (fwiw, I removed the sleep 10 to reduce runtimes, the results
> seem stable enough anyway) on two intel machines, as you mentioned that you
> saw a lot variation in Azure.
> 
> For both I disabled turbo boost, cpu idling and pinned the backend to a single
> CPU core.
> 
> There's a bit of noise on "awork3" (basically an editor and an idle browser
> window), but everything is pinned to the other socket. "awork4" is entirely
> idle.
> 
> 
> Looks like overall the results are quite impressive!  Some of the extra_cols=0
> runs saphire rapids are a bit slower, but the losses are much smaller than the
> gains in other cases.
> 
> 
> I think it'd be good to add a few test cases of "incremental deforming" to the
> benchmark. E.g. a qual that accesses column 10, but projection then deforms up
> to 20.  I'm a bit worried that e.g. the repeated first_null_attr()
> computations could cause regressions.
> 
> 
> Greetings,
> 
> Andres Freund
> <deform_bench.csv>

Today I ran the benchmark on my MacBook M4 against 3 versions (all built without asserts and with -O2):

1) Master (f9a468c664a)
2) Master + v4
3) Master + v4 + My tweak (first_null_attr immediately returns 0 when natts == 0)

Overall, v4 shows significant improvements across most configuration combinations. In the best case, v4 is about 43% faster than master.

The tweak version is only slightly faster than v4. In the best case, the tweak achieves an additional ~3.5% improvement over v4.

Note that the MacBook is my working laptop. I didn’t actively work on it while the tests were running, but it was still not fully idle, as some other applications (email, VS Code, etc.) were running in the background. That said, I suppose this is still fair for the three rounds of test runs.

See the attached Excel sheet for details.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/






Attachments:

  [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet] pgbench_comparison_chao_li_mac_m4.xlsx (15.8K, 2-pgbench_comparison_chao_li_mac_m4.xlsx)


* Re: More speedups for tuple deformation
@ 2026-01-23 16:33  Andres Freund <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Andres Freund @ 2026-01-23 16:33 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

On 2026-01-22 20:18:21 -0500, Andres Freund wrote:
> I haven't yet looked at the new version of the patch, but I ran your benchmark
> from upthread (fwiw, I removed the sleep 10 to reduce runtimes, the results
> seem stable enough anyway) on two intel machines, as you mentioned that you
> saw a lot variation in Azure.
>
> For both I disabled turbo boost, cpu idling and pinned the backend to a single
> CPU core.
>
> There's a bit of noise on "awork3" (basically an editor and an idle browser
> window), but everything is pinned to the other socket. "awork4" is entirely
> idle.
>
>
> Looks like overall the results are quite impressive!  Some of the extra_cols=0
> runs saphire rapids are a bit slower, but the losses are much smaller than the
> gains in other cases.
>
>
> I think it'd be good to add a few test cases of "incremental deforming" to the
> benchmark. E.g. a qual that accesses column 10, but projection then deforms up
> to 20.  I'm a bit worried that e.g. the repeated first_null_attr()
> computations could cause regressions.

The overhead of the aggregation etc makes it harder to see efficiency changes
in deformation speed:

I think it'd be worth replacing the SUM(a) with WHERE a < 0 (filtering all
rows), to reduce the cost of the executor dispatch.

Here's a profile of the SUM(a):

-   99.90%     0.00%  postgres         postgres           [.] standard_ExecutorRun
   - standard_ExecutorRun
      - 96.83% ExecAgg
         - 49.86% ExecInterpExpr
            - 28.30% slot_getsomeattrs_int
                 tts_buffer_heap_getsomeattrs
              0.67% tts_buffer_heap_getsomeattrs
            + 0.02% asm_sysvec_apic_timer_interrupt
         - 37.44% fetch_input_tuple
            - 31.42% ExecSeqScan
               + 20.58% heap_getnextslot
                 3.58% MemoryContextReset
                 0.52% heapgettup_pagemode
                 0.32% ExecStoreBufferHeapTuple
              0.99% heap_getnextslot
              0.79% MemoryContextReset
           2.81% int4_sum
           1.39% MemoryContextReset

That takes ~93ms on average for the first generated bench.sql.


-   99.88%     0.00%  postgres  postgres           [.] standard_ExecutorRun
   - standard_ExecutorRun
      - 95.78% ExecSeqScanWithQual
         - 57.65% ExecInterpExpr
            - 29.08% slot_getsomeattrs_int
                 tts_buffer_heap_getsomeattrs
              0.49% tts_buffer_heap_getsomeattrs
         - 25.40% heap_getnextslot
            + 15.00% heapgettup_pagemode
            + 4.71% ExecStoreBufferHeapTuple
              0.05% UnlockBuffer
           1.80% MemoryContextReset
           0.77% int4lt
           0.52% heapgettup_pagemode
           0.47% ExecStoreBufferHeapTuple
           0.37% slot_getsomeattrs_int
        2.11% heap_getnextslot
        1.49% ExecInterpExpr
        0.50% MemoryContextReset

Same data, but with a WHERE a < 0, takes on average ~74ms.


I wonder if it's worth writing a C helper to test deformation in a bit more
targeted way.


Looking at the profile of ExecSeqScanWithQual() made me a bit sad, turns out
that some of the generated code isn't great :(. I'll start a separate thread
about that.

Greetings,

Andres Freund







* Re: More speedups for tuple deformation
@ 2026-01-27 13:34  David Rowley <[email protected]>
  parent: Andres Freund <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: David Rowley @ 2026-01-27 13:34 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Sat, 24 Jan 2026 at 05:33, Andres Freund <[email protected]> wrote:
> I wonder if it's worth writing a C helper to test deformation in a bit more
> targeted way.

Good idea. I've written a test module called "deform_bench". You can
do: "select deform_bench('tablename'::regclass, '{10,20}');" which
will deform up to attnum=10, then in a 2nd pass deform up to
attnum=20. This is in the 0003 patch. (Requires "ninja
install-test-files"). 0003 is intended for testing, not commit.

There are also 2 scripts attached, one which sets up all the tables
for the benchmark, and one to run it. This saves creating the same
tables again when trying other branches or compilers.

I've also included a slightly revised patch. I made a small change to
first_null_attr() to get rid of the masking of higher attnums, and it
now makes use of __builtin_ctz to find the first NULL attnum in
the byte. For compilers that don't support that, I've included a
pg_rightmost_*zero*_pos table. I didn't want to use the pg_bitutils
table for the rightmost *one* pos as it meant having to special-case
what happens when using index 255, as that would return 0, and I want
8. I'll make the MSVC version use _BitScanForward() in the next patch.
Using __builtin_ctz() seems to help reduce the small regression I was
seeing with the 0 extra column test. It's still there, but it is very
small. It's more pronounced with the deform_bench module because the
other execution overheads are reduced.
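
As a rough illustration of the paragraph above (a sketch, not the code from
the patch): the position of the lowest zero bit in a bitmap byte can be found
by inverting the byte and counting trailing zeros. Both function names below
are hypothetical; only __builtin_ctz (a GCC/Clang builtin) is real. Widening
the byte to unsigned int before inverting means 0xFF yields 8 without any
special case, which is the value a rightmost-*one* table indexed by the
inverted byte would not give.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Portable fallback (hypothetical name): index of the lowest zero bit in a
 * byte, or 8 when every bit is set.  A lookup table would hold exactly the
 * values this loop computes; the loop is used here to keep the sketch short.
 */
static int
rightmost_zero_pos_portable(uint8_t b)
{
	int			pos = 0;

	while (pos < 8 && (b & (1u << pos)) != 0)
		pos++;
	return pos;
}

/*
 * Hypothetical wrapper: use the builtin where available, else the fallback.
 */
static int
rightmost_zero_pos(uint8_t b)
{
#if defined(__GNUC__) || defined(__clang__)
	/*
	 * Widen before inverting so the argument is never 0 (ctz of 0 is
	 * undefined).  For b == 0xFF the inverted value is 0x...FF00, giving 8.
	 */
	int			pos = __builtin_ctz(~(unsigned int) b);
#else
	int			pos = rightmost_zero_pos_portable(b);
#endif

	assert(pos >= 0 && pos <= 8);
	return pos;
}
```

With this shape, a byte of 0xFE (first attribute NULL) gives 0 and a byte of
0xFF (no NULLs in these eight attributes) gives 8.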

Technically, the first_null_attr() function *could* contain slightly
fewer checks. It should be guaranteed that we'll find a byte not set
to 255, as there wouldn't be a bitmask there if there were no 0s. So
technically, the first for loop could be a "while (byte[bytenum] ==
0xFF) bytenum++;". I just felt it might be too dangerous to do that,
as the code would walk off the end of the bitmask if the tuple was
corrupted in the right way.
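
The trade-off described above can be sketched as follows. This is an
illustration under assumptions, not the patch's first_null_attr(): the
function name, the natts-based loop bound, and the clamp of the result to
natts are all hypothetical. It assumes the usual null-bitmap convention that
a set bit means "not NULL", so the first NULL attribute is the first zero
bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the 0-based index of the first NULL attribute
 * according to the bitmap, or natts if no zero bit is found within natts.
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			nbytes = (natts + 7) / 8;
	int			bytenum;
	int			bit;
	int			res;

	assert(natts >= 0);

	/*
	 * Bounded scan.  An unbounded "while (bits[bytenum] == 0xFF)" would save
	 * a comparison per byte but could read past the bitmap if every byte
	 * were 0xFF, e.g. in a corrupted tuple.
	 */
	for (bytenum = 0; bytenum < nbytes; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			break;
	}

	/* no zero bit found (this also covers natts == 0) */
	if (bytenum == nbytes)
		return natts;

	/* locate the lowest zero bit within the byte that stopped the scan */
	for (bit = 0; bit < 8; bit++)
	{
		if ((bits[bytenum] & (1u << bit)) == 0)
			break;
	}

	/* unused padding bits in the last byte may be zero, so clamp to natts */
	res = bytenum * 8 + bit;
	return (res < natts) ? res : natts;
}
```

The extra "bytenum < nbytes" test is the cost of the bounded version; in
exchange, the scan never reads past the last bitmap byte even when every byte
is 0xFF.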

With the reduced overhead using deform_bench, the Apple M2 results are
looking quite good. Test 5 with 20 extra columns is 128% faster than
master, and the patch averages ~25% faster than master over all tests. My results
are in the attached spreadsheet.

David

#!/bin/bash

dbname=postgres
secs=10
rows=1000000
extra_cols_start=0
extra_cols_end=40
extra_cols_increment=10
psql -c "alter system set max_parallel_workers_per_gather = 0;" $dbname
psql -c "alter system set jit = 0;" $dbname
psql -c "select pg_reload_conf();" $dbname
psql -c "create extension if not exists pg_prewarm;" $dbname
psql -c "create extension if not exists deform_bench;" $dbname
psql -c "create table if not exists deform_results (machine text not null, cc text not null, branch text not null, test_id int not null, extra_columns int not null, run_id int not null, milliseconds float4 not NULL);" $dbname

test_id=1

for extracol in ", b int not null default 0" ", b int default null"
do
	for firstcol in "c int not null default 0" "c text not null default '0'" "c int null" "c text null"
	do
		for c in $(seq $extra_cols_start $extra_cols_increment $extra_cols_end)
		do
			tablename="t_${test_id}_${c}"
			psql -c "drop table if exists $tablename" $dbname
			sql="create table $tablename ($firstcol"
			for i in $(seq 0 $c)
			do
				sql="$sql,c$i int not null default 0"
			done
			sql="$sql,a int not null$extracol);"
			psql -c "$sql" $dbname

			psql -c "insert into $tablename (a) select a from generate_series(1,$rows) a;" $dbname
			psql -c "vacuum freeze analyze $tablename;" $dbname
		done
		let "test_id=test_id+1"
	done
done

psql -c "checkpoint;" $dbname

#!/bin/bash

dbname=postgres

# Must match what was used in the test setup script
extra_cols_start=0
extra_cols_end=40
extra_cols_increment=10

# How many times to run each test.  10 - 100 is probably good
number_of_runs=10

me=$(basename "$0")
if [ $# -lt 3 ]
then
	echo "Syntax: ./$me <machine name> <name of compiler you used> <branch name>"
	exit 1
fi

machine=$1
compiler=$2
branch=$3

psql -c "truncate table deform_results;" $dbname
echo "Results will be stored in the deform_results table"
echo -n "Running tests..."
for test_id in {1..8}
do
	for c in $(seq $extra_cols_start $extra_cols_increment $extra_cols_end)
	do
		let "deform_attnum=c+3"
		tablename="t_${test_id}_${c}"
		echo -ne "\rRunning test $test_id of 8 with $c extra columns..."
		psql -c "select pg_prewarm('$tablename'); insert into deform_results (machine, cc, branch, test_id, extra_columns, run_id, milliseconds) select '${machine}','${compiler}','${branch}',${test_id},${c},run_id,deform_bench('$tablename'::regclass, '{$deform_attnum}') from generate_series(1,$number_of_runs) run_id;" $dbname > /dev/null
	done
done
echo ""
echo ""
echo "The results are:"
psql --csv -c "select branch,test_id,extra_columns,round(avg(milliseconds)::numeric,2) avg_ms,stddev(milliseconds) AS stddev from deform_results where machine = '$machine' and cc='$compiler' and branch='$branch' group by 1,2,3 order by 3,1,2;" $dbname
From cd036ce2c09982dac5be8bdb7283e8772d7468d3 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v6 1/3] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0


From 9d02280cc685739dc82306d1ada3509b476f6a8f Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v6 2/3] Precalculate CompactAttribute's attcacheoff

Precalculating attcacheoff allows the code which lazily set the cached
offsets to be removed from the tuple deform routines. That shrinks those
routines a little, which can make them run more quickly. It also allows
a dedicated loop to deform the leading portion of the tuple whose
attribute offsets are known in advance, which makes deforming much
faster when a leading set of the table's columns are non-NULL and of
fixed-width types.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 280 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  82 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 539 insertions(+), 640 deletions(-)
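
For anyone reading along without the tupdesc.c and tupmacs.h hunks to
hand, here is a rough standalone sketch of the two ideas the deformers
below rely on: assigning offsets to the leading fixed-width attributes
once, up front, and locating the first NULL in the bitmap. ToyAttr,
toy_align(), toy_finalize() and toy_first_null() are made-up names for
illustration only. This is not the code in the patch, and it assumes
that offset precalculation simply stops at the first attribute with
attlen <= 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CompactAttribute; not the PostgreSQL struct. */
typedef struct ToyAttr
{
	int16_t		attlen;			/* > 0 fixed width, <= 0 variable width */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* precalculated offset, or -1 if unknown */
} ToyAttr;

/* Round "off" up to the next multiple of "alignby". */
static int
toy_align(int off, int alignby)
{
	assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assign offsets to the leading run of fixed-width attributes and mark the
 * rest as unknown.  Returns the index of the first attribute without a
 * precalculated offset, or natts if every attribute has one.
 */
static int
toy_finalize(ToyAttr *attrs, int natts)
{
	int			off = 0;
	int			first;
	int			i;

	for (i = 0; i < natts; i++)
	{
		if (attrs[i].attlen <= 0)
			break;				/* assumption: stop at first var-width */
		off = toy_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}
	first = i;

	for (; i < natts; i++)
		attrs[i].attcacheoff = -1;

	return first;
}

/*
 * Return the index of the first NULL among the first natts attributes, or
 * natts if there is none.  A set bit means "not NULL", as with att_isnull().
 */
static int
toy_first_null(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 0x07))))
			return i;
	}
	return natts;
}
```

With columns (int4, int8, int2, varlena, int4), toy_finalize() assigns
offsets 0, 8 and 16 and returns 3. The number of attributes that can be
fetched straight from their stored offsets for a given tuple is then
the smaller of that value and the tuple's first NULL position, which is
what the Min(cacheoffattrs, firstnullattr) in heap_deform_tuple() below
computes.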

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert-enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable-width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column or, if there is none, set firstnullattr to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attributes arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..89f18be5d82 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1151,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2205,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..150a7a24785 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,87 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there are any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	res += __builtin_ctz(~bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts.
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
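
For anyone reading along without a PostgreSQL tree handy, here is a minimal
standalone sketch of the byte-at-a-time scan that first_null_attr() performs
on the NULL bitmap (a 1 bit means "not NULL", a 0 bit means "NULL"). The names
first_null_sketch and rightmost_zero_pos are made up for illustration and are
not part of the patch. It uses a plain loop where the patch uses
__builtin_ctz() or a lookup table. It also adds a bounds guard that the patch
does not have, since the patch relies on the caller guaranteeing that a 0 bit
exists somewhere in the bitmap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: 0-based position of the lowest 0 bit in a byte,
 * or 8 if every bit is set.  Stands in for the ctz/lookup-table step.
 */
static int
rightmost_zero_pos(uint8_t b)
{
	int			pos = 0;

	while (pos < 8 && (b & (1 << pos)) != 0)
		pos++;
	return pos;
}

/*
 * Hypothetical sketch: return the 0-based index of the first NULL
 * attribute, or natts when none of the first natts bits is 0.
 */
static int
first_null_sketch(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	int			bytenum;
	int			res;

	assert(natts >= 0);

	/* Skip whole bytes that contain no NULLs (all bits set) */
	for (bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			break;
	}

	/* Sketch-only guard: don't read a byte that holds no bits < natts */
	if ((bytenum << 3) >= natts)
		return natts;

	res = (bytenum << 3) + rightmost_zero_pos(bits[bytenum]);

	/* Bits at or beyond natts were not masked out, so cap the result */
	return res < natts ? res : natts;
}
```

With a two-byte bitmap of {0xFF, 0xFB} and natts = 16, the first byte is
skipped and the lowest 0 bit of 0xFB is bit 2, so the sketch reports
attribute 10 as the first NULL.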

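Similarly, the offset precomputation that TupleDescFinalize() does in 0002 can
be illustrated outside the tree. This is only a rough sketch under simplifying
assumptions: SketchAttr is a hypothetical cut-down stand-in for
CompactAttribute, align_up() stands in for att_nominal_alignby() and assumes
a power-of-two alignment, and sketch_finalize() returns the value that the
patch stores in firstNonCachedOffAttr rather than writing it to a TupleDesc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for CompactAttribute */
typedef struct SketchAttr
{
	int16_t		attlen;			/* > 0 for fixed-width, <= 0 otherwise */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* cached offset, or -1 when unknown */
} SketchAttr;

/* Round 'off' up to the next multiple of 'alignby' (a power of two) */
static int
align_up(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Set attcacheoff for every leading fixed-width attribute and return the
 * index of the first attribute that has no cached offset.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int			firstNonCached = 0;
	int			off = 0;

	assert(natts >= 0);

	for (int i = 0; i < natts; i++)
	{
		/* Stop at the first variable-width attribute */
		if (attrs[i].attlen <= 0)
			break;

		off = align_up(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;

		off += attrs[i].attlen;
		firstNonCached = i + 1;
	}

	return firstNonCached;
}
```

For a layout of (int4, int8, varlena, int4) the sketch caches offsets 0 and 8
for the first two attributes and returns 2. The trailing int4 gets no cached
offset because its position depends on the length of the varlena before it.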

From 0b846521934c7fb273c26675c5054ea2990366d1 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v6 3/3] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 105 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 165 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..895ff3f4222
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,105 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid		tableoid = PG_GETARG_OID(0);
+	ArrayType *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum *elem_datums = NULL;
+	bool *elem_nulls = NULL;
+	int elem_count;
+	int *attnums;
+	clock_t start, end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the time taken by the test in milliseconds, as measured by clock() */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0
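
As a usage sketch for the function added by this patch: the second argument
lists the attribute numbers to deform up to, one stage per array element, and
a NULL element stands for all attributes (see the comment in deform_bench.c).
The helper below only builds the SQL text so it can be inspected before being
sent to psql. The name deform_bench_sql and the table name t_1_10 are
illustrative assumptions, not part of the patch.

```shell
#!/bin/bash

# Hypothetical helper: print a deform_bench() call for a table and a
# list of deform stages.  Each stage is an attnum, or NULL for "all".
deform_bench_sql() {
	local tablename="$1"
	shift
	local stages="" s

	# Join the remaining arguments with commas to form the array literal
	for s in "$@"
	do
		stages="${stages:+$stages,}$s"
	done

	echo "select deform_bench('$tablename'::regclass, '{$stages}');"
}

# Deform attnum 1 first, then the remaining attributes in a second pass.
deform_bench_sql t_1_10 1 NULL
```

The printed statement can be piped to psql, for example
"deform_bench_sql t_1_10 1 NULL | psql postgres". Stages should normally be
given in ascending order, since a later stage with a lower attnum has nothing
left to deform.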



Attachments:

  [text/plain] deform_test_setup.sh.txt (1.4K, 2-deform_test_setup.sh.txt)
#!/bin/bash

dbname=postgres
secs=10
rows=1000000
extra_cols_start=0
extra_cols_end=40
extra_cols_increment=10
psql -c "alter system set max_parallel_workers_per_gather = 0;" $dbname
psql -c "alter system set jit = 0;" $dbname
psql -c "select pg_reload_conf();" $dbname
psql -c "create extension if not exists pg_prewarm;" $dbname
psql -c "create extension if not exists deform_bench;" $dbname
psql -c "create table if not exists deform_results (machine text not null, cc text not null, branch text not null, test_id int not null, extra_columns int not null, run_id int not null, milliseconds float4 not NULL);" $dbname

test_id=1

for extracol in ", b int not null default 0" ", b int default null"
do
	for firstcol in "c int not null default 0" "c text not null default '0'" "c int null" "c text null"
	do
		for c in $(seq $extra_cols_start $extra_cols_increment $extra_cols_end)
		do
			tablename="t_${test_id}_${c}"
			psql -c "drop table if exists $tablename" $dbname
			sql="create table $tablename ($firstcol"
			for i in $(seq 0 $c)
			do
				sql="$sql,c$i int not null default 0"
			done
			sql="$sql,a int not null$extracol);"
			psql -c "$sql" $dbname

			psql -c "insert into $tablename (a) select a from generate_series(1,$rows) a;" $dbname
			psql -c "vacuum freeze analyze $tablename;" $dbname
		done
		let "test_id=test_id+1"
	done
done

psql -c "checkpoint;" $dbname
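
The table layout built by the loops above can be previewed without a database.
The sketch below rebuilds the same CREATE TABLE string for one combination and
prints it. build_create_table_sql is a hypothetical helper name, not part of
the attached script. Because "seq 0 $c" is inclusive at both ends, a table
with $c extra columns gets c+1 columns named c0..c$c, which puts column "a" at
attribute number c+3. That is the attnum the run script passes to
deform_bench().

```shell
#!/bin/bash

# Hypothetical helper: echo the CREATE TABLE statement the setup script
# would send to psql for a given table name, first column definition,
# trailing column definition and extra-column count.
build_create_table_sql() {
	local tablename="$1" firstcol="$2" extracol="$3" c="$4"
	local sql="create table $tablename ($firstcol"
	local i

	# seq is inclusive at both ends, so this adds c+1 columns: c0 .. c$c
	for i in $(seq 0 "$c")
	do
		sql="$sql,c$i int not null default 0"
	done

	echo "$sql,a int not null$extracol);"
}

# First combination of the setup loops, with 0 extra columns
build_create_table_sql t_1_0 "c int not null default 0" ", b int not null default 0" 0
```

For this call the columns are c, c0, a, b, so "a" is attribute 3, matching
deform_attnum=c+3 with c=0.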

  [text/plain] deform_test_run.sh.txt (1.4K, 3-deform_test_run.sh.txt)
#!/bin/bash

dbname=postgres

# Must match what was used in the test setup script
extra_cols_start=0
extra_cols_end=40
extra_cols_increment=10

# How many times to run each test.  10 - 100 is probably good
number_of_runs=10

me=$(basename "$0")
if [ $# -lt 3 ]
then
	echo "Syntax: ./$me <machine name> <name of compiler you used> <branch name>"
	exit 1
fi

machine=$1
compiler=$2
branch=$3

psql -c "truncate table deform_results;" $dbname
echo "Results will be stored in the deform_results table"
echo -n "Running tests..."
for test_id in {1..8}
do
	for c in $(seq $extra_cols_start $extra_cols_increment $extra_cols_end)
	do
		let "deform_attnum=c+3"
		tablename="t_${test_id}_${c}"
		echo -ne "\rRunning test $test_id of 8 with $c extra columns..."
		psql -c "select pg_prewarm('$tablename'); insert into deform_results (machine, cc, branch, test_id, extra_columns, run_id, milliseconds) select '${machine}','${compiler}','${branch}',${test_id},${c},run_id,deform_bench('$tablename'::regclass, '{$deform_attnum}') from generate_series(1,$number_of_runs) run_id;" $dbname > /dev/null
	done
done
echo ""
echo ""
echo "The results are:"
psql --csv -c "select branch,test_id,extra_columns,round(avg(milliseconds)::numeric,2) avg_ms,stddev(milliseconds) AS stddev from deform_results where machine = '$machine' and cc='$compiler' and branch='$branch' group by 1,2,3 order by 3,1,2;" $dbname

  [text/plain] v6-0001-Add-empty-TupleDescFinalize-function.patch (29.0K, 4-v6-0001-Add-empty-TupleDescFinalize-function.patch)
From cd036ce2c09982dac5be8bdb7283e8772d7468d3 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v6 1/3] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index e50abb331cc..9f708f84334 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



  [text/plain] v6-0002-Precalculate-CompactAttribute-s-attcacheoff.patch (49.9K, 5-v6-0002-Precalculate-CompactAttribute-s-attcacheoff.patch)
From 9d02280cc685739dc82306d1ada3509b476f6a8f Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v6 2/3] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines, which
shrinks them a little and can make them run more quickly. This also
adds a dedicated loop to deform the portion of the tuple which has a
known offset, making deforming much faster when a leading set of the
table's columns are non-NULL and of fixed-width types.
---
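Illustrative note, not part of the patch: the precalculation described
above can be sketched roughly as below. This is a simplified standalone
model, not the TupleDescFinalize() body (which lives in tupdesc.c and is
not shown in this part of the diff). SketchAttr, SketchDesc,
sketch_align() and sketch_finalize() are hypothetical names, and the
assumption that only the leading fixed-width columns receive an
attcacheoff is inferred from how the deform loops below consume it.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct SketchAttr
{
	int			attlen;			/* > 0 for fixed-width, -1 for varlena */
	int			attalignby;		/* alignment in bytes, a power of two */
	int			attcacheoff;	/* precomputed offset, or -1 if unknown */
} SketchAttr;

/* Hypothetical, simplified stand-in for TupleDescData. */
typedef struct SketchDesc
{
	int			natts;
	int			firstNonCachedOffAttr;	/* -1 until finalized */
	SketchAttr	attrs[8];
} SketchDesc;

/* Round "off" up to the next multiple of "alignby". */
int
sketch_align(int off, int alignby)
{
	assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assign an offset to each leading fixed-width attribute and remember the
 * index of the first attribute whose offset depends on the tuple's data.
 */
void
sketch_finalize(SketchDesc *desc)
{
	int			off = 0;
	int			i;

	for (i = 0; i < desc->natts; i++)
	{
		SketchAttr *att = &desc->attrs[i];

		/* Anything at or after a variable-width column cannot be cached */
		if (att->attlen <= 0)
			break;

		off = sketch_align(off, att->attalignby);
		att->attcacheoff = off;
		off += att->attlen;
	}

	desc->firstNonCachedOffAttr = i;

	/* Mark the remaining attributes as having no known offset */
	for (; i < desc->natts; i++)
		desc->attrs[i].attcacheoff = -1;
}
```

With columns shaped like (int4, int2, int8, text, int4), this model
assigns offsets 0, 4 and 8 to the first three columns and sets
firstNonCachedOffAttr to 3. A deformer can then read those three columns
at fixed offsets when none of them is NULL, and only needs to walk the
tuple data from the fourth column onwards.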
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 280 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  82 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 539 insertions(+), 640 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attributes arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..89f18be5d82 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
+	{
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			isnull[attnum] = false;
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else if (attnum == 0)
 	{
 		/* Start from the first attribute */
 		off = 0;
-		slow = false;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+		isnull[attnum] = false;
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1151,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2205,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..150a7a24785 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,87 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	res += __builtin_ctz(~bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0

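For readers following the 0002 patch above: TupleDescFinalize() precomputes
attcacheoff for the leading run of fixed-width attributes and records where
that run ends. The following is a minimal standalone sketch of that idea, not
the patch's code. SketchAttr, sketch_align() and sketch_finalize() are
hypothetical names used only for illustration, and the struct keeps just the
three CompactAttribute fields the loop needs.

```c
#include <assert.h>

/* Hypothetical stand-in for CompactAttribute (illustration only). */
typedef struct SketchAttr
{
	int			attlen;			/* > 0 fixed width; <= 0 variable width */
	int			attalignby;		/* required alignment in bytes (power of 2) */
	int			attcacheoff;	/* precomputed offset, or -1 if unknown */
} SketchAttr;

/* Round 'off' up to the next multiple of 'alignby'. */
static int
sketch_align(int off, int alignby)
{
	assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Assign cached offsets to the leading fixed-width attributes and return
 * the index of the first attribute that has no cached offset.  This plays
 * the role of firstNonCachedOffAttr in the patch.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		/* The first variable-width attribute ends the cacheable run */
		if (attrs[i].attlen <= 0)
			break;

		off = sketch_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	return i;
}
```

With columns of (attlen, attalignby) = (4,4), (8,8), (2,2), (-1,4), (4,4) the
sketch assigns offsets 0, 8 and 16 and returns 3. The trailing 4-byte column
gets no cached offset because it follows a variable-width one.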

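The first_null_attr() helper added to tupmacs.h scans the NULL bitmap a byte
at a time, skipping bytes that are all 1s (no NULLs), and then locates the
lowest 0 bit in the byte where it stopped. Below is a portable sketch of the
same technique. It is an assumption-laden illustration rather than the
patch's implementation: it uses a plain bit loop instead of __builtin_ctz or
a lookup table, and, unlike the patch, it never reads the byte at index
natts / 8 when natts is a multiple of 8. The name sketch_first_null_attr() is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the 0-based index of the first NULL attribute (first 0 bit) in
 * 'bits', or natts if none of the first natts attributes is NULL.
 */
static int
sketch_first_null_attr(const uint8_t *bits, int natts)
{
	int			nbytes = (natts + 7) / 8;	/* bytes covering natts bits */
	int			bytenum;
	int			bit;
	int			res;

	assert(natts >= 0);

	/* Skip over bytes in which every attribute is non-NULL */
	for (bytenum = 0; bytenum < nbytes; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			break;
	}

	/* Every byte was 0xFF, so there is no NULL within natts */
	if (bytenum == nbytes)
		return natts;

	/* This byte has at least one 0 bit, so the loop stops before bit 8 */
	for (bit = 0; bit < 8; bit++)
	{
		if ((bits[bytenum] & (1 << bit)) == 0)
			break;
	}

	res = (bytenum << 3) + bit;

	/* A 0 bit beyond natts is padding, not a NULL, so cap the result */
	return (res < natts) ? res : natts;
}
```

For a bitmap of {0xFF, 0xFB} the first 0 bit is bit 2 of the second byte, so
the sketch reports attribute 10 when natts is 16, and reports natts itself
when natts is 10 or 8.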

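The reworked slot_deform_heap_tuple() and index_deform_tuple_internal() in
0002 split deforming into three loops: attributes with a cached offset,
attributes before the first NULL that need their offset walked, and the
remainder where each attribute needs a NULL check. The sketch below shows only
that control flow under simplifying assumptions that are not in the patch:
there is no alignment padding, variable-width values use a toy encoding of a
1-byte length followed by the payload (not PostgreSQL's varlena format),
deforming always starts at attribute 0, and the output is each attribute's
byte offset rather than a Datum. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct SketchCol
{
	int			attlen;			/* > 0 fixed width; -1 toy variable width */
	int			attcacheoff;	/* precomputed offset, or -1 if unknown */
} SketchCol;

/* A 1 bit in the bitmap means "not NULL", as in a heap tuple's t_bits */
static int
sketch_isnull(const uint8_t *bits, int att)
{
	return !(bits[att >> 3] & (1 << (att & 7)));
}

/* Step 'off' past the attribute stored at data + off */
static int
sketch_advance(const SketchCol *col, const uint8_t *data, int off)
{
	if (col->attlen > 0)
		return off + col->attlen;
	return off + 1 + data[off];		/* toy length byte plus payload */
}

static void
sketch_deform(const SketchCol *cols, int natts, int ncached, int firstnull,
			  const uint8_t *bits, const uint8_t *data,
			  int *offs, int *isnull)
{
	int			attnum = 0;
	int			off = 0;

	assert(ncached <= firstnull && firstnull <= natts);

	/* Loop 1: offsets are known up front, no NULL checks, no walking */
	for (; attnum < ncached; attnum++)
	{
		assert(cols[attnum].attlen > 0);
		offs[attnum] = cols[attnum].attcacheoff;
		isnull[attnum] = 0;
		off = cols[attnum].attcacheoff + cols[attnum].attlen;
	}

	/* Loop 2: walk the offset, still no NULL checks needed */
	for (; attnum < firstnull; attnum++)
	{
		offs[attnum] = off;
		isnull[attnum] = 0;
		off = sketch_advance(&cols[attnum], data, off);
	}

	/* Loop 3: from the first NULL onwards, check every attribute */
	for (; attnum < natts; attnum++)
	{
		if (sketch_isnull(bits, attnum))
		{
			offs[attnum] = -1;
			isnull[attnum] = 1;
			continue;
		}
		offs[attnum] = off;
		isnull[attnum] = 0;
		off = sketch_advance(&cols[attnum], data, off);
	}
}
```

The caller is expected to pass ncached already capped to firstnull, which is
what the Min() calls at the top of the patched functions do.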
  [text/plain] v6-0003-Introduce-deform_bench-test-module.patch (7.2K, 6-v6-0003-Introduce-deform_bench-test-module.patch)
  download | inline diff:
From 0b846521934c7fb273c26675c5054ea2990366d1 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v6 3/3] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 105 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 165 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..895ff3f4222
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,105 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid		tableoid = PG_GETARG_OID(0);
+	ArrayType *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum *elem_datums = NULL;
+	bool *elem_nulls = NULL;
+	int elem_count;
+	int *attnums;
+	clock_t start, end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) * 1000.0 / CLOCKS_PER_SEC);
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0



  [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet] Deform_bench_test_module_results_2026-01-28.xlsx (29.8K, 7-Deform_bench_test_module_results_2026-01-28.xlsx)


* Re: More speedups for tuple deformation
@ 2026-01-28 16:26  Andres Freund <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: Andres Freund @ 2026-01-28 16:26 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

On 2026-01-28 02:34:26 +1300, David Rowley wrote:
> On Sat, 24 Jan 2026 at 05:33, Andres Freund <[email protected]> wrote:
> > I wonder if it's worth writing a C helper to test deformation in a bit more
> > targeted way.
>
> Good idea. I've written a test module called "deform_bench". You can
> do: "select deform_bench('tablename'::regclass, '{10,20}');" which
> will deform up to attnum=10, then in a 2nd pass deform up to
> attnum=20. This is in the 0003 patch. (Requires "ninja
> install-test-files"). 0003 is intended for testing, not commit.

Nice!  I am trying very hard to restrain myself from playing with it right
now, because I really need to get some other things done first...


>  /*
>   * slot_deform_heap_tuple
>   *		Given a TupleTableSlot, extract data from the slot's physical tuple
> @@ -1122,78 +1010,140 @@ static pg_attribute_always_inline void
>  slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
>  					   int natts)
>  {
> +	CompactAttribute *cattr;
> +	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
>  	bool		hasnulls = HeapTupleHasNulls(tuple);
> +	HeapTupleHeader tup = tuple->t_data;
> +	bits8	   *bp;				/* ptr to null bitmap in tuple */
>  	int			attnum;
> +	int			firstNonCacheOffsetAttr;
> +	int			firstNullAttr;
> +	Datum	   *values;
> +	bool	   *isnull;
> +	char	   *tp;				/* ptr to tuple data */
>  	uint32		off;			/* offset in tuple data */
> -	bool		slow;			/* can we use/set attcacheoff? */
> +
> +	/* Did someone forget to call TupleDescFinalize()? */
> +	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
>
>  	/* We can only fetch as many attributes as the tuple has. */
> -	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
> +	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
> +	attnum = slot->tts_nvalid;
> +	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);

FWIW, in a few experiments on my cascade lake systems, this branch (well, it
ends up as a cmov) ends up causing a surprisingly large performance
bottleneck.  I don't really see a way around that, but I thought I'd mention it.


On the topic of tupleDesc->firstNonCachedOffAttr - shouldn't that be an
AttrNumber? Not that it'll make a difference perf or space wise, just for
clarity.

Hm, I guess natts isn't an AttrNumber either. Not sure why?


> +	if (hasnulls)
> +	{
> +		bp = tup->t_bits;
> +		firstNullAttr = first_null_attr(bp, natts);
> +		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
> +	}
> +	else
> +	{
> +		bp = NULL;
> +		firstNullAttr = natts;
> +	}
> +
> +	values = slot->tts_values;
> +	isnull = slot->tts_isnull;
> +	tp = (char *) tup + tup->t_hoff;

Another stall I see is due to the t_hoff computation - which makes sense, it's
in the tuple header and none of the deforming can happen without knowing the
address. I think in the !hasnulls case, the only influence on it is
MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)), so we could just hardcode
that?

Separately, sometimes - I haven't figured out when - gcc seems to think it's
smart to actually compute the `tp + cattr->attcacheoff` below using tup and
tup->t_hoff stored in registers (i.e. doing multiple adds).  When the code is
generated that way, I see substantially worse performance.  Have you seen
that?

> +	else if (attnum == 0)
>  	{
>  		/* Start from the first attribute */
>  		off = 0;
> -		slow = false;
>  	}
>  	else
>  	{
>  		/* Restore state from previous execution */
>  		off = *offp;
> -		slow = TTS_SLOW(slot);
>  	}

Do we actually need both of these branches? Shouldn't *offp be set to 0 in the
attnum == 0 case?


> -	if (!slow)
> +	for (; attnum < firstNullAttr; attnum++)
>  	{
> [...]
> +		cattr = TupleDescCompactAttr(tupleDesc, attnum);
> +
> +		/* align the offset for this attribute */
> +		off = att_pointer_alignby(off,
> +								  cattr->attalignby,
> +								  cattr->attlen,
> +								  tp + off);
> +
> +		values[attnum] = fetchatt(cattr, tp + off);
> +		isnull[attnum] = false;
> +
> +		/* move the offset beyond this attribute */
> +		off = att_addlength_pointer(off, cattr->attlen, tp + off);
>  	}

A few thoughts / suggestions:

1) We should not update values[]/isnull[] between fetchatt() and
   att_addlength_pointer(). The compiler can't figure out that no fields in
   cattr or *(tp + off) are being affected by those stores.

   Changing this on master improves performance quite noticeably. I see a 13%
   improvement in a test with deforming 5 not-null byval columns.

2) I sometimes see performance benefits due to moving the isnull[attnum] =
   false; to the beginning of the loop. Which makes some sense, starting the
   store earlier allows it to complete earlier, and it doesn't depend on
   fetching cattr, aligning the pointer, fetching the attribute and adjusting
   the offset.

3) I briefly experimented with this code, and I think we may be able to
   optimize the combination of att_pointer_alignby(), fetch_att() and
   att_addlength_pointer(). They all do quite related work, and for byvalue
   types, we know at compile time what the alignment requirement for each of
   the supported attlen is.
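
For illustration, a minimal self-contained sketch of the store ordering
described in points 1 and 2. SketchAttr, sketch_align(), sketch_fetch() and
sketch_deform() are hypothetical stand-ins rather than PostgreSQL code, and
the sketch assumes only not-null, fixed-length byval columns:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CompactAttribute (fixed-length byval only) */
typedef struct SketchAttr
{
    int16_t     attlen;         /* 1, 2, 4 or 8 */
    uint8_t     attalignby;     /* alignment in bytes, a power of two */
} SketchAttr;

/* Round off up to the next multiple of alignby */
static inline uint32_t
sketch_align(uint32_t off, uint8_t alignby)
{
    return (off + (uint32_t) (alignby - 1)) & ~((uint32_t) alignby - 1);
}

/* Read a fixed-length value of the given width */
static inline uint64_t
sketch_fetch(const char *p, int16_t attlen)
{
    switch (attlen)
    {
        case 1: { uint8_t v; memcpy(&v, p, 1); return v; }
        case 2: { uint16_t v; memcpy(&v, p, 2); return v; }
        case 4: { uint32_t v; memcpy(&v, p, 4); return v; }
        default: { uint64_t v; memcpy(&v, p, 8); return v; }
    }
}

/* Deform natts not-null columns; returns the offset past the last one */
static uint32_t
sketch_deform(const SketchAttr *attrs, int natts, const char *tp,
              uint64_t *values, bool *isnull)
{
    uint32_t    off = 0;

    for (int attnum = 0; attnum < natts; attnum++)
    {
        const SketchAttr *cattr = &attrs[attnum];
        uint64_t    val;

        /* Point 2: this store depends on nothing below, so issue it first */
        isnull[attnum] = false;

        /* All reads of cattr and the tuple data happen before the next store */
        off = sketch_align(off, cattr->attalignby);
        val = sketch_fetch(tp + off, cattr->attlen);
        off += (uint32_t) cattr->attlen;

        /*
         * Point 1: store the value only once the offset has been advanced,
         * so the compiler need not assume the store changed cattr or tp.
         */
        values[attnum] = val;
    }

    return off;
}
```

Whether this ordering helps in practice depends on the compiler and CPU; the
sketch only shows where the stores sit relative to the loads.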


> +	/*
> +	 * Now handle any remaining tuples, this time include NULL checks as we're
> +	 * now at the first NULL attribute.
> +	 */
> +	for (; attnum < natts; attnum++)
>  	{
> -		/* XXX is it worth adding a separate call when hasnulls is false? */
> -		attnum = slot_deform_heap_tuple_internal(slot,
> -												 tuple,
> -												 attnum,
> -												 natts,
> -												 true,	/* slow */
> -												 hasnulls,
> -												 &off,
> -												 &slow);
> +		if (att_isnull(attnum, bp))
> +		{
> +			values[attnum] = (Datum) 0;
> +			isnull[attnum] = true;
> +			continue;
> +		}

Have you experimented with setting isnull[] in a dedicated loop if there are nulls
and then in this loop just checking isnull[attnum]? Seems like that could
perhaps be combined with the work in first_null_attr() and be more efficient
than doing an att_isnull() separately for each column.

Greetings,

Andres Freund

* Re: More speedups for tuple deformation
@ 2026-01-30 11:10  David Rowley <[email protected]>
  parent: Andres Freund <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: David Rowley @ 2026-01-30 11:10 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Thank you for looking at this again.

On Thu, 29 Jan 2026 at 05:26, Andres Freund <[email protected]> wrote:
> On 2026-01-28 02:34:26 +1300, David Rowley wrote:
> > +     firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
>
> FWIW, in a few experiments on my cascade lake systems, this branch (well, it
> ends up as a cmov) ends up causing a surprisingly large performance
> bottleneck.  I don't really see a way around that, but I thought I'd mention it.

Yeah, I believe this is the primary reason that I'm fighting the small
regression on the 0 extra column test.  I thought it might be because
the cmov has a dependency on natts and must wait for it to be
calculated, which needs to access fields in the tuple header. I wonder
if there's some reason the compiler or CPU can't defer calculating
firstNonCacheOffsetAttr until later. Maybe I should try moving it
later in the code to see if that helps.

> On the topic of tupleDesc->firstNonCachedOffAttr - shouldn't that be an
> AttrNumber? Not that it'll make a difference perf or space wise, just for
> clarity.
>
> Hm, I guess natts isn't an AttrNumber either. Not sure why?

I noticed that too, but took the path of least resistance and made
firstNonCachedOffAttr an int too. I did wonder why natts wasn't an
AttrNumber. If they both were AttrNumbers, I wouldn't need to make the
TupleDesc struct bigger. Right now, I've enlarged it by 8 bytes by
adding firstNonCachedOffAttr.

One problem is that a bunch of functions accept int, e.g.
CreateTemplateTupleDesc(int natts) and CreateTupleDesc(int natts,
Form_pg_attribute *attrs). Then BuildDescFromLists() sets natts based
on list_length(). Maybe CreateTemplateTupleDesc() could Assert or
throw an error if natts does not fit in 16 bits.

>
> > +     if (hasnulls)
> > +     {
> > +             bp = tup->t_bits;
> > +             firstNullAttr = first_null_attr(bp, natts);
> > +             firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
> > +     }
> > +     else
> > +     {
> > +             bp = NULL;
> > +             firstNullAttr = natts;
> > +     }
> > +
> > +     values = slot->tts_values;
> > +     isnull = slot->tts_isnull;
> > +     tp = (char *) tup + tup->t_hoff;
>
> Another stall I see is due to the t_hoff computation - which makes sense, it's
> in the tuple header and none of the deforming can happen without knowing the
> address. I think in the !hasnulls case, the only influence on it is
> MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)), so we could just hardcode
> that?

Hmm. I wonder why it even needs to exist. If the null bitmap is there,
you can calculate how many bytes it occupies from natts. I tried doing
"tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));"
for the !hasnulls case and it's hard to tell if it helps. See the 0004
patch. I'm somewhat hesitant to go against the grain here on how to
calculate where the tuple data starts.
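
For illustration only, a minimal sketch of that idea outside of PostgreSQL.
SketchHeader, SKETCH_MAXALIGN() and sketch_data_start() are hypothetical
stand-ins (this is not the code in 0004); the sketch assumes 8-byte maximum
alignment and that nothing other than the null bitmap contributes to the
header length:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed maximum alignment of 8 bytes; the real value is per-platform */
#define SKETCH_MAXALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

/* Hypothetical stand-in for a tuple header; only the layout idea matters */
typedef struct SketchHeader
{
    uint8_t     fixed[22];      /* stand-in for the fields before t_hoff */
    uint8_t     t_hoff;         /* stored offset to the user data */
    uint8_t     t_bits[];       /* null bitmap, present only with nulls */
} SketchHeader;

/* Return a pointer to the start of the tuple data */
static inline const char *
sketch_data_start(const SketchHeader *tup, bool hasnulls)
{
    /*
     * Without a null bitmap the data offset is a compile-time constant, so
     * there is no need to load t_hoff from the header first.
     */
    if (!hasnulls)
        return (const char *) tup +
            SKETCH_MAXALIGN(offsetof(SketchHeader, t_bits));

    /* With a null bitmap, fall back to the stored offset */
    return (const char *) tup + tup->t_hoff;
}
```

The point of the constant is only to remove the load of t_hoff from the
dependency chain in the !hasnulls case.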

> Separately, sometimes - I haven't figured out when - gcc seems to think it's
> smart to actually compute the `tp + cattr->attcacheoff` below using tup and
> tup->t_hoff stored in registers (i.e. doing multiple adds).  When the code is
> generated that way, I see substantially worse performance.  Have you seen
> that?

I've not noticed that.

> > +     else if (attnum == 0)
> >       {
> >               /* Start from the first attribute */
> >               off = 0;
> > -             slow = false;
> >       }
> >       else
> >       {
> >               /* Restore state from previous execution */
> >               off = *offp;
> > -             slow = TTS_SLOW(slot);
> >       }
>
> Do we actually need both of these branches? Shouldn't *offp be set to 0 in the
> attnum == 0 case?

I see tts_heap_clear() zeros it, so I think it should be ok. Doesn't
feel quite as robust, however.

> A few thoughts / suggestions:
>
> 1) We should not update values[]/isnull[] between fetchatt() and
>    att_addlength_pointer(). The compiler can't figure out that no fields in
>    cattr or *(tp + off) are being affected by those stores.
>
>    Changing this on master improves performance quite noticeably. I see a 13%
>    improvement in a test with deforming 5 not-null byval columns.

That's easy enough. I've moved it up in the v7 patch before the cattr
assignment.

> 2) I sometimes see performance benefits due to moving the isnull[attnum] =
>    false; to the beginning of the loop. Which makes some sense, starting the
>    store earlier allows it to complete earlier, and it doesn't depend on
>    fetching cattr, aligning the pointer, fetching the attribute and adjusting
>    the offset.

> 3) I briefly experimented with this code, and I think we may be able to
>    optimize the combination of att_pointer_alignby(), fetch_att() and
>    att_addlength_pointer(). They all do quite related work, and for byvalue
>    types, we know at compile time what the alignment requirement for each of
>    the supported attlen is.

Is this true? Isn't there some nearby discussion about AIX having
4-byte double alignment?

I've taken a go at implementing a function called
align_fetch_then_add(), which rolls all the macros into one (See
0004). I just can't see any improvements with it. Maybe I've missed
something that could be more optimal. I did even ditch one of the
cases from the switch(attlen). It might be ok to do that now as we can
check for invalid attlens for byval types when we populate the
CompactAttribute.

> Have you experimented with setting isnull[] in a dedicated loop if there are nulls
> and then in this loop just checking isnull[attnum]? Seems like that could
> perhaps be combined with the work in first_null_attr() and be more efficient
> than doing an att_isnull() separately for each column.

Yes. I experimented with that quite a bit. I wasn't able to make it any
faster than setting the isnull element in the same loop as the
tts_values element. What I did try was having a dedicated tight loop
like "for (int i = attnum; i < firstNullAttr; i++) isnull[i] = false;",
but the compiler would always try to optimise that into an inlined
memset, which would result in poorly performing code in cases with a
small number of columns due to the size and alignment prechecks. I had
given up on it as I was already fighting some performance regressions
for the 0 extra column test and this made those worse. However...

In the attached 0004 patch I've experimented with this again. This
time, I wrote a function that converts the null bitmap into the isnull
array using a lookup table. I spent a bit of time trying to figure out
a way to do this without the lookup table and only came up with a
method that requires AVX512 instructions. I coded that up, but it
requires building with -march=x86-64-v4, which will likely introduce
many other reasons for the performance to vary.
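
For illustration only, a minimal sketch of the lookup-table idea. This is not
the function from 0004; sketch_lut, sketch_build_lut() and
sketch_bitmap_to_isnull() are hypothetical names, and the sketch assumes
sizeof(bool) == 1. It relies on the heap convention that a set bit in the
null bitmap means the attribute is NOT null:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(bool) == 1, "sketch assumes 1-byte bool");

/* One row per possible bitmap byte: the 8 isnull flags it expands to */
static bool sketch_lut[256][8];
static bool sketch_lut_ready = false;

static void
sketch_build_lut(void)
{
    for (int byte = 0; byte < 256; byte++)
    {
        for (int bit = 0; bit < 8; bit++)
        {
            /* a clear bit in the bitmap means the attribute is NULL */
            sketch_lut[byte][bit] = (byte & (1 << bit)) == 0;
        }
    }
    sketch_lut_ready = true;
}

/* Expand the first natts bits of the null bitmap into isnull[0..natts-1] */
static void
sketch_bitmap_to_isnull(const uint8_t *bits, int natts, bool *isnull)
{
    int         nfull = natts / 8;  /* bitmap bytes that are fully used */
    int         rem = natts % 8;    /* attributes in the last partial byte */

    if (!sketch_lut_ready)
        sketch_build_lut();

    /* copy 8 flags at a time for each full bitmap byte */
    for (int i = 0; i < nfull; i++)
        memcpy(isnull + i * 8, sketch_lut[bits[i]], 8);

    /* copy only the flags that belong to real attributes */
    if (rem > 0)
        memcpy(isnull + nfull * 8, sketch_lut[bits[nfull]], (size_t) rem);
}
```

A real version would more likely use a constant table than one built at
run time; the lazy build here just keeps the sketch short.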

The machine that likes 0004 the most (using the lookup table method of
setting the isnull array) is the Apple M2. All the tests apart from
the 0 extra column test became 30-90% faster. Previously the tests
that had to do att_isnull didn't improve very much. The 0 extra column
test regressed quite a bit: 50% slower on all but tests 1 and 5 (the
ones without NULLs). See the attached graph. The Zen2 machine also
seems to quite like it, but not for the 0 extra column test. I'm
struggling to get stable performance results from that machine right
now. My Zen 4 laptop isn't a fan of it, but I'm also not getting very
stable performance results from that either.

I'm curious to see what your Intel machines think of 0004 vs not having it.

Right now, I'm really only getting stable performance out of my Apple
M2 machine, so I'm not too sure what parts of 0004 I should include in
0002 and which ones I should throw away.

David

From f7efca3f15f517bb0c807c3e09ad40aab4c08f6b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v7 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0


From 8d6e9b363c37d74a2e6e4972bff710cf1be2a88f Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v7 2/4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines, which
shrinks them a little and can make them run more quickly.  This also
adds a dedicated loop to deform the portion of the tuple which has a
known offset, which makes deforming much faster when a leading set of
the table's columns are fixed-width types containing non-NULL values.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  82 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 539 insertions(+), 642 deletions(-)
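
For readers skimming the diff, here is a minimal standalone sketch of the
idea, not the patch's actual code.  It assumes that finalizing a descriptor
assigns attcacheoff only to the leading run of fixed-width attributes, and
that first_null_attr() returns the index of the first NULL (or the attribute
count when there is none).  SketchAttr, sketch_align(), sketch_finalize() and
sketch_first_null_attr() are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CompactAttribute; only the fields used here. */
typedef struct SketchAttr
{
	int32_t		attcacheoff;	/* cached byte offset, or -1 if unknown */
	int16_t		attlen;			/* > 0 fixed width, <= 0 variable width */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
} SketchAttr;

/* Round off up to the next multiple of alignby (a power of two). */
static int32_t
sketch_align(int32_t off, uint8_t alignby)
{
	return (off + alignby - 1) & ~((int32_t) alignby - 1);
}

/*
 * Assign attcacheoff to each leading fixed-width attribute and return the
 * index of the first attribute left without one (natts if all are cached).
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int32_t		off = 0;
	int			i;

	/* Start with every offset marked unknown. */
	for (i = 0; i < natts; i++)
		attrs[i].attcacheoff = -1;

	for (i = 0; i < natts; i++)
	{
		/* The first variable-width attribute ends the cacheable run. */
		if (attrs[i].attlen <= 0)
			break;
		off = sketch_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}
	return i;
}

/*
 * Return the index of the first NULL among the first natts attributes, or
 * natts if there is none.  As in heap tuples, a set bit means "not NULL".
 */
static int
sketch_first_null_attr(const uint8_t *bits, int natts)
{
	assert(natts >= 0);
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 7))))
			return i;
	}
	return natts;
}
```

With both values in hand, a deformer can take the smaller of the two as the
number of attributes it may fetch straight from their cached offsets, and
only then fall back to walking the tuple.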

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable-width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert-enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable-width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none, set the first NULL to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop differs from the one after
+	 * it only by omitting the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks
+	 * as we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..150a7a24785 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,87 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there are any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	res += __builtin_ctz(~bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we didn't mask out the bits beyond natts, we may have found a
+	 * bit beyond natts, so we must cap the result to natts.
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
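
For anyone wanting to poke at the first_null_attr() logic outside the tree, here is a minimal standalone sketch of the same idea: skip whole 0xFF bytes, find the lowest 0 bit in the first byte that is not 0xFF, then cap at natts. The names slow_first_null(), rightmost_zero_pos() and fast_first_null() are made up for this sketch and are not part of the patch. The patch uses __builtin_ctz() or the pg_rightmost_zero_pos[] table where the sketch uses a plain loop. As in the patch, the caller must make sure the byte at index natts >> 3 is readable.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: a set bit means "not NULL", as with att_isnull(). */
static int
slow_first_null(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 7))))
			return i;
	}
	return natts;
}

/* 0-based position of the lowest 0 bit in a byte, or 8 if there is none. */
static int
rightmost_zero_pos(uint8_t b)
{
	int			pos = 0;

	while (pos < 8 && (b & (1 << pos)))
		pos++;
	return pos;
}

/*
 * Byte-at-a-time version.  Reads bits[natts >> 3], so that byte must exist
 * even when natts is a multiple of 8 (assumption carried over from the
 * patch's "at least one 0 bit somewhere in the mask" precondition).
 */
static int
fast_first_null(const uint8_t *bits, int natts)
{
	int			lastByte = natts >> 3;
	int			bytenum;
	int			res;

	assert(natts >= 0);

	/* Skip bytes where all 8 attributes are non-NULL */
	for (bytenum = 0; bytenum < lastByte; bytenum++)
	{
		if (bits[bytenum] != 0xFF)
			break;
	}

	res = (bytenum << 3) + rightmost_zero_pos(bits[bytenum]);

	/* No masking was done, so the bit found may be at or beyond natts */
	return res < natts ? res : natts;
}
```

Comparing fast_first_null() against slow_first_null() over every 2-byte bitmap and every natts from 0 to 16 is cheap, and mirrors what the USE_ASSERT_CHECKING block in the patch does at runtime.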


From 1393b8bf8885ce097baa9a757fc8923b61706e22 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v7 3/4] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0
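
To make the staged-deform semantics of the attnum array a bit more concrete, here is a small standalone model of what the comment in deform_bench.c describes. It only models the comment, not the executor code, and staged_deform_work() is a hypothetical name used just for this sketch. It assumes each tuple starts with zero valid attributes and that a pass never redoes attributes an earlier pass already deformed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * For each requested attnum, record how many attributes that pass would
 * newly deform.  'nvalid' plays the role of the slot's count of already
 * deformed attributes and starts at 0 for each tuple.
 */
static void
staged_deform_work(const int *attnums, size_t npasses, int *work)
{
	int			nvalid = 0;

	for (size_t i = 0; i < npasses; i++)
	{
		assert(attnums[i] > 0);

		if (attnums[i] > nvalid)
		{
			/* deform the remainder up to attnums[i] */
			work[i] = attnums[i] - nvalid;
			nvalid = attnums[i];
		}
		else
		{
			/* already deformed this far in an earlier pass */
			work[i] = 0;
		}
	}
}
```

With '{1,10}' the model gives 1 attribute in the first pass and 9 in the second. With '{10,1}' it gives 10 and then 0, which is why the array should normally be in ascending order.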


From bcd35959d96d237208643faa3e9d6ed196a34391 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v7 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  48 ++++-----
 src/include/access/tupmacs.h        | 155 ++++++++++++++++++++++++++++
 3 files changed, 180 insertions(+), 29 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..c3bc010d824 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,24 +1029,26 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup + tup->t_hoff;
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+		populate_isnull_array(bp, natts, isnull);
 	}
 	else
 	{
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
+		memset(isnull, 0, sizeof(bool) * natts);
 	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
-
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
 	 * to the first NULL attribute.  The offset is effectively fixed for these
@@ -1065,7 +1067,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1101,19 +1102,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1118,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 150a7a24785..21ee3cc3594 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,11 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
+#include "varatt.h"
 
+#ifdef DO_AVX512_VERSION
+#include <immintrin.h>
+#endif
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +33,90 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null bitmap into a boolean array.
+ *
+ * XXX there does not seem to be an efficient way to do this without AVX512.
+ * Here we use a 256-element lookup table indexed by a bitmap byte; each
+ * entry holds the 8 isnull array elements for that byte value.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			n_full_bytes = natts >> 3;
+	int			attnum = n_full_bytes << 3;
+	bool	   *isnull_ptr = isnull;
+
+#ifndef DO_AVX512_VERSION
+	/* This is 2 kilobytes! */
+	static const uint64 isnull_to_array[256] = {
+		0x0101010101010101, 0x0101010101010100, 0x0101010101010001, 0x0101010101010000, 0x0101010101000101, 0x0101010101000100, 0x0101010101000001, 0x0101010101000000,
+		0x0101010100010101, 0x0101010100010100, 0x0101010100010001, 0x0101010100010000, 0x0101010100000101, 0x0101010100000100, 0x0101010100000001, 0x0101010100000000,
+		0x0101010001010101, 0x0101010001010100, 0x0101010001010001, 0x0101010001010000, 0x0101010001000101, 0x0101010001000100, 0x0101010001000001, 0x0101010001000000,
+		0x0101010000010101, 0x0101010000010100, 0x0101010000010001, 0x0101010000010000, 0x0101010000000101, 0x0101010000000100, 0x0101010000000001, 0x0101010000000000,
+		0x0101000101010101, 0x0101000101010100, 0x0101000101010001, 0x0101000101010000, 0x0101000101000101, 0x0101000101000100, 0x0101000101000001, 0x0101000101000000,
+		0x0101000100010101, 0x0101000100010100, 0x0101000100010001, 0x0101000100010000, 0x0101000100000101, 0x0101000100000100, 0x0101000100000001, 0x0101000100000000,
+		0x0101000001010101, 0x0101000001010100, 0x0101000001010001, 0x0101000001010000, 0x0101000001000101, 0x0101000001000100, 0x0101000001000001, 0x0101000001000000,
+		0x0101000000010101, 0x0101000000010100, 0x0101000000010001, 0x0101000000010000, 0x0101000000000101, 0x0101000000000100, 0x0101000000000001, 0x0101000000000000,
+		0x0100010101010101, 0x0100010101010100, 0x0100010101010001, 0x0100010101010000, 0x0100010101000101, 0x0100010101000100, 0x0100010101000001, 0x0100010101000000,
+		0x0100010100010101, 0x0100010100010100, 0x0100010100010001, 0x0100010100010000, 0x0100010100000101, 0x0100010100000100, 0x0100010100000001, 0x0100010100000000,
+		0x0100010001010101, 0x0100010001010100, 0x0100010001010001, 0x0100010001010000, 0x0100010001000101, 0x0100010001000100, 0x0100010001000001, 0x0100010001000000,
+		0x0100010000010101, 0x0100010000010100, 0x0100010000010001, 0x0100010000010000, 0x0100010000000101, 0x0100010000000100, 0x0100010000000001, 0x0100010000000000,
+		0x0100000101010101, 0x0100000101010100, 0x0100000101010001, 0x0100000101010000, 0x0100000101000101, 0x0100000101000100, 0x0100000101000001, 0x0100000101000000,
+		0x0100000100010101, 0x0100000100010100, 0x0100000100010001, 0x0100000100010000, 0x0100000100000101, 0x0100000100000100, 0x0100000100000001, 0x0100000100000000,
+		0x0100000001010101, 0x0100000001010100, 0x0100000001010001, 0x0100000001010000, 0x0100000001000101, 0x0100000001000100, 0x0100000001000001, 0x0100000001000000,
+		0x0100000000010101, 0x0100000000010100, 0x0100000000010001, 0x0100000000010000, 0x0100000000000101, 0x0100000000000100, 0x0100000000000001, 0x0100000000000000,
+		0x0001010101010101, 0x0001010101010100, 0x0001010101010001, 0x0001010101010000, 0x0001010101000101, 0x0001010101000100, 0x0001010101000001, 0x0001010101000000,
+		0x0001010100010101, 0x0001010100010100, 0x0001010100010001, 0x0001010100010000, 0x0001010100000101, 0x0001010100000100, 0x0001010100000001, 0x0001010100000000,
+		0x0001010001010101, 0x0001010001010100, 0x0001010001010001, 0x0001010001010000, 0x0001010001000101, 0x0001010001000100, 0x0001010001000001, 0x0001010001000000,
+		0x0001010000010101, 0x0001010000010100, 0x0001010000010001, 0x0001010000010000, 0x0001010000000101, 0x0001010000000100, 0x0001010000000001, 0x0001010000000000,
+		0x0001000101010101, 0x0001000101010100, 0x0001000101010001, 0x0001000101010000, 0x0001000101000101, 0x0001000101000100, 0x0001000101000001, 0x0001000101000000,
+		0x0001000100010101, 0x0001000100010100, 0x0001000100010001, 0x0001000100010000, 0x0001000100000101, 0x0001000100000100, 0x0001000100000001, 0x0001000100000000,
+		0x0001000001010101, 0x0001000001010100, 0x0001000001010001, 0x0001000001010000, 0x0001000001000101, 0x0001000001000100, 0x0001000001000001, 0x0001000001000000,
+		0x0001000000010101, 0x0001000000010100, 0x0001000000010001, 0x0001000000010000, 0x0001000000000101, 0x0001000000000100, 0x0001000000000001, 0x0001000000000000,
+		0x0000010101010101, 0x0000010101010100, 0x0000010101010001, 0x0000010101010000, 0x0000010101000101, 0x0000010101000100, 0x0000010101000001, 0x0000010101000000,
+		0x0000010100010101, 0x0000010100010100, 0x0000010100010001, 0x0000010100010000, 0x0000010100000101, 0x0000010100000100, 0x0000010100000001, 0x0000010100000000,
+		0x0000010001010101, 0x0000010001010100, 0x0000010001010001, 0x0000010001010000, 0x0000010001000101, 0x0000010001000100, 0x0000010001000001, 0x0000010001000000,
+		0x0000010000010101, 0x0000010000010100, 0x0000010000010001, 0x0000010000010000, 0x0000010000000101, 0x0000010000000100, 0x0000010000000001, 0x0000010000000000,
+		0x0000000101010101, 0x0000000101010100, 0x0000000101010001, 0x0000000101010000, 0x0000000101000101, 0x0000000101000100, 0x0000000101000001, 0x0000000101000000,
+		0x0000000100010101, 0x0000000100010100, 0x0000000100010001, 0x0000000100010000, 0x0000000100000101, 0x0000000100000100, 0x0000000100000001, 0x0000000100000000,
+		0x0000000001010101, 0x0000000001010100, 0x0000000001010001, 0x0000000001010000, 0x0000000001000101, 0x0000000001000100, 0x0000000001000001, 0x0000000001000000,
+		0x0000000000010101, 0x0000000000010100, 0x0000000000010001, 0x0000000000010000, 0x0000000000000101, 0x0000000000000100, 0x0000000000000001, 0x0000000000000000
+	};
+#endif
+
+	for (int i = 0; i < n_full_bytes; i++)
+	{
+#ifdef DO_AVX512_VERSION
+		/* The array isn't required when AVX512 is available.  Testing only */
+		/*
+		 * XXX requires CFLAGS="-D DO_AVX512_VERSION -march=x86-64-v4" and an
+		 * avx512 machine
+		 */
+
+		/*
+		 * The bits array has 1s for values and 0s for NULLs. Bit-flip that to
+		 * get 1s for NULLs and use that mask to populate the register with
+		 * true values and zeros (falses) when the mask bit isn't set.
+		 */
+		__m128i		res = _mm_maskz_set1_epi8(~bits[i], true);
+
+		/* Grab lower 64-bits of the 128-bit register */
+		uint64		src = _mm_cvtsi128_si64(res);
+
+		memcpy(isnull_ptr, &src, sizeof(uint64));
+#else
+		memcpy(isnull_ptr, &isnull_to_array[bits[i]], sizeof(uint64));
+#endif
+		isnull_ptr += 8;
+	}
+
+	/* handle remaining attributes */
+	for (; attnum < natts; attnum++)
+		isnull[attnum] = att_isnull(attnum, bits);
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +159,73 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer(), leaving *off set to the possibly
+ *		unaligned number of bytes into 'tupptr', ready to deform the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
-- 
2.51.0
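
A standalone sketch of the lookup-table approach in populate_isnull_array() may help when experimenting with it. The names null_lut, build_null_lut() and expand_null_bitmap() are invented for this sketch. Unlike the patch, the table here is built at runtime and stored as bytes rather than uint64 constants. As far as I can tell, the uint64 constants in the patch put attribute 0 in the least significant byte, which matches memory order only on little-endian hosts, so the byte-wise table sidesteps that. Like the patch, the sketch assumes sizeof(bool) == 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* null_lut[b][i] is 1 when bit i of bitmap byte b is 0, i.e. attr is NULL */
static uint8_t null_lut[256][8];

/* Must be called once before expand_null_bitmap() */
static void
build_null_lut(void)
{
	for (int b = 0; b < 256; b++)
	{
		for (int bit = 0; bit < 8; bit++)
			null_lut[b][bit] = (b & (1 << bit)) ? 0 : 1;
	}
}

/*
 * Expand 'natts' bits of a null bitmap (set bit = not NULL) into an array
 * of bools (true = NULL), 8 attributes per table lookup.
 */
static void
expand_null_bitmap(const uint8_t *bits, int natts, bool *isnull)
{
	int			n_full_bytes = natts >> 3;

	assert(sizeof(bool) == 1);

	for (int i = 0; i < n_full_bytes; i++)
		memcpy(isnull + i * 8, null_lut[bits[i]], 8);

	/* handle the attributes in the final, partially used byte */
	for (int attnum = n_full_bytes << 3; attnum < natts; attnum++)
		isnull[attnum] = !(bits[attnum >> 3] & (1 << (attnum & 7)));
}
```

The table costs 2 kilobytes either way. Whether the byte-wise layout compiles to the same single 8-byte copy per bitmap byte as the uint64 version is something I have not measured.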



Attachments:

  [text/plain] v7-0001-Add-empty-TupleDescFinalize-function.patch (29.0K, 2-v7-0001-Add-empty-TupleDescFinalize-function.patch)
  download | inline diff:
From f7efca3f15f517bb0c807c3e09ad40aab4c08f6b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v7 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0
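
The patch above only adds TupleDescFinalize() calls; at this point the
macro is still a no-op. As a rough illustration of the call-order
contract it sets up, here is a minimal sketch written against a toy
descriptor rather than PostgreSQL's TupleDescData. Every toy_* name is
hypothetical. Treating firstNonCachedOffAttr == -1 as "not finalized
yet" is an assumption drawn from the Assert that 0002 adds, and counting
only leading fixed-width columns is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ATTS 8

/* Hypothetical stand-in for a tuple descriptor (not TupleDescData). */
typedef struct ToyTupleDesc
{
	int			natts;
	int			attlen[TOY_MAX_ATTS];	/* > 0 fixed width, -1 variable width */
	int			firstNonCachedOffAttr;	/* -1 until toy_finalize() runs */
} ToyTupleDesc;

/* Start a descriptor; it is not usable for deforming yet. */
static void
toy_create(ToyTupleDesc *desc, int natts)
{
	assert(natts >= 0 && natts <= TOY_MAX_ATTS);
	desc->natts = natts;
	desc->firstNonCachedOffAttr = -1;
}

/* Describe one column; attnum is 1-based, as in TupleDescInitEntry(). */
static void
toy_init_entry(ToyTupleDesc *desc, int attnum, int attlen)
{
	assert(attnum >= 1 && attnum <= desc->natts);
	desc->attlen[attnum - 1] = attlen;
}

/* Must be called once all columns are described. */
static void
toy_finalize(ToyTupleDesc *desc)
{
	int			i = 0;

	/* assumption: only leading fixed-width columns count as "cached" */
	while (i < desc->natts && desc->attlen[i] > 0)
		i++;
	desc->firstNonCachedOffAttr = i;
}

static bool
toy_is_finalized(const ToyTupleDesc *desc)
{
	return desc->firstNonCachedOffAttr >= 0;
}

/* A consumer that, like the deform code in 0002, insists on finalization. */
static int
toy_cached_prefix(const ToyTupleDesc *desc)
{
	/* Did someone forget to call toy_finalize()? */
	assert(toy_is_finalized(desc));
	return desc->firstNonCachedOffAttr;
}
```

The point of the sentinel is that a descriptor builder which skips the
finalize step fails loudly in assert-enabled builds instead of silently
deforming with stale offsets.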



  [image/gif] m2_on_v7_with_0004.gif (80.0K, 3-m2_on_v7_with_0004.gif)

  [text/plain] v7-0002-Precalculate-CompactAttribute-s-attcacheoff.patch (50.0K, 4-v7-0002-Precalculate-CompactAttribute-s-attcacheoff.patch)
  download | inline diff:
From 8d6e9b363c37d74a2e6e4972bff710cf1be2a88f Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v7 2/4] Precalculate CompactAttribute's attcacheoff

Precalculating attcacheoff allows code to be removed from the tuple
deforming routines.  The smaller code can run more quickly.  This also
adds a dedicated loop that deforms the portion of the tuple whose
offsets are known in advance, which makes deforming much faster when the
table's leading columns are fixed-width types holding non-NULL values.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  82 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 539 insertions(+), 642 deletions(-)
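
As a reading aid for the hunks below, here is a minimal sketch of the
idea in the commit message: compute the offsets of the leading
fixed-width columns once per descriptor, then decide per tuple how many
of those cached offsets can be trusted. It is a simplified model and not
the code from tupdesc.c or tupmacs.h. The toy_* names and the ToyAttr and
ToyDesc types are hypothetical, and stopping at the first
non-fixed-width column is an assumption. The null bitmap convention (a
set bit means the column is not NULL) matches the bitmap tests visible
in the removed code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ATTS 16

typedef struct ToyAttr
{
	int16_t		attlen;			/* > 0: fixed width; -1: variable width */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
	int32_t		attcacheoff;	/* precomputed offset, or -1 if unknown */
} ToyAttr;

typedef struct ToyDesc
{
	int			natts;
	int			firstNonCachedOffAttr;	/* first column without an offset */
	ToyAttr		attrs[TOY_MAX_ATTS];
} ToyDesc;

/* Round off up to the next multiple of alignby (a power of two). */
static int32_t
toy_align(int32_t off, uint8_t alignby)
{
	int32_t		mask = (int32_t) alignby - 1;

	return (off + mask) & ~mask;
}

/* Precompute offsets for the leading run of fixed-width columns. */
static void
toy_finalize(ToyDesc *desc)
{
	int32_t		off = 0;
	int			i;

	for (i = 0; i < desc->natts; i++)
		desc->attrs[i].attcacheoff = -1;

	for (i = 0; i < desc->natts; i++)
	{
		ToyAttr    *att = &desc->attrs[i];

		if (att->attlen <= 0)
			break;				/* offsets after this depend on the data */
		off = toy_align(off, att->attalignby);
		att->attcacheoff = off;
		off += att->attlen;
	}
	desc->firstNonCachedOffAttr = i;
}

/* Index of the first NULL among the first natts columns, else natts. */
static int
toy_first_null_attr(const uint8_t *bits, int natts)
{
	for (int i = 0; i < natts; i++)
	{
		if (!(bits[i >> 3] & (1 << (i & 0x07))))
			return i;
	}
	return natts;
}

/*
 * How many leading columns of this tuple can use cached offsets.
 * bits is NULL when the tuple has no null bitmap.
 */
static int
toy_usable_cached_attrs(const ToyDesc *desc, const uint8_t *bits, int natts)
{
	int			n = desc->firstNonCachedOffAttr;

	if (natts < n)
		n = natts;
	if (bits != NULL)
	{
		int			firstnull = toy_first_null_attr(bits, natts);

		if (firstnull < n)
			n = firstnull;
	}
	return n;
}
```

With columns of (length, alignment) (4, 4), (1, 1), (8, 8) and one
variable-width column, this model caches offsets 0, 4 and 8 and leaves
the fourth column uncached. A tuple whose third column is NULL can then
only use the first two cached offsets, which mirrors how the patched
heap_deform_tuple() clamps cacheoffattrs by firstnullattr.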

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..150a7a24785 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,87 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	res += __builtin_ctz(~bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
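
For readers following along, the byte-at-a-time scan in first_null_attr() above can be tried outside the tree. The following is only a sketch under stated assumptions, not code from the patch: first_null_attr_sketch() and rightmost_zero_pos() are hypothetical names, a plain loop stands in for both __builtin_ctz() and the lookup table, and uint8_t stands in for bits8. As in the patch, the caller is assumed to guarantee that the bitmap contains at least one 0 bit, so the final byte read stays inside the bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest 0 bit in a byte, or 8 if every bit is 1. */
static int
rightmost_zero_pos(uint8_t byte)
{
	int			pos = 0;

	while (pos < 8 && (byte & (1u << pos)) != 0)
		pos++;
	return pos;
}

/*
 * Hypothetical stand-alone version of the scan.  A 1 bit means "not NULL"
 * and a 0 bit means "NULL".  Returns the 0-based index of the first NULL
 * attribute, or natts if none of the first natts attributes is NULL.
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			last_byte = natts >> 3;
	int			bytenum = 0;
	int			res;

	assert(natts >= 0);

	/* Skip whole bytes that contain no NULLs */
	while (bytenum < last_byte && bits[bytenum] == 0xFF)
		bytenum++;

	res = (bytenum << 3) + rightmost_zero_pos(bits[bytenum]);

	/* No masking was done, so a bit at or beyond natts may have matched */
	return res < natts ? res : natts;
}
```

With a bitmap of {0xFF, 0xFB} and natts = 16, the first byte is skipped and the lowest 0 bit of 0xFB is bit 2, giving attribute index 10.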



  [text/plain] v7-0003-Introduce-deform_bench-test-module.patch (7.3K, 5-v7-0003-Introduce-deform_bench-test-module.patch)
  download | inline diff:
From 1393b8bf8885ce097baa9a757fc8923b61706e22 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v7 3/4] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0



  [text/plain] v7-0004-Various-experimental-changes.patch (13.3K, 6-v7-0004-Various-experimental-changes.patch)
  download | inline diff:
From bcd35959d96d237208643faa3e9d6ed196a34391 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v7 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  48 ++++-----
 src/include/access/tupmacs.h        | 155 ++++++++++++++++++++++++++++
 3 files changed, 180 insertions(+), 29 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..c3bc010d824 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,24 +1029,26 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup + tup->t_hoff;
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+		populate_isnull_array(bp, natts, isnull);
 	}
 	else
 	{
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
+		memset(isnull, 0, sizeof(bool) * natts);
 	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
-
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
 	 * to the first NULL attribute.  The offset is effectively fixed for these
@@ -1065,7 +1067,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1101,19 +1102,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1118,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 150a7a24785..21ee3cc3594 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,11 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
+#include "varatt.h"
 
+#ifdef DO_AVX512_VERSION
+#include <immintrin.h>
+#endif
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +33,90 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null array into a boolean array.
+ *
+ * XXX there does not seem to be an efficient way to do this without AVX512.
+ * Here we use a 256 element array with all possible patterns for 8 isnull
+ * array elements for each possible byte value for a bitmask element.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			n_full_bytes = natts >> 3;
+	int			attnum = n_full_bytes << 3;
+	bool	   *isnull_ptr = isnull;
+
+#ifndef DO_AVX512_VERSION
+	/* This is 2 kilobytes! */
+	static const uint64 isnull_to_array[256] = {
+		0x0101010101010101, 0x0101010101010100, 0x0101010101010001, 0x0101010101010000, 0x0101010101000101, 0x0101010101000100, 0x0101010101000001, 0x0101010101000000,
+		0x0101010100010101, 0x0101010100010100, 0x0101010100010001, 0x0101010100010000, 0x0101010100000101, 0x0101010100000100, 0x0101010100000001, 0x0101010100000000,
+		0x0101010001010101, 0x0101010001010100, 0x0101010001010001, 0x0101010001010000, 0x0101010001000101, 0x0101010001000100, 0x0101010001000001, 0x0101010001000000,
+		0x0101010000010101, 0x0101010000010100, 0x0101010000010001, 0x0101010000010000, 0x0101010000000101, 0x0101010000000100, 0x0101010000000001, 0x0101010000000000,
+		0x0101000101010101, 0x0101000101010100, 0x0101000101010001, 0x0101000101010000, 0x0101000101000101, 0x0101000101000100, 0x0101000101000001, 0x0101000101000000,
+		0x0101000100010101, 0x0101000100010100, 0x0101000100010001, 0x0101000100010000, 0x0101000100000101, 0x0101000100000100, 0x0101000100000001, 0x0101000100000000,
+		0x0101000001010101, 0x0101000001010100, 0x0101000001010001, 0x0101000001010000, 0x0101000001000101, 0x0101000001000100, 0x0101000001000001, 0x0101000001000000,
+		0x0101000000010101, 0x0101000000010100, 0x0101000000010001, 0x0101000000010000, 0x0101000000000101, 0x0101000000000100, 0x0101000000000001, 0x0101000000000000,
+		0x0100010101010101, 0x0100010101010100, 0x0100010101010001, 0x0100010101010000, 0x0100010101000101, 0x0100010101000100, 0x0100010101000001, 0x0100010101000000,
+		0x0100010100010101, 0x0100010100010100, 0x0100010100010001, 0x0100010100010000, 0x0100010100000101, 0x0100010100000100, 0x0100010100000001, 0x0100010100000000,
+		0x0100010001010101, 0x0100010001010100, 0x0100010001010001, 0x0100010001010000, 0x0100010001000101, 0x0100010001000100, 0x0100010001000001, 0x0100010001000000,
+		0x0100010000010101, 0x0100010000010100, 0x0100010000010001, 0x0100010000010000, 0x0100010000000101, 0x0100010000000100, 0x0100010000000001, 0x0100010000000000,
+		0x0100000101010101, 0x0100000101010100, 0x0100000101010001, 0x0100000101010000, 0x0100000101000101, 0x0100000101000100, 0x0100000101000001, 0x0100000101000000,
+		0x0100000100010101, 0x0100000100010100, 0x0100000100010001, 0x0100000100010000, 0x0100000100000101, 0x0100000100000100, 0x0100000100000001, 0x0100000100000000,
+		0x0100000001010101, 0x0100000001010100, 0x0100000001010001, 0x0100000001010000, 0x0100000001000101, 0x0100000001000100, 0x0100000001000001, 0x0100000001000000,
+		0x0100000000010101, 0x0100000000010100, 0x0100000000010001, 0x0100000000010000, 0x0100000000000101, 0x0100000000000100, 0x0100000000000001, 0x0100000000000000,
+		0x0001010101010101, 0x0001010101010100, 0x0001010101010001, 0x0001010101010000, 0x0001010101000101, 0x0001010101000100, 0x0001010101000001, 0x0001010101000000,
+		0x0001010100010101, 0x0001010100010100, 0x0001010100010001, 0x0001010100010000, 0x0001010100000101, 0x0001010100000100, 0x0001010100000001, 0x0001010100000000,
+		0x0001010001010101, 0x0001010001010100, 0x0001010001010001, 0x0001010001010000, 0x0001010001000101, 0x0001010001000100, 0x0001010001000001, 0x0001010001000000,
+		0x0001010000010101, 0x0001010000010100, 0x0001010000010001, 0x0001010000010000, 0x0001010000000101, 0x0001010000000100, 0x0001010000000001, 0x0001010000000000,
+		0x0001000101010101, 0x0001000101010100, 0x0001000101010001, 0x0001000101010000, 0x0001000101000101, 0x0001000101000100, 0x0001000101000001, 0x0001000101000000,
+		0x0001000100010101, 0x0001000100010100, 0x0001000100010001, 0x0001000100010000, 0x0001000100000101, 0x0001000100000100, 0x0001000100000001, 0x0001000100000000,
+		0x0001000001010101, 0x0001000001010100, 0x0001000001010001, 0x0001000001010000, 0x0001000001000101, 0x0001000001000100, 0x0001000001000001, 0x0001000001000000,
+		0x0001000000010101, 0x0001000000010100, 0x0001000000010001, 0x0001000000010000, 0x0001000000000101, 0x0001000000000100, 0x0001000000000001, 0x0001000000000000,
+		0x0000010101010101, 0x0000010101010100, 0x0000010101010001, 0x0000010101010000, 0x0000010101000101, 0x0000010101000100, 0x0000010101000001, 0x0000010101000000,
+		0x0000010100010101, 0x0000010100010100, 0x0000010100010001, 0x0000010100010000, 0x0000010100000101, 0x0000010100000100, 0x0000010100000001, 0x0000010100000000,
+		0x0000010001010101, 0x0000010001010100, 0x0000010001010001, 0x0000010001010000, 0x0000010001000101, 0x0000010001000100, 0x0000010001000001, 0x0000010001000000,
+		0x0000010000010101, 0x0000010000010100, 0x0000010000010001, 0x0000010000010000, 0x0000010000000101, 0x0000010000000100, 0x0000010000000001, 0x0000010000000000,
+		0x0000000101010101, 0x0000000101010100, 0x0000000101010001, 0x0000000101010000, 0x0000000101000101, 0x0000000101000100, 0x0000000101000001, 0x0000000101000000,
+		0x0000000100010101, 0x0000000100010100, 0x0000000100010001, 0x0000000100010000, 0x0000000100000101, 0x0000000100000100, 0x0000000100000001, 0x0000000100000000,
+		0x0000000001010101, 0x0000000001010100, 0x0000000001010001, 0x0000000001010000, 0x0000000001000101, 0x0000000001000100, 0x0000000001000001, 0x0000000001000000,
+		0x0000000000010101, 0x0000000000010100, 0x0000000000010001, 0x0000000000010000, 0x0000000000000101, 0x0000000000000100, 0x0000000000000001, 0x0000000000000000
+	};
+#endif
+
+	for (int i = 0; i < n_full_bytes; i++)
+	{
+#ifdef DO_AVX512_VERSION
+		/* The array isn't required when AVX512 is available.  Testing only */
+		/*
+		 * XXX requires CFLAGS="-D DO_AVX512_VERSION -march=x86-64-v4" and an
+		 * avx512 machine
+		 */
+
+		/*
+		 * The bits array has 1s for values and 0s for NULLs. Bit-flip that to
+		 * get 1s for NULLs and use that mask to populate the register with
+		 * true values and zeros (falses) when the mask bit isn't set.
+		 */
+		__m128i		res = _mm_maskz_set1_epi8(~bits[i], true);
+
+		/* Grab lower 64-bits of the 128-bit register */
+		uint64		src = _mm_cvtsi128_si64(res);
+
+		memcpy(isnull_ptr, &src, sizeof(uint64));
+#else
+		memcpy(isnull_ptr, &isnull_to_array[bits[i]], sizeof(uint64));
+#endif
+		isnull_ptr += 8;
+	}
+
+	/* handle remaining attributes */
+	for (; attnum < natts; attnum++)
+		isnull[attnum] = att_isnull(attnum, bits);
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +159,73 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer() resulting in *off pointing to the perhaps
+ *		unaligned number of bytes into 'tupptr', ready to deform the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
-- 
2.51.0
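
As an aside on the 256-entry isnull_to_array table in populate_isnull_array() above: each entry can be derived from the bitmap byte rather than typed out. The sketch below is an illustration only, not part of the patch; isnull_table_entry() and expand_bitmap_byte() are hypothetical names. It assumes the layout implied by the table literals, namely that byte k of the 64-bit entry is 0x01 exactly when bit k of the bitmap byte is 0 (attribute k is NULL). Note that copying such an entry into a bool array with memcpy() puts byte 0 first only on a little-endian machine; that is my reading of the layout, not something stated in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the table entry for one null-bitmap byte.
 * Bit k of 'bitmap_byte' is 1 when attribute k is NOT NULL, so byte k of
 * the result is set to 0x01 exactly when bit k is 0.
 */
static uint64_t
isnull_table_entry(uint8_t bitmap_byte)
{
	uint64_t	entry = 0;

	for (int k = 0; k < 8; k++)
	{
		if ((bitmap_byte & (1u << k)) == 0)
			entry |= UINT64_C(1) << (8 * k);
	}
	return entry;
}

/*
 * Expand one bitmap byte into eight bools by extracting each byte of the
 * entry with shifts, which does not depend on the machine's byte order.
 */
static void
expand_bitmap_byte(uint8_t bitmap_byte, bool *isnull)
{
	uint64_t	entry = isnull_table_entry(bitmap_byte);

	for (int k = 0; k < 8; k++)
		isnull[k] = ((entry >> (8 * k)) & 0xFF) != 0;
}
```

The first three computed entries, 0x0101010101010101, 0x0101010101010100 and 0x0101010101010001, match the first three literals in the patch's table.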



^ permalink  raw  reply  [nested|flat] 19+ messages in thread

* Re: More speedups for tuple deformation
@ 2026-01-30 17:11  Andres Freund <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 2 replies; 19+ messages in thread

From: Andres Freund @ 2026-01-30 17:11 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

On 2026-01-31 00:10:42 +1300, David Rowley wrote:
> Thank you for looking at this again.
>
> On Thu, 29 Jan 2026 at 05:26, Andres Freund <[email protected]> wrote:
> > On 2026-01-28 02:34:26 +1300, David Rowley wrote:
> > > +     firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
> >
> > FWIW, in a few experiments on my cascade lake systems, this branch (well, it
> > ends up as a cmov) ends up causing a surprisingly large performance
> > bottleneck.  I don't really see a way around that, but I thought I'd mention it.
>
> Yeah, I believe this is the primary reason that I'm fighting the small
> regression on the 0 extra column test.  I thought it might be because
> the mov has a dependency and wait on natts being calculated, which
> needs to access fields in the tuple header.

I agree that it's related to that. You have a chain of multiple computations
that lead to a higher aggregate latency.

natts depends on HeapTupleHeaderGetNatts(tup), and firstNonCacheOffsetAttr
depends on both tupleDesc->firstNonCachedOffAttr and natts.

Afaict the dependencies are like the following:

0) a memory load (for tuple->t_data), likely to be cached
1) a memory load (for HeapTupleHeaderGetNatts(tup)), likely to miss in cache,
   depends on 0)
2) some bit-shiftery to compute natts, depends on 1)
3) a conditional move (the Min()), depends on 2)
4) a memory load (for slot->tupdesc), no dependencies, likely cached
5) a memory load (for tupdesc->firstNonCachedOffAttr), depends on 4), cached
6) a conditional move (the Min()), depends on 3) and 5)
7) a memory load (for HeapTupleHasNulls()), likely to miss in cache, but can
   piggy back on the cacheline load from 1), depends on 0)
8) a conditional branch (if (hasnulls)), depends on 7)

And that's without even taking the hasnulls == true case into account.

Before all of those are completely executed ("retired"), none of the
speculatively executed instructions, e.g. the contents of the if (attnum <
firstNonCacheOffsetAttr) branch, can be retired.
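
To make the chain above concrete, here is a minimal stand-alone sketch of the same sequence, with each statement tagged by the step it corresponds to. It is an illustration under stated assumptions, not the patch's code: the Mock* structs, DeformSetup and deform_setup() are hypothetical, and the two mask values are assumed to mirror HEAP_NATTS_MASK and HEAP_HASNULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the real structs; only the fields used here. */
typedef struct MockTupleHeader
{
	uint16_t	t_infomask2;	/* low bits hold the attribute count */
	uint16_t	t_infomask;		/* flag bits, including "has nulls" */
} MockTupleHeader;

typedef struct MockTuple
{
	MockTupleHeader *t_data;
} MockTuple;

typedef struct MockTupleDesc
{
	int			firstNonCachedOffAttr;
} MockTupleDesc;

typedef struct MockSlot
{
	MockTupleDesc *tupdesc;
} MockSlot;

#define MOCK_NATTS_MASK	0x07FF	/* assumed to mirror HEAP_NATTS_MASK */
#define MOCK_HASNULL	0x0001	/* assumed to mirror HEAP_HASNULL */

typedef struct DeformSetup
{
	int			natts;
	int			firstNonCacheOffsetAttr;
	bool		hasnulls;
} DeformSetup;

static DeformSetup
deform_setup(const MockSlot *slot, const MockTuple *tuple, int natts)
{
	DeformSetup s;
	const MockTupleHeader *tup = tuple->t_data; /* step 0: load */
	int			tuple_natts;
	int			cached;

	assert(natts >= 0);

	/* steps 1 and 2: load from the tuple header, then mask */
	tuple_natts = tup->t_infomask2 & MOCK_NATTS_MASK;

	/* step 3: first Min(), waits on step 2 */
	s.natts = tuple_natts < natts ? tuple_natts : natts;

	/* steps 4 and 5: two loads that do not depend on the tuple */
	cached = slot->tupdesc->firstNonCachedOffAttr;

	/* step 6: second Min(), waits on steps 3 and 5 */
	s.firstNonCacheOffsetAttr = cached < s.natts ? cached : s.natts;

	/* step 7: another load from the tuple header, waits on step 0 */
	s.hasnulls = (tup->t_infomask & MOCK_HASNULL) != 0;

	/* step 8 is the caller's branch on s.hasnulls */
	return s;
}
```

Only steps 4 and 5 are independent of the tuple itself; everything else hangs off the header load in step 1.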


This is why I like the idea of keeping track of whether we can rely on NOT
NULL columns to be present (I think that means we're evaluating expressions
other than constraint checks for new rows). It allows the leading NOT NULL
fixed-width columns to be decoded without having to wait for a good chunk of
the computations above. That's a performance boon even if we later have
nullable or varlength columns.



> I wonder if there's some reason the compiler or CPU can't defer calculating
> firstNonCacheOffsetAttr until later. Maybe I should try moving it later in
> the code to see if that helps.

I think it's the opposite, you want to have it as early as possible so it has
the lowest latency impact by the time the value is required, by starting the
first elements in the dependency chain before the rest.


> > On the topic of tupleDesc->firstNonCachedOffAttr - shouldn't that be an
> > AttrNumber? Not that it'll make a difference perf or space wise, just for
> > clarity.
> >
> > Hm, I guess natts isn't an AttrNumber either. Not sure why?
>
> I noticed that too, but took the path of least resistance and made
> firstNonCachedOffAttr an int too.

Probably reasonable.


> I did wonder why natts wasn't an AttrNumber. If they both were AttrNumbers,
> I wouldn't need to make the TupleDesc struct bigger. Right now, I've
> enlarged it by 8 bytes by adding firstNonCachedOffAttr.

One related thing: I noticed a speedup a few days ago when making sure that
both the slot and the descriptor are aligned to a cacheline boundary,
presumably because it reduces the number of cachelines that need to be in L1
for good performance.  But the results were somewhat inconsistent.


Separately I was seeing performance changes in cases where I shouldn't really
see them on the Cascade Lake system. After a while I noticed that the
differences I was seeing were due to differences in how effective the uop
cache is.  A bunch of searching led to information about the "jcc erratum",
which in turn can be mitigated via "-Wa,-mbranches-within-32B-boundaries". I
am not actually clear whether my CPU is directly affected by that erratum, but
nonetheless, it improved performance quite substantially.


> One problem is that a bunch of functions accept int:
> CreateTemplateTupleDesc(int natts), CreateTupleDesc(int natts,
> Form_pg_attribute *attrs). Then BuildDescFromLists() sets natts based on
> list_length(). Maybe CreateTemplateTupleDesc() could Assert or throw an
> error if natts does not fit in 16-bits.

We'd probably fail with larger values anyway, so it might be a good idea to
detect that regardless of changing the argument type.


> >
> > > +     if (hasnulls)
> > > +     {
> > > +             bp = tup->t_bits;
> > > +             firstNullAttr = first_null_attr(bp, natts);
> > > +             firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
> > > +     }
> > > +     else
> > > +     {
> > > +             bp = NULL;
> > > +             firstNullAttr = natts;
> > > +     }
> > > +
> > > +     values = slot->tts_values;
> > > +     isnull = slot->tts_isnull;
> > > +     tp = (char *) tup + tup->t_hoff;
> >
> > Another stall I see is due to the t_hoff computation - which makes sense, it's
> > in the tuple header and none of the deforming can happen without knowing the
> > address. I think in the !hasnulls case, the only influence on it is
> > MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)), so we could just hardcode
> > that?
>
> hmm. I wonder why it even needs to exist.

There are a lot of questions like that in our on-disk format. There's a lot of
decisions in there that make fast decoding way harder than it has to be...


> If the null bitmap is there,
> you can calculate how many bytes from natts. I tried doing "tp = (char
> *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));" for the
> !hasnulls case and it's hard to tell if it helps. See the 0004 patch.

Interestingly, it helps quite measurably here.


> I'm somewhat hesitant to go against the grain here on how to calculate where
> the tuple data starts.

Hm. I don't think it's particularly risky. We could even add an error branch
for that assumption being violated - as long as the other computations don't
depend on the result of the conditional branch, the impact on performance
should be quite minimal.
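
A minimal sketch of that idea, assuming an 8-byte maximum alignment and a mock header whose layout puts t_bits at byte 23. MockHeapTupleHeader, the MOCK_* macros and both functions are hypothetical names for illustration, and the sketch does not claim the assumption holds for every tuple on disk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumes 8-byte maximum alignment. */
#define MOCK_MAXALIGN(len)  (((uintptr_t) (len) + 7) & ~(uintptr_t) 7)
#define MOCK_HASNULL        0x0001

/* Mock header for illustration only; not the PostgreSQL definition. */
typedef struct MockHeapTupleHeader
{
    uint8_t     t_choice[12];
    uint8_t     t_ctid[6];
    uint16_t    t_infomask2;
    uint16_t    t_infomask;
    uint8_t     t_hoff;
    uint8_t     t_bits[1];      /* stands in for a flexible array member */
} MockHeapTupleHeader;

/* Compile-time constant: where the data starts when there is no bitmap. */
#define MOCK_DATA_OFFSET_NO_NULLS \
    MOCK_MAXALIGN(offsetof(MockHeapTupleHeader, t_bits))

/* Start of the attribute data; avoids reading t_hoff when no NULLs exist. */
static const char *
mock_tuple_data_start(const MockHeapTupleHeader *tup)
{
    if ((tup->t_infomask & MOCK_HASNULL) == 0)
        return (const char *) tup + MOCK_DATA_OFFSET_NO_NULLS;
    return (const char *) tup + tup->t_hoff;
}

/*
 * Side check for the "error branch": nothing on the deforming path depends
 * on its result, so it need not delay the address computation above.
 */
static bool
mock_hoff_is_consistent(const MockHeapTupleHeader *tup)
{
    if (tup->t_infomask & MOCK_HASNULL)
        return true;            /* bitmap present; not checked here */
    return tup->t_hoff == MOCK_DATA_OFFSET_NO_NULLS;
}
```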


> > > +     else if (attnum == 0)
> > >       {
> > >               /* Start from the first attribute */
> > >               off = 0;
> > > -             slow = false;
> > >       }
> > >       else
> > >       {
> > >               /* Restore state from previous execution */
> > >               off = *offp;
> > > -             slow = TTS_SLOW(slot);
> > >       }
> >
> > Do we actually need both of these branches? Shouldn't *offp be set to 0 in the
> > attnum == 0 case?
>
> I see tts_heap_clear() zeros it, so I think it should be ok. Doesn't
> feel quite as robust, however.

Should be easy enough to assert that it's correct.  I don't think it's very
likely we'd just forget resetting it.  It's already catastrophic if we set it
wrongly.



> > 3) I briefly experimented with this code, and I think we may be able to
> >    optimize the combination of att_pointer_alignby(), fetch_att() and
> >    att_addlength_pointer(). They all do quite related work, and for byvalue
> >    types, we know at compile time what the alignment requirement for each of
> >    the supported attlen is.
>
> Is this true? Isn't there some nearby discussion about AIX having
> 4-byte double alignment?

The AIX stuff is just bonkers. Having different alignment based on where in an
aggregate a double is is just insane.

We could probably just accommodate that with a compile-time ifdef.


> I've taken a go at implementing a function called align_fetch_then_add(),
> which rolls all the macros into one (See 0004). I just can't see any
> improvements with it. Maybe I've missed something that could be more
> optimal. I did even ditch one of the cases from the switch(attlen). It might
> be ok to do that now as we can check for invalid attlens for byval types
> when we populate the CompactAttribute.

You could move the TYPEALIGN(attalignby, *off) calls into the attlen switch
and call it with a constant argument.
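
Roughly what that could look like, as a sketch only. It assumes that a by-value attribute's alignment equals its length (the assumption discussed above, which the AIX case may break), treats every other attlen as 8, and returns the raw unsigned value without the sign handling a real fetch would need. MOCK_TYPEALIGN and mock_align_fetch_then_add() are hypothetical names, not the 0004 code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round len up to a multiple of a, where a is a power of two. */
#define MOCK_TYPEALIGN(a, len) \
    (((uintptr_t) (len) + ((a) - 1)) & ~((uintptr_t) ((a) - 1)))

/*
 * Align *off for a by-value attribute of width attlen, fetch it from tp,
 * and advance *off past it.  Each case passes a literal alignment, so the
 * compiler sees a constant mask instead of a value loaded at run time.
 */
static uint64_t
mock_align_fetch_then_add(const char *tp, size_t *off, int attlen)
{
    uint64_t    result;

    switch (attlen)
    {
        case 1:
            result = (uint8_t) tp[*off];    /* no alignment needed */
            *off += 1;
            break;
        case 2:
            {
                uint16_t    v;

                *off = MOCK_TYPEALIGN(2, *off);
                memcpy(&v, tp + *off, sizeof(v));
                result = v;
                *off += sizeof(v);
                break;
            }
        case 4:
            {
                uint32_t    v;

                *off = MOCK_TYPEALIGN(4, *off);
                memcpy(&v, tp + *off, sizeof(v));
                result = v;
                *off += sizeof(v);
                break;
            }
        default:                /* assumed to be attlen == 8 */
            {
                uint64_t    v;

                *off = MOCK_TYPEALIGN(8, *off);
                memcpy(&v, tp + *off, sizeof(v));
                result = v;
                *off += sizeof(v);
                break;
            }
    }
    return result;
}
```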


> > Have you experimented setting isnull[] in a dedicated loop if there are nulls
> > and then in this loop just checking isnull[attnum]? Seems like that could
> > perhaps be combined with the work in first_null_attr() and be more efficient
> > than doing an att_isnull() separately for each column.
>
> Yes. I experiment with that quite a bit. I wasn't able to make it any
> faster than setting the isnull element in the same loop as the
> tts_values element. What I did try was having a dedicated tight loop
> like; for (int i = attnum; i < firstNullAttr; i++) isnull[i] = false;,
> but the compiler would always try to optimise that into an inlined
> memset which would result in poorly performing code in cases with a
> small number of columns due to the size and alignment prechecks.

Yea, that kind of transformation is pretty annoying and makes little sense
here :(.

I was thinking of actually computing the value of isnull[] based on the null
bitmap (as you also try below).


> In the attached 0004 patch I've experimented with this again. This
> time, I wrote a function that converts the null bitmap into the isnull
> array using a lookup table.

Oh. I was just thinking of something roughly like

  int nullbyte_i = attnum >> 3;
  /* nullcol advances in the inner loop, one column per bitmap bit */
  for (int nullcol = attnum; nullcol < natts;)
  {
      /* bitmap bits are 1 for values and 0 for NULLs, so invert */
      bits8 nullbyte = ~bp[nullbyte_i++];

      for (int onebyte = 0; onebyte < 8; onebyte++)
      {
          if (nullcol < natts)
             tts_isnull[nullcol] = nullbyte & 0x01;
          nullbyte >>= 1;
          nullcol++;
      }
  }

This isn't quite right, as we'd need to deal with starting to deform at an
attribute that's not % 8 = 0 (I'd probably just do the whole byte even if we'd
redo a few columns). And probably lots of other stuff.

With a bit of care the inner loop should be unrollable with all the moves as
conditional moves depending on nullcol < natts. Or such.

Or we could just make sure tts_isnull is always sized to be divisible by 8,
that'd presumably allow considerably better code to be generated.


I'd hope this would be more efficient than doing

	static inline bool
	att_isnull(int ATT, const bits8 *BITS)
	{
		return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
	}

for each column.


> I spent a bit of time trying to figure out a way to do this without the
> lookup table and only came up with a method that requires AVX512
> instructions. I coded that up, but it requires building with
> -march=x86-64-v4, which will likely cause many other reasons for the
> performance to vary.

Yea, I doubt we want that... Too many tuples will be too short to benefit.


> The machine that likes 0004 the most (using the lookup table method of
> setting the isnull array) is the Apple M2. All the tests apart from
> the 0 extra column test became 30-90% faster. Previously the tests
> that had to do att_isnull didn't improve very much. The 0 extra column
> test regressed quite a bit. 50% slower on all but test 1 and 5 (the
> ones without NULLs). See the attached graph. The Zen2 machine also
> perhaps quite likes it, but not for the 0 extra column test. I'm
> struggling to get stable performance results from that machine right
> now. My Zen 4 laptop isn't a fan of it, but also not getting very
> stable performance results from that either.

Hm. A 2kB lookup table in the middle of deforming is probably not great for
L1...


> I'm curious to see what your Intel machines think of 0004 vs not having it.

Scheduling an experiment with it.


Greetings,

Andres Freund







* Re: More speedups for tuple deformation
@ 2026-01-30 20:03  Andres Freund <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 0 replies; 19+ messages in thread

From: Andres Freund @ 2026-01-30 20:03 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

Hi,

On 2026-01-30 12:11:44 -0500, Andres Freund wrote:
> > In the attached 0004 patch I've experimented with this again. This
> > time, I wrote a function that converts the null bitmap into the isnull
> > array using a lookup table.
>
> Oh. I was just thinking of something roughly like
>
>   int nullbyte_i = attnum >> 3;
>   /* nullcol advances in the inner loop, one column per bitmap bit */
>   for (int nullcol = attnum; nullcol < natts;)
>   {
>       /* bitmap bits are 1 for values and 0 for NULLs, so invert */
>       bits8 nullbyte = ~bp[nullbyte_i++];
>
>       for (int onebyte = 0; onebyte < 8; onebyte++)
>       {
>           if (nullcol < natts)
>              tts_isnull[nullcol] = nullbyte & 0x01;
>           nullbyte >>= 1;
>           nullcol++;
>       }
>   }
>
> This isn't quite right, as we'd need to deal with starting to deform at an
> attribute that's not % 8 = 0 (I'd probably just do the whole byte even if we'd
> redo a few columns). And probably lots of other stuff.
>
> With a bit of care the inner loop should be unrollable with all the moves as
> conditional moves depending on nullcol < natts. Or such.
>
> Or we could just make sure tts_isnull is always sized to be divisible by 8,
> that'd presumably allow considerably better code to be generated.
>
>
> I'd hope this would be more efficient than doing
>
> 	static inline bool
> 	att_isnull(int ATT, const bits8 *BITS)
> 	{
> 		return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
> 	}
>
> for each column.

I couldn't quite let go of this, thinking there had to be a non-simd way to do
this efficiently (by basically doing SWAR).  And I think there is:

We can multiply one byte of the null bitmap with a carefully chosen value that
spreads each bit into the higher bytes. E.g.

0b111 * (1 <<  0)           = 0b                111
0b111 * (1 <<  7)           = 0b        11_10000000
0b111 * ((1 << 0) | (1 << 7))
                            = 0b        11_10000111
0b111 * (1 << 14)           = 0b1_11000000_00000000
0b111 * ((1 << 0) | (1 << 7) | (1 << 14))
                            = 0b1_11000011_10000111
...

Then, we can fairly easily mask out the unnecessary bits.


However, as may be apparent above, this won't work for the input 0xff, as the
carry from each byte's multiplications will "overflow its byte" and flip the
next bit to 0. But that's not hard to handle:

a) A single branch would probably be fine?

b) Compute the low 4 bits and high 4 bits of the bitmap in parallel and or
   the result together. By just looking at 4 null bits, no carry overflow is
   possible. Due to being branchless, that's probably better.


Something like

    /*
     * The bits array has 1s for values and 0s for NULLs. Bit-flip that to
     * get 1s for NULLs.
     */
    uint8_t nullbyte = ~bp[nullbyte_i++];
    /* 8 bytes where each byte is 0 or 1 depending on whether null bitmap is set */
    uint64_t isnull_8;

    /*
     * Multiplier ensuring that input bit 0 is reflected in output bit 0, input bit 1 at output bit 8, etc.
     * Other bits also will often be set and need to be masked away.
     */
    uint32_t spread_bits_32 = (1U << 0) | (1U << 7) | (1U << 14) | (1U << 21);
    uint64_t mask_bits_64 = 0x0101010101010101ULL;

    /* convert the lower 4 bits of null bitmap word into 32 bit int */
    isnull_8 = (nullbyte & 0xf) * spread_bits_32;
    /* convert the upper 4 bits of null bitmap word into 32 bit int, shift into the upper 32 bit */
    isnull_8 |= ((uint64_t)((nullbyte >> 4) * spread_bits_32)) << 32;

    /* mask out all the bogus bits (could also be done as a 32bit op?)*/
    isnull_8 &= mask_bits_64;

    memcpy(&tts_isnull[nullcol], &isnull_8, sizeof(isnull_8));


should, I think, be better than a 2kB array.  Perhaps not quite as fast as the
AVX512 method, but it should be decent on most hardware...
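
For anyone wanting to try this outside the tree, here is a self-contained sketch of the nibble-split multiply, with a per-column reference in the style of att_isnull() to compare against. mock_expand_nullbyte() and mock_att_isnull() are hypothetical names. The sketch assumes sizeof(bool) == 1, and on big-endian builds it assumes GCC/Clang's __builtin_bswap64 so that the lowest result byte lands in the first array element.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Expand one null-bitmap byte into 8 bool-sized bytes, 1 meaning NULL.
 * Bitmap convention: a 1 bit is a value, a 0 bit is a NULL.
 * isnull_out must have room for 8 elements.
 */
static void
mock_expand_nullbyte(uint8_t bitmap_byte, bool *isnull_out)
{
    /* input bit k ends up at output bit 8 * k, for k = 0..3 */
    const uint32_t spread = (1U << 0) | (1U << 7) | (1U << 14) | (1U << 21);
    const uint64_t mask = UINT64_C(0x0101010101010101);
    uint8_t     nullbyte = (uint8_t) ~bitmap_byte;  /* 1s now mean NULL */
    uint64_t    isnull_8;

    /* low nibble into the low 32 bits; 4 input bits cannot carry */
    isnull_8 = (uint64_t) ((nullbyte & 0x0f) * spread);
    /* high nibble into the high 32 bits */
    isnull_8 |= (uint64_t) ((uint32_t) (nullbyte >> 4) * spread) << 32;
    /* keep only the lowest bit of each byte */
    isnull_8 &= mask;

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* memcpy stores the most significant byte first here, so reverse */
    isnull_8 = __builtin_bswap64(isnull_8);
#endif
    memcpy(isnull_out, &isnull_8, sizeof(isnull_8));
}

/* Per-column reference, equivalent in spirit to att_isnull(). */
static bool
mock_att_isnull(int att, const uint8_t *bits)
{
    return !(bits[att >> 3] & (1 << (att & 0x07)));
}
```

Looping over all 256 byte values and comparing the 8 outputs against mock_att_isnull() is a cheap way to check the constants.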

Greetings,

Andres Freund







* Re: More speedups for tuple deformation
@ 2026-01-31 02:47  John Naylor <[email protected]>
  parent: David Rowley <[email protected]>
  1 sibling, 1 reply; 19+ messages in thread

From: John Naylor @ 2026-01-31 02:47 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Andres Freund <[email protected]>; Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Tue, Jan 27, 2026 at 8:34 PM David Rowley <[email protected]> wrote:
> I've also included a slightly revised patch. I made a small change to
> the first_null_attr() to get rid of the masking of higher attnums and
> also now making use of __builtin_ctz to find the first NULL attnum in
> the byte. For compilers that don't support that, I've included a
> pg_rightmost_*zero*_pos table. I didn't want to use the pg_bitutils
> table for the rightmost *one* pos as it meant having to special-case
> what happens when using index 255, as that would return 0, and I want
> 8. I'll make the MSVC version use _BitScanForward() in the next patch.

I don't get why we'd need to special-case 255 in only one place.

+ /* Process all bytes up to just before the byte for the natts index */
+ for (bytenum = 0; bytenum < lastByte; bytenum++)
+ {
+   /* break if there's any NULL attrs (a 0 bit) */
+   if (bits[bytenum] != 0xFF)
+     break;
+ }
+
+ res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+   res += __builtin_ctz(~bits[bytenum]);
+#else
+   res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif

If bits[bytenum] is 255, then __builtin_ctz(0) is undefined. The top
of the function says

+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.

...in which case it should be well defined everywhere. Am I missing
something? If we need to handle the 255 case, this should work:

pg_rightmost_one_pos32(~((uint32) bits[bytenum]))
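
A standalone sketch of that widened form, calling __builtin_ctz directly instead of pg_rightmost_one_pos32() so that it compiles outside the tree (GCC or Clang assumed). mock_first_zero_bit() is a hypothetical stand-in for first_null_attr() and takes a byte count rather than natts.

```c
#include <stdint.h>

/*
 * Position of the first 0 bit in a bitmap of nbytes bytes (nbytes >= 1).
 * If no byte contains a 0 bit, the result is nbytes * 8.
 */
static int
mock_first_zero_bit(const uint8_t *bits, int nbytes)
{
    int         bytenum;

    /* stop at the first byte containing a 0 bit, or at the last byte */
    for (bytenum = 0; bytenum < nbytes - 1; bytenum++)
    {
        if (bits[bytenum] != 0xFF)
            break;
    }

    /*
     * Widen before complementing: ~((uint32_t) 0xFF) is 0xFFFFFF00, which
     * is non-zero, so the ctz argument is always defined and a byte of
     * 0xFF yields 8.
     */
    return (bytenum << 3) + __builtin_ctz(~((uint32_t) bits[bytenum]));
}
```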

--
John Naylor
Amazon Web Services







* Re: More speedups for tuple deformation
@ 2026-01-31 03:44  David Rowley <[email protected]>
  parent: John Naylor <[email protected]>
  0 siblings, 1 reply; 19+ messages in thread

From: David Rowley @ 2026-01-31 03:44 UTC (permalink / raw)
  To: John Naylor <[email protected]>; +Cc: Andres Freund <[email protected]>; Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Sat, 31 Jan 2026 at 15:48, John Naylor <[email protected]> wrote:
> +   res += __builtin_ctz(~bits[bytenum]);

> If bits[bytenum] is 255, then __builtin_ctz(0) is undefined. The top
> of the function says

Oops, I forgot to cast the byte to uint32 before the bitwise-not. I've
fixed locally. Still processing Andres' comments.

> + * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
> + * not necessarily < natts.
>
> ...in which case it should be well defined everywhere. Am I missing
> something? If we need to handle the 255 case, this should work:
>
> pg_rightmost_one_pos32(~((uint32) bits[bytenum]))

I'd rather handle that in a single byte as the fallback path in that
function requires byte-at-a-time processing.

David







* Re: More speedups for tuple deformation
@ 2026-01-31 06:01  John Naylor <[email protected]>
  parent: David Rowley <[email protected]>
  0 siblings, 0 replies; 19+ messages in thread

From: John Naylor @ 2026-01-31 06:01 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Andres Freund <[email protected]>; Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Sat, Jan 31, 2026 at 10:45 AM David Rowley <[email protected]> wrote:
> On Sat, 31 Jan 2026 at 15:48, John Naylor <[email protected]> wrote:
> > pg_rightmost_one_pos32(~((uint32) bits[bytenum]))
>
> I'd rather handle that in a single byte as the fallback path in that
> function requires byte-at-a-time processing.

I spot checked some less common animals in the buildfarm and i686,
ppc64le, riscv64, loongarch64, and s390x all have the builtin. Of
these, only riscv64 seems to lack a single instruction implementation.
It's good to keep the old fallback path around just in case, but I
doubt a new fallback would get any coverage at all.

-- 
John Naylor
Amazon Web Services







* Re: More speedups for tuple deformation
@ 2026-01-31 11:27  David Rowley <[email protected]>
  parent: Andres Freund <[email protected]>
  1 sibling, 0 replies; 19+ messages in thread

From: David Rowley @ 2026-01-31 11:27 UTC (permalink / raw)
  To: Andres Freund <[email protected]>; +Cc: Chao Li <[email protected]>; PostgreSQL Developers <[email protected]>

On Sat, 31 Jan 2026 at 06:11, Andres Freund <[email protected]> wrote:
> This is why I like the idea of keeping track of whether we can rely on NOT
> NULL columns to be present (I think that means we're evaluating expressions
> other than constraint checks for new rows). It allows the leading NOT NULL
> fixed-width columns to be decoded without having to wait for a good chunk of
> the computations above. That's a performance boon even if we later have
> nullable or varlength columns.

I can look into this. As we both know, we can't apply this
optimisation in every case as there are places in the code which form
then deform tuples before NOT NULL constraints are checked. Perhaps
the slot can store a flag to mention if the optimisation is valid to
apply or not. It doesn't look like the flag can be part of the
TupleDesc since we cache those in relcache. I'm imagining that
TupleDescFinalize() calculates another field which could be the max
cached offset that's got a NOT NULL constraint and isn't attmissing. I
think this will need another dedicated loop in
slot_deform_heap_tuple() to loop up to that attribute before doing the
firstNonCacheOffsetAttr loop. Maybe it's worth making that new loop
only handle byval types so that we can get rid of the byval branch
from fetch_att(). That should make it a bit faster at the expense of
not being able to handle fixed-width types that have an attlen > 8,
which I don't think are excessively common...  It's hard to know.
UUIDs do get used as primary key columns, and making the PK column the
first column in the table is a common pattern to follow.
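
As a rough sketch of what such a finalize-time calculation might look like, here is the byval-only variant over a mock attribute array. MockCompactAttr and mock_count_guaranteed_prefix() are hypothetical and only carry the fields mentioned above; whether the optimisation may be applied at all would still depend on the per-slot flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock of the per-attribute fields referred to above; illustration only. */
typedef struct MockCompactAttr
{
    int32_t     attcacheoff;    /* cached byte offset, or -1 if unknown */
    int16_t     attlen;         /* > 0 for fixed-width types */
    bool        attbyval;
    bool        attnotnull;
    bool        atthasmissing;
} MockCompactAttr;

/*
 * Number of leading attributes that are NOT NULL, have no missing value,
 * are passed by value and have a cached offset.  These could be deformed
 * without first looking at the tuple header.
 */
static int
mock_count_guaranteed_prefix(const MockCompactAttr *attrs, int natts)
{
    int         i;

    for (i = 0; i < natts; i++)
    {
        const MockCompactAttr *att = &attrs[i];

        if (!att->attnotnull || att->atthasmissing ||
            !att->attbyval || att->attcacheoff < 0)
            break;
    }
    return i;
}
```

With this variant a table whose first column is a 16-byte UUID gets a prefix of 0, which is the trade-off mentioned above.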

Overall, I agree that this will likely speed up some of the tests, but
it will also likely further increase the overheads for the 0 extra
column tests.

> > > 3) I briefly experimented with this code, and I think we may be able to
> > >    optimize the combination of att_pointer_alignby(), fetch_att() and
> > >    att_addlength_pointer(). They all do quite related work, and for byvalue
> > >    types, we know at compile time what the alignment requirement for each of
> > >    the supported attlen is.
> >
> > Is this true? Isn't there some nearby discussion about AIX having
> > 4-byte double alignment?
>
> The AIX stuff is just bonkers. Having different alignment based on where in an
> aggregate a double is is just insane.
>
> We could probably just accommodate that with a compile-time ifdef.

I've not tried this yet. There's probably more than
align_fetch_then_add() that could benefit from the knowledge that
attalign is the same as attlen for byval types. align_fetch_then_add()
is the only one I'm currently aware of, and that only saves doing
attalign - 1, which only saves 1 cycle.

> > I've taken a go at implementing a function called align_fetch_then_add(),
> > which rolls all the macros into one (See 0004). I just can't see any
> > improvements with it. Maybe I've missed something that could be more
> > optimal. I did even ditch one of the cases from the switch(attlen). It might
> > be ok to do that now as we can check for invalid attlens for byval types
> > when we populate the CompactAttribute.
>
> You could move the TYPEALIGN(attalignby, *off) calls into the attlen switch
> and call it with a constant argument.

I think that would need the assumption you mentioned above that the
byval type's attlen defines the alignment requirements.

> > > Have you experimented setting isnull[] in a dedicated loop if there are nulls
> > > and then in this loop just checking isnull[attnum]? Seems like that could
> > > perhaps be combined with the work in first_null_attr() and be more efficient
> > > than doing an att_isnull() separately for each column.
> >
> > Yes. I experiment with that quite a bit. I wasn't able to make it any
> > faster than setting the isnull element in the same loop as the
> > tts_values element. What I did try was having a dedicated tight loop
> > like; for (int i = attnum; i < firstNullAttr; i++) isnull[i] = false;,
> > but the compiler would always try to optimise that into an inlined
> > memset which would result in poorly performing code in cases with a
> > small number of columns due to the size and alignment prechecks.
>
> Yea, that kind of transformation is pretty annoying and makes little sense
> here :(.
>
> I was thinking of actually computing the value of isnull[] based on the null
> bitmap (as you also try below).

I've taken the code you posted in [1] to do this. Thanks for that. It
works very well. I made it so the tts_isnull array size is rounded up
to the next multiple of 8. Without that we'll overwrite the sentinel
byte in the palloc'd chunk if we assume we can always write to
tts_isnull in multiples of 8 elements.  Because I've made tts_isnull
round up to the next 8 elements, I can make the !hasnulls loop zero
the memory 8 bytes at a time. That solves the annoying memset inlining
I was getting. That seems to mostly fix the regression in the zero
extra column test. The t_hoff calculation replacement maybe helped a
bit too. I didn't test individually.
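
The rounding and the 8-bytes-at-a-time clear can be sketched in isolation as below. Both function names are hypothetical, the sketch assumes sizeof(bool) == 1, and it makes no claim about what a given compiler turns the loop into.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Number of isnull elements to allocate: natts rounded up to 8. */
static int
mock_isnull_array_size(int natts)
{
    return (natts + 7) & ~7;
}

/*
 * Clear isnull[] for natts attributes, one 8-byte store per iteration.
 * The array must hold mock_isnull_array_size(natts) elements, because the
 * last store may run past natts up to the next multiple of 8.
 */
static void
mock_clear_isnull(bool *isnull, int natts)
{
    const uint64_t zero = 0;

    for (int i = 0; i < natts; i += 8)
        memcpy(&isnull[i], &zero, sizeof(zero));
}
```

For natts = 9 this allocates 16 elements and issues two stores; without the rounding, the second store would write 7 bytes past the end of the array.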

> > I spent a bit of time trying to figure out a way to do this without the
> > lookup table and only came up with a method that requires AVX512
> > instructions. I coded that up, but it requires building with
> > -march=x86-64-v4, which will likely cause many other reasons for the
> > performance to vary.
>
> Yea, I doubt we want that... Too many tuples will be too short to benefit.

It was only doing 8 columns at once, not 64. I think it's irrelevant
now with your 0x204081 trick.

I've revised the 0004 patch to use the 0x204081 trick. I also added
fetch_att_noerr(), which gets rid of the elog and assumes the default switch
case must be attlen == 8. I didn't modify the existing function as
that will be used for attlen values that don't come from
CompactAttribute. Also got rid of the t_hoff usage. With those and the
new code that zeros the tts_isnull array 8 bytes at a time, things are
looking pretty good.

I've attached 3 graphs, which are now looking a bit better. The gcc
results are not quite as good. There's still a small regression with 0
extra column test, and overall, the results are not as impressive as
clang's. I've not yet studied why. The Apple M2 machine averaged over
53% faster than master over all tests, or ~63% if you exclude the 0
extra column tests.

I'll look into adding another optional optimisation for the NOT NULL
columns. Attaching the updated patch in the meantime. This includes
the fix for the __builtin_ctz() issue mentioned by John.

David

From df730d93ed3f7f9cd719c69c213ab60cac28c7fb Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v8 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0


From 6e464b41393dbb7fdeff853285e9d80051b2c12b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v8 2/4] Precalculate CompactAttribute's attcacheoff

This allows the offset-caching code to be removed from the tuple deform
routines, which shrinks the code a little and can make it run more
quickly.  It also adds a dedicated loop to deform the leading portion of
the tuple whose offsets are known, which makes deforming much faster
when a leading set of the table's columns are fixed-width types holding
non-NULL values.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  84 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 541 insertions(+), 642 deletions(-)
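
For readers skimming the series, here is a minimal standalone sketch of
the precalculation idea named in the subject line.  It is not the code
from this patch: SketchAttr, sketch_align() and sketch_finalize() are
hypothetical names, and the sketch assumes the simplest rule, namely that
offsets are cached only for the leading run of fixed-width attributes and
that caching stops at the first attribute with attlen <= 0.  The offset
arithmetic mirrors the USE_ASSERT_CHECKING block in heap_deform_tuple()
(align the running offset, record it, then add attlen).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct SketchAttr
{
	int32_t		attcacheoff;	/* precalculated offset, or -1 if unknown */
	int16_t		attlen;			/* > 0 for fixed-width, <= 0 otherwise */
	uint8_t		attalignby;		/* alignment in bytes, a power of two */
} SketchAttr;

/* Round "off" up to the next multiple of "alignby". */
static int32_t
sketch_align(int32_t off, uint8_t alignby)
{
	/* alignby must be a non-zero power of two */
	assert(alignby != 0 && (alignby & (alignby - 1)) == 0);

	return (off + (alignby - 1)) & ~((int32_t) alignby - 1);
}

/*
 * Assign attcacheoff to each attribute in the leading fixed-width run and
 * mark every later attribute as having no cached offset (-1).
 *
 * Returns the index of the first attribute without a cached offset, or
 * natts when every attribute received one.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int32_t		off = 0;
	int			first_noncached;
	int			i;

	for (i = 0; i < natts; i++)
	{
		/* Assumption: stop at the first non-fixed-width attribute. */
		if (attrs[i].attlen <= 0)
			break;

		off = sketch_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	first_noncached = i;

	/* Offsets past this point depend on the tuple's actual contents. */
	for (; i < natts; i++)
		attrs[i].attcacheoff = -1;

	return first_noncached;
}
```

For a descriptor shaped like (int4, bool, int8, text), the sketch assigns
offsets 0, 4 and 8 to the first three attributes and returns 3.  A deform
loop can then read those three columns directly from tp + attcacheoff
whenever the tuple has no NULL among them, and only needs to walk the
tuple from the fourth column onwards.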

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there is none, set the first NULL to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..ce7a88df611 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,89 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	/*
+	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
+	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
+	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
+	 */
+	res += __builtin_ctz(~(uint32) bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
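
For anyone following along without the tree handy, here is a minimal standalone sketch of the idea behind first_null_attr() above. It is not the patch's code: first_zero_bit() is a made-up name, it uses a plain bit loop rather than __builtin_ctz or the lookup table, and it returns early instead of reading the byte at natts >> 3 when natts is a multiple of 8. A 0 bit in the bitmap means NULL, as with att_isnull().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the 0-based position of the first 0 bit in
 * 'bits', considering only the first 'natts' bits.  Returns natts when no
 * 0 bit is found in that range.
 */
static int
first_zero_bit(const uint8_t *bits, int natts)
{
	int			lastbyte = natts >> 3;
	int			bytenum = 0;
	int			res;
	uint8_t		inverted;

	assert(natts >= 0);

	/* skip whole bytes that contain no 0 bits */
	while (bytenum < lastbyte && bits[bytenum] == 0xFF)
		bytenum++;

	res = bytenum << 3;

	/* every bit we are allowed to look at was a 1 bit */
	if (res >= natts)
		return natts;

	/* after inverting, the first NULL attribute is the lowest 1 bit */
	inverted = (uint8_t) ~bits[bytenum];
	while (res < natts && (inverted & 1) == 0)
	{
		inverted >>= 1;
		res++;
	}

	return res;
}
```

With a two-byte bitmap of 0xFF, 0xFB and natts = 16, the first byte is skipped whole and the 0 bit at position 2 of the second byte gives 10.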


From 98e27bc74fbc3f3da9048430ea9aa2ffa156aa78 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v8 3/4] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0
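
To illustrate the staged-deformation comment in deform_bench.c, below is a toy model of how the attnum array drives the work done per tuple. This is only a sketch under stated assumptions: ToySlot, toy_getsomeattrs() and toy_run_stages() are invented for illustration and only count attributes; they are not PostgreSQL APIs and do no real deforming.

```c
#include <assert.h>

/* Toy stand-in for a TupleTableSlot: only tracks how far deforming got. */
typedef struct ToySlot
{
	int			natts;			/* attributes in the tuple */
	int			nvalid;			/* attributes deformed so far */
} ToySlot;

/*
 * Hypothetical model of a "deform up to attnum" request.  Returns how many
 * attributes this call had to deform; 0 means an earlier stage already
 * covered the request.
 */
static int
toy_getsomeattrs(ToySlot *slot, int attnum)
{
	int			newly;

	assert(attnum >= 0 && attnum <= slot->natts);

	if (attnum <= slot->nvalid)
		return 0;

	newly = attnum - slot->nvalid;
	slot->nvalid = attnum;
	return newly;
}

/* Run every stage against one fresh tuple, recording the work per stage. */
static void
toy_run_stages(int natts, const int *attnums, int nstages, int *work)
{
	ToySlot		slot = {natts, 0};

	for (int i = 0; i < nstages; i++)
		work[i] = toy_getsomeattrs(&slot, attnums[i]);
}
```

In this model '{1,10}' on a 10 column table does 1 attribute in the first stage and 9 in the second, while '{10,1}' does all 10 up front and nothing in the second stage, which is why ascending order is the useful one.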


From 8160dd33825abfcae2a14e78ed85a4c8e94e0274 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v8 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  63 ++++++------
 src/include/access/tupmacs.h        | 149 ++++++++++++++++++++++++++--
 src/include/executor/tuptable.h     |   4 +-
 4 files changed, 183 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..5e20a05a830 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,23 +1029,34 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup +
+			MAXALIGN(offsetof(HeapTupleHeaderData, t_bits) +
+					 BITMAPLEN(HeapTupleHeaderGetNatts(tup)));
+		Assert(tp == (char *) tup + tup->t_hoff);
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
+		populate_isnull_array(bp, natts, isnull);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
 	}
 	else
 	{
+		uint64	   *isnull64 = (uint64 *) isnull;
+		Size		asize = (natts + 7) >> 3;
+
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
-	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
+		/* No nulls, set all isnull elements to false */
+		for (int i = 0; i < asize; i++)
+			isnull64[i] = 0;
+	}
 
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
@@ -1065,7 +1076,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1074,7 +1084,9 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 			offcheck += cattr->attlen;
 #endif
 
-			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			values[attnum] = fetch_att_noerr(tp + cattr->attcacheoff,
+											 cattr->attbyval,
+											 cattr->attlen);
 		} while (++attnum < firstNonCacheOffsetAttr);
 
 		/*
@@ -1101,19 +1113,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1129,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1452,7 +1453,7 @@ ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
 	slot->tts_values = (Datum *)
 		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(Datum));
 	slot->tts_isnull = (bool *)
-		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(bool));
+		MemoryContextAlloc(slot->tts_mcxt, MAXALIGN(tupdesc->natts * sizeof(bool)));
 }
 
 /* --------------------------------
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index ce7a88df611..d53784899fb 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,7 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
-
+#include "varatt.h"
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +29,49 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null bitmap into a boolean array.
+ *
+ * Caller must ensure that the isnull array is sized so it contains
+ * at least as many elements as there are bits in the 'bits' array.
+ * This is required because we always round 'natts' up to the next multiple
+ * of 8.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			nbytes = (natts + 7) >> 3;
+
+	/*
+	 * Multiplying a NULL bitmap byte by this value results in the lowest bit
+	 * in each byte being set the same as each bit of the bitmap.  We perform
+	 * this as 2 32-bit operations rather than a single 64-bit operation as
+	 * multiplying by the required value to do this in 64-bits would result in
+	 * overflowing a uint64 in some cases.
+	 */
+#define SPREAD_BITS_MULTIPLIER_32 0x204081U
+
+	for (int i = 0; i < nbytes; i++, isnull += 8)
+	{
+		uint64		isnull_8;
+		bits8		nullbyte = ~bits[i];
+
+		/* convert the lower 4 bits of null bitmap byte into a 32-bit int */
+		isnull_8 = (nullbyte & 0xf) * SPREAD_BITS_MULTIPLIER_32;
+
+		/*
+		 * convert the upper 4 bits of null bitmap byte into a 32-bit int and
+		 * shift it into the upper 32 bits
+		 */
+		isnull_8 |= ((uint64) ((nullbyte >> 4) * SPREAD_BITS_MULTIPLIER_32)) << 32;
+
+		/* mask out all other bits apart from the lowest bit of each byte */
+		isnull_8 &= UINT64CONST(0x0101010101010101);
+		memcpy(isnull, &isnull_8, sizeof(uint64));
+	}
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +114,100 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * Same, but no error checking for invalid attlens for byval types.  This
+ * is safe to use when attlen comes from CompactAttribute as we validate the
+ * length when populating that struct.
+ */
+static inline Datum
+fetch_att_noerr(const void *T, bool attbyval, int attlen)
+{
+	if (attbyval)
+	{
+		switch (attlen)
+		{
+			case sizeof(char):
+				return CharGetDatum(*((const char *) T));
+			case sizeof(int16):
+				return Int16GetDatum(*((const int16 *) T));
+			case sizeof(int32):
+				return Int32GetDatum(*((const int32 *) T));
+			default:
+				Assert(attlen == sizeof(int64));
+				return Int64GetDatum(*((const int64 *) T));
+		}
+	}
+	else
+		return PointerGetDatum(T);
+}
+
+
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer(), leaving *off as the possibly unaligned
+ *		byte offset into 'tupptr' at which to start deforming the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
@@ -100,6 +237,9 @@ static const uint8 pg_rightmost_zero_pos[256] = {
  * first_null_attr
  *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
  *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
  */
 static inline int
 first_null_attr(const bits8 *bits, int natts)
@@ -133,12 +273,7 @@ first_null_attr(const bits8 *bits, int natts)
 	res = bytenum << 3;
 
 #ifdef HAVE__BUILTIN_CTZ
-	/*
-	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
-	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
-	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
-	 */
-	res += __builtin_ctz(~(uint32) bits[bytenum]);
+	res += __builtin_ctz(~bits[bytenum]);
 #else
 	res += pg_rightmost_zero_pos[bits[bytenum]];
 #endif
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 363c5f33697..180fccc999f 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -116,7 +116,9 @@ typedef struct TupleTableSlot
 #define FIELDNO_TUPLETABLESLOT_VALUES 5
 	Datum	   *tts_values;		/* current per-attribute values */
 #define FIELDNO_TUPLETABLESLOT_ISNULL 6
-	bool	   *tts_isnull;		/* current per-attribute isnull flags */
+	bool	   *tts_isnull;		/* current per-attribute isnull flags.  Array
+								 * size must always be rounded up to the next
+								 * 8 elements. */
 	MemoryContext tts_mcxt;		/* slot itself is in this context */
 	ItemPointerData tts_tid;	/* stored tuple's tid */
 	Oid			tts_tableOid;	/* table oid of tuple */
-- 
2.51.0
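
The multiply trick in populate_isnull_array() is easier to check on a single byte. The following is a standalone sketch, not the patch's code: spread_null_byte() is a hypothetical name, and it writes the eight output bytes by shifting rather than with memcpy, so the sketch itself does not depend on host byte order. The multiplier 0x204081 has bits 0, 7, 14 and 21 set, so multiplying a 4-bit value by it lands input bit k on output bit 8 * k with no carries.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPREAD_BITS_MULTIPLIER_32 0x204081U

/*
 * Hypothetical helper: expand one NULL bitmap byte (0 bit = NULL) into
 * eight bools, where isnull[i] is true when bit i of the byte is 0.
 */
static void
spread_null_byte(uint8_t bitmap_byte, bool *isnull)
{
	uint8_t		nullbyte = (uint8_t) ~bitmap_byte;
	uint64_t	spread;

	/* lower 4 bits go to the lowest bit of bytes 0..3 */
	spread = (uint64_t) ((nullbyte & 0x0F) * SPREAD_BITS_MULTIPLIER_32);

	/* upper 4 bits go to the lowest bit of bytes 4..7 */
	spread |= (uint64_t) ((nullbyte >> 4) * SPREAD_BITS_MULTIPLIER_32) << 32;

	/* keep only the lowest bit of each byte */
	spread &= UINT64_C(0x0101010101010101);

	/* store byte i of the result as isnull[i] */
	for (int i = 0; i < 8; i++)
		isnull[i] = (bool) ((spread >> (8 * i)) & 1);
}
```

For a bitmap byte of 0xFB only bit 2 is clear, so only isnull[2] comes out true.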


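The fixed-length branch of align_fetch_then_add() boils down to "round the offset up, remember where the datum starts, step past it". A minimal sketch of just that offset arithmetic follows, assuming a power-of-2 alignment; align_up() and align_then_add() are hypothetical names, and no datum is actually fetched here.

```c
#include <stdint.h>

/* Round 'off' up to the next multiple of 'alignby' (a power of 2). */
static uint32_t
align_up(uint32_t off, uint32_t alignby)
{
	return (off + (alignby - 1)) & ~(alignby - 1);
}

/*
 * Hypothetical helper: align *off for a fixed-length attribute, return the
 * offset at which the attribute starts, and advance *off beyond it.
 */
static uint32_t
align_then_add(uint32_t *off, uint32_t attlen, uint32_t alignby)
{
	uint32_t	start = align_up(*off, alignby);

	*off = start + attlen;
	return start;
}
```

Walking a 1-byte, a 4-byte, a 2-byte and an 8-byte attribute from offset 0 gives start offsets 0, 4, 8 and 16, and leaves the offset at 24.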

Attachments:

  [text/plain] v8-0001-Add-empty-TupleDescFinalize-function.patch (29.0K, 2-v8-0001-Add-empty-TupleDescFinalize-function.patch)
  download | inline diff:
From df730d93ed3f7f9cd719c69c213ab60cac28c7fb Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v8 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



  [text/plain] v8-0002-Precalculate-CompactAttribute-s-attcacheoff.patch (50.1K, 3-v8-0002-Precalculate-CompactAttribute-s-attcacheoff.patch)
From 6e464b41393dbb7fdeff853285e9d80051b2c12b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v8 2/4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines, which
shrinks the code a little and can make it run more quickly.
This also adds a dedicated deformer loop for the portion of the
tuple which has a known offset, which makes deforming much faster when
the leading columns of the table are non-NULL and of fixed-width
types.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  84 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 541 insertions(+), 642 deletions(-)
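
The idea in the commit message above can be sketched as a one-time pass over the descriptor that records a byte offset for each leading fixed-width column. ToyAttr, toy_align() and toy_precalc_offsets() are hypothetical names used only for illustration. Ending the run at the first variable-width column is an assumption of this sketch, and the real TupleDescFinalize() may apply different rules.

```c
#include <assert.h>

/* Hypothetical stand-in for CompactAttribute; not the real struct. */
typedef struct ToyAttr
{
	int			attlen;			/* > 0 if fixed-width, -1 if variable-width */
	int			attalignby;		/* required alignment in bytes, power of two */
	int			attcacheoff;	/* offset in the data area, -1 if unknown */
} ToyAttr;

/* Round off up to the next multiple of alignby (a power of two). */
static int
toy_align(int off, int alignby)
{
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Store an offset for each leading fixed-width attribute and return the
 * index of the first attribute left without a cached offset.  Attributes
 * from that index onwards get attcacheoff = -1.
 */
static int
toy_precalc_offsets(ToyAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		if (attrs[i].attlen <= 0)
			break;				/* offsets past here depend on the data */

		off = toy_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	for (int j = i; j < natts; j++)
		attrs[j].attcacheoff = -1;

	return i;
}
```

For columns shaped like (int4, int2, int8, text, int4), this stores offsets 0, 4 and 8 for the first three columns and returns 3. A deform loop can then read those three columns at fixed offsets when none of them is NULL, and must walk the tuple data only from the text column onwards.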

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
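
The deforming functions in heaptuple.c above call first_null_attr(), whose definition is not shown in this part of the diff. As a rough standalone sketch of what such a helper might do (sketch_first_null_attr is a made-up name, and the behaviour is an assumption inferred from how the callers use the result, not the patch's actual implementation): it returns the index of the first NULL among the first natts attributes, or natts when there is none. It follows the same bitmap convention as att_isnull(), where a set bit means "not NULL".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the patch's code: return the index of the first
 * NULL attribute among the first natts bits of the bitmap, or natts if none
 * of them is NULL.  A set bit means "not NULL".
 */
static int
sketch_first_null_attr(const uint8_t *bitmap, int natts)
{
	int			nbytes = natts >> 3;
	int			i;

	assert(natts >= 0);

	/* Skip whole bytes in which every attribute is non-NULL. */
	for (i = 0; i < nbytes; i++)
	{
		if (bitmap[i] != 0xFF)
			break;
	}

	/* Scan bit by bit from the first byte that may contain a NULL. */
	for (i <<= 3; i < natts; i++)
	{
		if (!(bitmap[i >> 3] & (1 << (i & 0x07))))
			return i;
	}

	return natts;
}
```

With a result like this, a caller can treat every attribute before the returned index as non-NULL and only start checking the bitmap from that index onwards, which is how the loops above are split.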
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there are none, set the first NULL to
+	 * attnum so that we can forgo NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attribute arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
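
To illustrate the precomputation that TupleDescFinalize() performs above, here is a minimal standalone sketch. SketchAttr, sketch_align and sketch_finalize are hypothetical stand-ins for CompactAttribute, att_nominal_alignby() and TupleDescFinalize(), and the sketch assumes every alignment is a power of two. It walks the leading fixed-width attributes, stores each aligned offset, and stops at the first attribute whose attlen is not positive.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for CompactAttribute. */
typedef struct SketchAttr
{
	int			attlen;			/* > 0 when fixed width, <= 0 otherwise */
	int			attalignby;		/* alignment in bytes, a power of two */
	int			attcacheoff;	/* precomputed offset, or -1 if unknown */
} SketchAttr;

/* Round off up to the next multiple of alignby. */
static int
sketch_align(int off, int alignby)
{
	assert(alignby > 0 && (alignby & (alignby - 1)) == 0);
	return (off + alignby - 1) & ~(alignby - 1);
}

/*
 * Set attcacheoff for each leading fixed-width attribute and return the
 * index of the first attribute that has no cached offset.
 */
static int
sketch_finalize(SketchAttr *attrs, int natts)
{
	int			off = 0;
	int			i;

	for (i = 0; i < natts; i++)
	{
		if (attrs[i].attlen <= 0)
			break;

		off = sketch_align(off, attrs[i].attalignby);
		attrs[i].attcacheoff = off;
		off += attrs[i].attlen;
	}

	return i;
}
```

For example, with a 4-byte attribute aligned to 4, an 8-byte attribute aligned to 8, and then a variable-width attribute, the cached offsets come out as 0 and 8 and the function returns 2. Anything at or beyond the returned index has to be located by walking the tuple.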
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff.
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loop only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining attributes, this time including NULL checks as
+	 * we're now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..ce7a88df611 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,89 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	/*
+	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
+	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
+	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
+	 */
+	res += __builtin_ctz(~(uint32) bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0
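
Editorial sketch (not part of the patch): the bitmap scan that first_null_attr() performs can be tried outside the tree with something like the following. The function name and the use of uint8_t in place of bits8 are assumptions made for illustration. Unlike the patch, this version uses a plain bit loop rather than __builtin_ctz or a lookup table, and it never reads past the last byte that covers natts bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical standalone variant of the scan: return the 0-based index of
 * the first 0 bit (a NULL attribute) in 'bits', or natts when none of the
 * first natts bits is 0.  uint8_t stands in for PostgreSQL's bits8.
 */
static int
first_null_attr_sketch(const uint8_t *bits, int natts)
{
	int			nbytes = (natts + 7) >> 3;	/* bytes covering natts bits */
	int			bytenum = 0;
	int			res;

	assert(natts >= 0);

	/* Skip whole bytes in which every attribute is non-NULL (all 1 bits) */
	while (bytenum < nbytes && bits[bytenum] == 0xFF)
		bytenum++;

	/* Ran off the end of the bitmap: no NULLs at all */
	if (bytenum == nbytes)
		return natts;

	/* Count the 1 bits below the lowest 0 bit of the byte we stopped on */
	res = bytenum << 3;
	for (uint8_t b = bits[bytenum]; b & 1; b >>= 1)
		res++;

	/* Bits at or beyond natts are padding, so cap the result */
	return res < natts ? res : natts;
}
```

With the two bitmap bytes 0xFF and 0xFB and natts = 16, the first 0 bit is bit 2 of the second byte, so the function returns 10.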



  [text/plain] v8-0003-Introduce-deform_bench-test-module.patch (7.3K, 4-v8-0003-Introduce-deform_bench-test-module.patch)
From 98e27bc74fbc3f3da9048430ea9aa2ffa156aa78 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v8 3/4] Introduce deform_bench test module

For benchmarking tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) * 1000.0 / CLOCKS_PER_SEC);
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0
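
Editorial sketch (not part of the patch): deform_bench() reports elapsed time by converting a clock() tick difference to milliseconds. A minimal way to do that conversion entirely in floating point, so that it does not rely on CLOCKS_PER_SEC being a multiple of 1000, might look like this. The helper name is hypothetical. Note that clock() measures processor time used by the process rather than wall-clock time.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: convert a clock() tick interval to milliseconds
 * using floating-point arithmetic throughout.
 */
static double
ticks_to_ms(clock_t start, clock_t end)
{
	assert(end >= start);

	return (double) (end - start) * 1000.0 / (double) CLOCKS_PER_SEC;
}
```

A caller would sample clock() before and after the scan loop and pass both values to the helper.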



  [text/plain] v8-0004-Various-experimental-changes.patch (11.4K, 5-v8-0004-Various-experimental-changes.patch)
From 8160dd33825abfcae2a14e78ed85a4c8e94e0274 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v8 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  63 ++++++------
 src/include/access/tupmacs.h        | 149 ++++++++++++++++++++++++++--
 src/include/executor/tuptable.h     |   4 +-
 4 files changed, 183 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..5e20a05a830 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,23 +1029,34 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup +
+			MAXALIGN(offsetof(HeapTupleHeaderData, t_bits) +
+					 BITMAPLEN(HeapTupleHeaderGetNatts(tup)));
+		Assert(tp == (char *) tup + tup->t_hoff);
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
+		populate_isnull_array(bp, natts, isnull);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
 	}
 	else
 	{
+		uint64	   *isnull64 = (uint64 *) isnull;
+		Size		asize = (natts + 7) >> 3;
+
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
-	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
+		/* No nulls, set all isnull elements to false */
+		for (int i = 0; i < asize; i++)
+			isnull64[i] = 0;
+	}
 
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
@@ -1065,7 +1076,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1074,7 +1084,9 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 			offcheck += cattr->attlen;
 #endif
 
-			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			values[attnum] = fetch_att_noerr(tp + cattr->attcacheoff,
+											 cattr->attbyval,
+											 cattr->attlen);
 		} while (++attnum < firstNonCacheOffsetAttr);
 
 		/*
@@ -1101,19 +1113,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1129,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1452,7 +1453,7 @@ ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
 	slot->tts_values = (Datum *)
 		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(Datum));
 	slot->tts_isnull = (bool *)
-		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(bool));
+		MemoryContextAlloc(slot->tts_mcxt, MAXALIGN(tupdesc->natts * sizeof(bool)));
 }
 
 /* --------------------------------
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index ce7a88df611..d53784899fb 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,7 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
-
+#include "varatt.h"
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +29,49 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null bitmap into a boolean array.
+ *
+ * Caller must ensure that the isnull array is sized so it contains
+ * at least as many elements as there are bits in the 'bits' array.
+ * This is required because we always round 'natts' up to the next multiple
+ * of 8.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			nbytes = (natts + 7) >> 3;
+
+	/*
+	 * Multiplying a NULL bitmap byte by this value results in the lowest bit
+	 * in each byte being set the same as each bit of the bitmap.  We perform
+	 * this as 2 32-bit operations rather than a single 64-bit operation as
+	 * multiplying by the required value to do this in 64-bits would result in
+	 * overflowing a uint64 in some cases.
+	 */
+#define SPREAD_BITS_MULTIPLIER_32 0x204081U
+
+	for (int i = 0; i < nbytes; i++, isnull += 8)
+	{
+		uint64		isnull_8;
+		bits8		nullbyte = ~bits[i];
+
+		/* convert the lower 4 bits of null bitmap word into 32 bit int */
+		isnull_8 = (nullbyte & 0xf) * SPREAD_BITS_MULTIPLIER_32;
+
+		/*
+		 * convert the upper 4 bits of null bitmap word into 32 bit int, shift
+		 * into the upper 32 bit
+		 */
+		isnull_8 |= ((uint64) ((nullbyte >> 4) * SPREAD_BITS_MULTIPLIER_32)) << 32;
+
+		/* mask out all other bits apart from the lowest bit of each byte */
+		isnull_8 &= UINT64CONST(0x0101010101010101);
+		memcpy(isnull, &isnull_8, sizeof(uint64));
+	}
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +114,100 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * Same, but no error checking for invalid attlens for byval types.  This
+ * is safe to use when attlen comes from CompactAttribute as we validate the
+ * length when populating that struct.
+ */
+static inline Datum
+fetch_att_noerr(const void *T, bool attbyval, int attlen)
+{
+	if (attbyval)
+	{
+		switch (attlen)
+		{
+			case sizeof(char):
+				return CharGetDatum(*((const char *) T));
+			case sizeof(int16):
+				return Int16GetDatum(*((const int16 *) T));
+			case sizeof(int32):
+				return Int32GetDatum(*((const int32 *) T));
+			default:
+				Assert(attlen == sizeof(int64));
+				return Int64GetDatum(*((const int64 *) T));
+		}
+	}
+	else
+		return PointerGetDatum(T);
+}
+
+
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer(), leaving *off set to the possibly
+ *		unaligned offset into 'tupptr' from which to deform the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
@@ -100,6 +237,9 @@ static const uint8 pg_rightmost_zero_pos[256] = {
  * first_null_attr
  *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
  *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
  */
 static inline int
 first_null_attr(const bits8 *bits, int natts)
@@ -133,12 +273,7 @@ first_null_attr(const bits8 *bits, int natts)
 	res = bytenum << 3;
 
 #ifdef HAVE__BUILTIN_CTZ
-	/*
-	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
-	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
-	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
-	 */
-	res += __builtin_ctz(~(uint32) bits[bytenum]);
+	res += __builtin_ctz(~bits[bytenum]);
 #else
 	res += pg_rightmost_zero_pos[bits[bytenum]];
 #endif
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 363c5f33697..180fccc999f 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -116,7 +116,9 @@ typedef struct TupleTableSlot
 #define FIELDNO_TUPLETABLESLOT_VALUES 5
 	Datum	   *tts_values;		/* current per-attribute values */
 #define FIELDNO_TUPLETABLESLOT_ISNULL 6
-	bool	   *tts_isnull;		/* current per-attribute isnull flags */
+	bool	   *tts_isnull;		/* current per-attribute isnull flags.  Array
+								 * size must always be rounded up to the next
+								 * 8 elements. */
 	MemoryContext tts_mcxt;		/* slot itself is in this context */
 	ItemPointerData tts_tid;	/* stored tuple's tid */
 	Oid			tts_tableOid;	/* table oid of tuple */
-- 
2.51.0
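
As a side note on the first_null_attr() hunk above, here is a minimal sketch of why dropping the explicit (uint32) cast should still be safe. In C, a uint8 operand is promoted to int before the bit-wise NOT is applied, so 0xff becomes 0xffffff00 rather than 0, and __builtin_ctz() is never handed a zero argument. The helper name rightmost_zero_pos is hypothetical and not part of the patch; the sketch assumes GCC or Clang and a 32-bit int.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return the 0-based position
 * of the right-most 0 bit in 'byte', or 8 if all eight bits are set.
 */
static inline int
rightmost_zero_pos(uint8_t byte)
{
	/*
	 * 'byte' is promoted to int before '~' is applied, so the upper 24 bits
	 * of the result are always set.  The argument can therefore never be 0,
	 * which is the one input for which __builtin_ctz() is undefined.
	 */
	return __builtin_ctz((unsigned int) ~byte);
}
```

With this helper, rightmost_zero_pos(0xff) gives 8 and rightmost_zero_pos(0x07) gives 3, which matches what the removed comment described for the explicitly cast version.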

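The fixed-width branch of align_fetch_then_add() follows an "align, fetch, then advance" pattern. The following is a simplified standalone sketch of that pattern for a 4-byte integer only. The names align_up and fetch_int32_then_add are hypothetical, align_up is assumed to behave like TYPEALIGN for power-of-two alignments, and memcpy is used in place of the patch's direct pointer dereference so the sketch makes no assumption about how the buffer itself is aligned.

```c
#include <stdint.h>
#include <string.h>

/* Round 'off' up to the next multiple of 'alignby' (must be a power of 2). */
static inline uint32_t
align_up(uint32_t off, uint32_t alignby)
{
	return (off + (alignby - 1)) & ~(alignby - 1);
}

/*
 * Hypothetical simplified analogue of the attlen > 0, attbyval case:
 * align '*off', read a 4-byte integer at that offset, then advance '*off'
 * past the value so it is ready for the next attribute.
 */
static inline int32_t
fetch_int32_then_add(const char *tupptr, uint32_t *off)
{
	int32_t		val;

	*off = align_up(*off, (uint32_t) sizeof(int32_t));
	memcpy(&val, tupptr + *off, sizeof(val));
	*off += (uint32_t) sizeof(val);
	return val;
}
```

Starting from an unaligned offset of 1, the call first moves the offset to 4, reads the value stored there, and leaves the offset at 8 for the next attribute.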


  [image/gif] m2_on_v8.gif (105.4K, 6-m2_on_v8.gif)

  [image/gif] amd3990x_clang_with_v8.gif (78.6K, 7-amd3990x_clang_with_v8.gif)

  [image/gif] amd7945hx_clang_with_v8.gif (102.4K, 8-amd7945hx_clang_with_v8.gif)



end of thread, other threads:[~2026-01-31 11:27 UTC | newest]

Thread overview: 19+ messages
2026-01-18 22:13 Re: More speedups for tuple deformation David Rowley <[email protected]>
2026-01-19 05:47 ` Chao Li <[email protected]>
2026-01-20 00:11   ` David Rowley <[email protected]>
2026-01-20 04:32     ` Chao Li <[email protected]>
2026-01-20 06:05       ` Chao Li <[email protected]>
2026-01-20 18:38     ` Andres Freund <[email protected]>
2026-01-21 05:00       ` David Rowley <[email protected]>
2026-01-23 01:18         ` Andres Freund <[email protected]>
2026-01-23 05:29           ` Chao Li <[email protected]>
2026-01-23 16:33           ` Andres Freund <[email protected]>
2026-01-27 13:34             ` David Rowley <[email protected]>
2026-01-28 16:26               ` Andres Freund <[email protected]>
2026-01-30 11:10                 ` David Rowley <[email protected]>
2026-01-30 17:11                   ` Andres Freund <[email protected]>
2026-01-30 20:03                     ` Andres Freund <[email protected]>
2026-01-31 11:27                     ` David Rowley <[email protected]>
2026-01-31 02:47               ` John Naylor <[email protected]>
2026-01-31 03:44                 ` David Rowley <[email protected]>
2026-01-31 06:01                   ` John Naylor <[email protected]>
