public inbox for [email protected]  
help / color / mirror / Atom feed
From: David Rowley <[email protected]>
To: Andres Freund <[email protected]>
Cc: Chao Li <[email protected]>
Cc: PostgreSQL Developers <[email protected]>
Subject: Re: More speedups for tuple deformation
Date: Sun, 1 Feb 2026 00:27:02 +1300
Message-ID: <CAApHDvpuEbhvH1ViCZRz5vks+_bGbEnPoEdZYAZXK76_isb_+Q@mail.gmail.com> (raw)
In-Reply-To: <e7sto7tk5dk5hfyvoocaddnxcngemcmfvbuh23l32w5cssaizy@znuphjqug7qe>
References: <CAApHDvoh3Q413szd-zsUTCpQPWNdpUYvx-fvsB8DP8zOja+ckg@mail.gmail.com>
	<[email protected]>
	<CAApHDvqhbJU_-yF3Hbf4VhX33qXtpeYv3MsvMPDMcDwGGLr9ZQ@mail.gmail.com>
	<rbskhk7scqbxqnaw4o6nh6na2ffcclg3cxn4d4cn5jfr2z7vv3@kadtz65meesb>
	<CAApHDvpDxDFatUskuOfuM7A3VESrx8U7MtYnU_HiB0QLAg94zg@mail.gmail.com>
	<pmik622adey6fnddivkt4uvkulvnc6rasmq3tcbrzeglx4hsn7@f3x6e2eph3w5>
	<rvlc7pb6zn4kydqovcqh72lf2qfcgs3qkj2seq7tcpvxyqwtqt@nrvv6lpehwwa>
	<CAApHDvo1i-ycAcWnK3L7ZASTuM8mW46kvRqMaUHD46HSuJmx7A@mail.gmail.com>
	<p2oo6g2fg5nxbsspbyeauxhbyyi3zsx3tt4prkgju3j5qjo7lf@rjlhxusxzam2>
	<CAApHDvpbntG7V3_EsZ+w-V=jU-y8rFmv9RB1EDJm4sxKno-4UA@mail.gmail.com>
	<e7sto7tk5dk5hfyvoocaddnxcngemcmfvbuh23l32w5cssaizy@znuphjqug7qe>

On Sat, 31 Jan 2026 at 06:11, Andres Freund <[email protected]> wrote:
> This is why I like the idea of keeping track of whether we can rely on NOT
> NULL columns to be present (I think that means we're evaluating expressions
> other than constraint checks for new rows). It allows the leading NOT NULL
> fixed-width columns to be decoded without having to wait for a good chunk of
> the computations above. That's a performance boon even if we later have
> nullable or varlength columns.

I can look into this. As we both know, we can't apply this
optimisation in every case as there are places in the code which form
then deform tuples before NOT NULL constraints are checked. Perhaps
the slot can store a flag to mention if the optimisation is valid to
apply or not. It doesn't look like the flag can be part of the
TupleDesc since we cache those in relcache. I'm imagining that
TupleDescFinalize() calculates another field which could be the max
cached offset that's got a NOT NULL constraint and isn't attmissing. I
think this will need another dedicated loop in
slot_deform_heap_tuple() to loop up to that attribute before doing the
firstNonCacheOffsetAttr loop. Maybe it's worth making that new loop
only handle byval types so that we can get rid of the byval branch
from fetch_att(). That should make it a bit faster at the expense of
not being able to handle fixed-width types that have a attlen > 8,
which I don't think are excessively common...  It's hard to know.
UUIDs do get used as primary key columns, and making the PK column the
first column in the table is a common pattern to follow.

Overall, I agree that this will likely speed up some of the tests, but
it will also likely further increase the overheads for the 0 extra
column tests.

> > > 3) I briefly experimented with this code, and I think we may be able to
> > >    optimize the combination of att_pointer_alignby(), fetch_att() and
> > >    att_addlength_pointer(). They all do quite related work, and for byvalue
> > >    types, we know at compile time what the alignment requirement for each of
> > >    the supported attlen is.
> >
> > Is this true? Isn't there some nearby discussion about AIX having
> > 4-byte double alignment?
>
> The AIX stuff is just bonkers. Having different alignment based on where in an
> aggregate a double is is just insane.
>
> We could probably just accomodate that with a compile-time ifdef.

I've not tried this yet. There's probably more than
align_fetch_then_add() that could benefit from the knowledge that
attalign is the same as attlen for byval types. align_fetch_then_add()
is the only one I'm currently aware of, and that only saves doing
attalign - 1, which only saves 1 cycle.

> > I've taken a go at implementing a function called align_fetch_then_add(),
> > which rolls all the macros into one (See 0004). I just can't see any
> > improvements with it. Maybe I've missed something that could be more
> > optimal. I did even ditch one of the cases from the switch(attlen). It might
> > be ok to do that now as we can check for invalid attlens for byval types
> > when we populate the CompactAttribute.
>
> You could move the TYPEALIGN(attalignby, *off) calls into the attlen switch
> and call it with a constant argument.

I think that would need the assumption you mentioned above that the
byval type's attlen defines the alignment requirements.

> > > Have you experimented setting isnull[] in a dedicated loop if there are nulls
> > > and then in this loop just checking isnull[attnum]? Seems like that could
> > > perhaps be combined with the work in first_null_attr() and be more efficient
> > > than doing an att_isnull() separately for each column.
> >
> > Yes. I experiment with that quite a bit. I wasn't able to make it any
> > faster than setting the isnull element in the same loop as the
> > tts_values element. What I did try was having a dedicated tight loop
> > like; for (int i = attnum; i < firstNullAttr; i++) isnull[i] = false;,
> > but the compiler would always try to optimise that into an inlined
> > memset which would result in poorly performing code in cases with a
> > small number of columns due to the size and alignment prechecks.
>
> Yea, that kind of transformation is pretty annoying and makes little sense
> here :(.
>
> I was thinking of actually computing the value of isnull[] based on the null
> bitmap (as you also try below).

I've taken the code you posted in [1] to do this. Thanks for that. It
works very well. I made it so the tts_isnull array size is rounded up
to the next multiple of 8. Without that we'll overwrite the sentinel
byte in the palloc'd chunk if we assume we can always write to
tts_isnull in multiples of 8 elements.  Because I've made tts_isnull
round up to the next 8 elements, I can make the !hasnulls loop zero
the memory 8 bytes at a time. That solves the annoying memset inlining
I was getting. That seems to mostly fix the regression in the zero
extra column test. The t_hoff calculation replacement maybe helped a
bit too. I didn't test individually.

> > I spent a bit of time trying to figure out a way to do this without the
> > lookup table and only came up with a method that requires AVX512
> > instructions. I coded that up, but it requires building with
> > -march=x86-64-v4, which will likely cause many other reasons for the
> > performance to vary.
>
> Yea, I doubt we want that... Too many tuples will be too short to benefit.

It was only doing 8 columns at once, not 64. I think it's irrelevant
now with your 0x204081 trick.

I've revised the 0004 patch to use the 0x204081 trick. I also added
fetch_att_noerr() gets rid of the elog and assumes the default switch
case must be attlen == 8. I didn't modify the existing function as
that will be used for attlen values that don't come from
CompactAttribute. Also got rid of the t_hoff usage. With those and the
new code that zeros the tts_isnull array 8 bytes at a time, things are
looking pretty good.

I've attached 3 graphs, which are now looking a bit better. The gcc
results are not quite as good. There's still a small regression with 0
extra column test, and overall, the results are not as impressive as
clang's. I've not yet studied why. The Apple M2 machine averaged over
53% faster than master over all tests, or ~63% if you exclude the 0
extra column tests.

I'll look into adding another optional optimisation for the NOT NULL
columns. Attaching the updated patch in the meantime. This includes
the fix for the __builtin_ctz() issue mentioned by John.

David

From df730d93ed3f7f9cd719c69c213ab60cac28c7fb Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v8 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0


From 6e464b41393dbb7fdeff853285e9d80051b2c12b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v8 2/4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines which
shrinks down the code a little, which can make it run more quickly.
This also makes a dedicated deformer loop to deform the portion of the
tuple which has a known offset, which makes deforming much faster when
a leading set of the table's columns are non-NULL values and fixed-width
types.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  84 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 541 insertions(+), 642 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() on once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attributes arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loops only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining tuples, this time include NULL checks as we're
+	 * now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..ce7a88df611 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,89 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	/*
+	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
+	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
+	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
+	 */
+	res += __builtin_ctz(~(uint32) bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0


From 98e27bc74fbc3f3da9048430ea9aa2ffa156aa78 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v8 3/4] Introduce deform_bench test module

For benchmaring tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0


From 8160dd33825abfcae2a14e78ed85a4c8e94e0274 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v8 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  63 ++++++------
 src/include/access/tupmacs.h        | 149 ++++++++++++++++++++++++++--
 src/include/executor/tuptable.h     |   4 +-
 4 files changed, 183 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..5e20a05a830 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,23 +1029,34 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup +
+			MAXALIGN(offsetof(HeapTupleHeaderData, t_bits) +
+					 BITMAPLEN(HeapTupleHeaderGetNatts(tup)));
+		Assert(tp == (char *) tup + tup->t_hoff);
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
+		populate_isnull_array(bp, natts, isnull);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
 	}
 	else
 	{
+		uint64	   *isnull64 = (uint64 *) isnull;
+		Size		asize = (natts + 7) >> 3;
+
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
-	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
+		/* No nulls, set all isnull elements to false */
+		for (int i = 0; i < asize; i++)
+			isnull64[i] = 0;
+	}
 
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
@@ -1065,7 +1076,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1074,7 +1084,9 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 			offcheck += cattr->attlen;
 #endif
 
-			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			values[attnum] = fetch_att_noerr(tp + cattr->attcacheoff,
+											 cattr->attbyval,
+											 cattr->attlen);
 		} while (++attnum < firstNonCacheOffsetAttr);
 
 		/*
@@ -1101,19 +1113,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1129,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1452,7 +1453,7 @@ ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
 	slot->tts_values = (Datum *)
 		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(Datum));
 	slot->tts_isnull = (bool *)
-		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(bool));
+		MemoryContextAlloc(slot->tts_mcxt, MAXALIGN(tupdesc->natts * sizeof(bool)));
 }
 
 /* --------------------------------
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index ce7a88df611..d53784899fb 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,7 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
-
+#include "varatt.h"
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +29,49 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null bitmap into a boolean array.
+ *
+ * Caller must ensure that the isnull array is sized so it contains
+ * at least as many elements as there are bits in the 'bits' array.
+ * This is required because we always round 'natts' up to the next multiple
+ * of 8.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			nbytes = (natts + 7) >> 3;
+
+	/*
+	 * Multiplying a NULL bitmap byte by this value results in the lowest bit
+	 * in each byte being set the same as each bit of the bitmap.  We perform
+	 * this as 2 32-bit operations rather than a single 64-bit operation as
+	 * multiplying by the required value to do this in 64-bits would result in
+	 * overflowing a uint64 in some cases.
+	 */
+#define SPREAD_BITS_MULTIPLIER_32 0x204081U
+
+	for (int i = 0; i < nbytes; i++, isnull += 8)
+	{
+		uint64		isnull_8;
+		bits8		nullbyte = ~bits[i];
+
+		/* convert the lower 4 bits of null bitmap word into 32 bit int */
+		isnull_8 = (nullbyte & 0xf) * SPREAD_BITS_MULTIPLIER_32;
+
+		/*
+		 * convert the upper 4 bits of null bitmap word into 32 bit int, shift
+		 * into the upper 32 bit
+		 */
+		isnull_8 |= ((uint64) ((nullbyte >> 4) * SPREAD_BITS_MULTIPLIER_32)) << 32;
+
+		/* mask out all other bits apart from the lowest bit of each byte */
+		isnull_8 &= UINT64CONST(0x0101010101010101);
+		memcpy(isnull, &isnull_8, sizeof(uint64));
+	}
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +114,100 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * Same, but no error checking for invalid attlens for byval types.  This
+ * is safe to use when attlen comes from CompactAttribute as we validate the
+ * length when populating that struct.
+ */
+static inline Datum
+fetch_att_noerr(const void *T, bool attbyval, int attlen)
+{
+	if (attbyval)
+	{
+		switch (attlen)
+		{
+			case sizeof(char):
+				return CharGetDatum(*((const char *) T));
+			case sizeof(int16):
+				return Int16GetDatum(*((const int16 *) T));
+			case sizeof(int32):
+				return Int32GetDatum(*((const int32 *) T));
+			default:
+				Assert(attlen == sizeof(int64));
+				return Int64GetDatum(*((const int64 *) T));
+		}
+	}
+	else
+		return PointerGetDatum(T);
+}
+
+
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer() resulting in *off pointer to the perhaps
+ *		unaligned number of bytes into 'tupptr', ready to deform the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
@@ -100,6 +237,9 @@ static const uint8 pg_rightmost_zero_pos[256] = {
  * first_null_attr
  *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
  *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
  */
 static inline int
 first_null_attr(const bits8 *bits, int natts)
@@ -133,12 +273,7 @@ first_null_attr(const bits8 *bits, int natts)
 	res = bytenum << 3;
 
 #ifdef HAVE__BUILTIN_CTZ
-	/*
-	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
-	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
-	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
-	 */
-	res += __builtin_ctz(~(uint32) bits[bytenum]);
+	res += __builtin_ctz(~bits[bytenum]);
 #else
 	res += pg_rightmost_zero_pos[bits[bytenum]];
 #endif
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 363c5f33697..180fccc999f 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -116,7 +116,9 @@ typedef struct TupleTableSlot
 #define FIELDNO_TUPLETABLESLOT_VALUES 5
 	Datum	   *tts_values;		/* current per-attribute values */
 #define FIELDNO_TUPLETABLESLOT_ISNULL 6
-	bool	   *tts_isnull;		/* current per-attribute isnull flags */
+	bool	   *tts_isnull;		/* current per-attribute isnull flags.  Array
+								 * size must always be rounded up to the next
+								 * 8 elements. */
 	MemoryContext tts_mcxt;		/* slot itself is in this context */
 	ItemPointerData tts_tid;	/* stored tuple's tid */
 	Oid			tts_tableOid;	/* table oid of tuple */
-- 
2.51.0



Attachments:

  [text/plain] v8-0001-Add-empty-TupleDescFinalize-function.patch (29.0K, 2-v8-0001-Add-empty-TupleDescFinalize-function.patch)
  download | inline diff:
From df730d93ed3f7f9cd719c69c213ab60cac28c7fb Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Wed, 21 Jan 2026 15:41:37 +1300
Subject: [PATCH v8 1/4] Add empty TupleDescFinalize() function

Currently does nothing, but will in a future commit.
---
 contrib/dblink/dblink.c                             |  4 ++++
 contrib/pg_buffercache/pg_buffercache_pages.c       |  2 ++
 contrib/pg_visibility/pg_visibility.c               |  2 ++
 src/backend/access/brin/brin_tuple.c                |  1 +
 src/backend/access/common/tupdesc.c                 | 13 +++++++++++++
 src/backend/access/gin/ginutil.c                    |  1 +
 src/backend/access/gist/gistscan.c                  |  1 +
 src/backend/access/spgist/spgutils.c                |  1 +
 src/backend/access/transam/twophase.c               |  1 +
 src/backend/access/transam/xlogfuncs.c              |  1 +
 src/backend/backup/basebackup_copy.c                |  3 +++
 src/backend/catalog/index.c                         |  2 ++
 src/backend/catalog/pg_publication.c                |  1 +
 src/backend/catalog/toasting.c                      |  6 ++++++
 src/backend/commands/explain.c                      |  1 +
 src/backend/commands/functioncmds.c                 |  1 +
 src/backend/commands/sequence.c                     |  1 +
 src/backend/commands/tablecmds.c                    |  4 ++++
 src/backend/commands/wait.c                         |  1 +
 src/backend/executor/execSRF.c                      |  2 ++
 src/backend/executor/execTuples.c                   |  4 ++++
 src/backend/executor/nodeFunctionscan.c             |  2 ++
 src/backend/parser/parse_relation.c                 |  4 +++-
 src/backend/parser/parse_target.c                   |  2 ++
 .../replication/libpqwalreceiver/libpqwalreceiver.c |  1 +
 src/backend/replication/walsender.c                 |  5 +++++
 src/backend/utils/adt/acl.c                         |  1 +
 src/backend/utils/adt/genfile.c                     |  1 +
 src/backend/utils/adt/lockfuncs.c                   |  1 +
 src/backend/utils/adt/orderedsetaggs.c              |  1 +
 src/backend/utils/adt/pgstatfuncs.c                 |  5 +++++
 src/backend/utils/adt/tsvector_op.c                 |  1 +
 src/backend/utils/cache/relcache.c                  |  8 ++++++++
 src/backend/utils/fmgr/funcapi.c                    |  6 ++++++
 src/backend/utils/misc/guc_funcs.c                  |  5 +++++
 src/include/access/tupdesc.h                        |  1 +
 src/pl/plpgsql/src/pl_comp.c                        |  2 ++
 .../test_custom_stats/test_custom_fixed_stats.c     |  1 +
 src/test/modules/test_predtest/test_predtest.c      |  1 +
 39 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 8cb3166495c..1ce4502fec2 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -881,6 +881,7 @@ materializeResult(FunctionCallInfo fcinfo, PGconn *conn, PGresult *res)
 		tupdesc = CreateTemplateTupleDesc(1);
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 						   TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 		ntuples = 1;
 		nfields = 1;
 	}
@@ -1044,6 +1045,7 @@ materializeQueryResult(FunctionCallInfo fcinfo,
 			tupdesc = CreateTemplateTupleDesc(1);
 			TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 							   TEXTOID, -1, 0);
+			TupleDescFinalize(tupdesc);
 			attinmeta = TupleDescGetAttInMetadata(tupdesc);
 
 			oldcontext = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);
@@ -1529,6 +1531,8 @@ dblink_get_pkey(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 2, "colname",
 						   TEXTOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 89b86855243..a6b4fb5252b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -174,6 +174,7 @@ pg_buffercache_pages(PG_FUNCTION_ARGS)
 			TupleDescInitEntry(tupledesc, (AttrNumber) 9, "pinning_backends",
 							   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 
 		/* Allocate NBuffers worth of BufferCachePagesRec records. */
@@ -442,6 +443,7 @@ pg_buffercache_os_pages_internal(FunctionCallInfo fcinfo, bool include_numa)
 		TupleDescInitEntry(tupledesc, (AttrNumber) 3, "numa_node",
 						   INT4OID, -1, 0);
 
+		TupleDescFinalize(tupledesc);
 		fctx->tupdesc = BlessTupleDesc(tupledesc);
 		fctx->include_numa = include_numa;
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 9bc3a784bf7..dfab0b64cf5 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -469,6 +469,8 @@ pg_visibility_tupdesc(bool include_blkno, bool include_pd)
 		TupleDescInitEntry(tupdesc, ++a, "pd_all_visible", BOOLOID, -1, 0);
 	Assert(a == maxattr);
 
+	TupleDescFinalize(tupdesc);
+
 	return BlessTupleDesc(tupdesc);
 }
 
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
index 706387e36d6..7f150df9ee7 100644
--- a/src/backend/access/brin/brin_tuple.c
+++ b/src/backend/access/brin/brin_tuple.c
@@ -84,6 +84,7 @@ brtuple_disk_tupdesc(BrinDesc *brdesc)
 
 		MemoryContextSwitchTo(oldcxt);
 
+		TupleDescFinalize(tupdesc);
 		brdesc->bd_disktdesc = tupdesc;
 	}
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 94b4f1f9975..e98de806a77 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -238,6 +238,9 @@ CreateTupleDesc(int natts, Form_pg_attribute *attrs)
 		memcpy(TupleDescAttr(desc, i), attrs[i], ATTRIBUTE_FIXED_PART_SIZE);
 		populate_compact_attribute(desc, i);
 	}
+
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -282,6 +285,8 @@ CreateTupleDescCopy(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -328,6 +333,8 @@ CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -413,6 +420,8 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 	desc->tdtypeid = tupdesc->tdtypeid;
 	desc->tdtypmod = tupdesc->tdtypmod;
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -455,6 +464,8 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
 	 * source's refcount would be wrong in any case.)
 	 */
 	dst->tdrefcount = -1;
+
+	TupleDescFinalize(dst);
 }
 
 /*
@@ -1082,6 +1093,8 @@ BuildDescFromLists(const List *names, const List *types, const List *typmods, co
 		TupleDescInitEntryCollation(desc, attnum, attcollation);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..a533d79e26e 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -129,6 +129,7 @@ initGinState(GinState *state, Relation index)
 							   attr->attndims);
 			TupleDescInitEntryCollation(state->tupdesc[i], (AttrNumber) 2,
 										attr->attcollation);
+			TupleDescFinalize(state->tupdesc[i]);
 		}
 
 		/*
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index f23bc4a6757..c65f93abdae 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -201,6 +201,7 @@ gistrescan(IndexScanDesc scan, ScanKey key, int nkeys,
 											 attno - 1)->atttypid,
 							   -1, 0);
 		}
+		TupleDescFinalize(so->giststate->fetchTupdesc);
 		scan->xs_hitupdesc = so->giststate->fetchTupdesc;
 
 		/* Also create a memory context that will hold the returned tuples */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 9f5379b87ac..b246e8127db 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -340,6 +340,7 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
+		TupleDescFinalize(outTupDesc);
 	}
 	return outTupDesc;
 }
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 601ce3faa64..6d5792c7929 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -744,6 +744,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 5, "dbid",
 						   OIDOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..b6bc616c74c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -400,6 +400,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "file_offset",
 					   INT4OID, -1, 0);
 
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	/*
diff --git a/src/backend/backup/basebackup_copy.c b/src/backend/backup/basebackup_copy.c
index fecfad9ab7b..29dbd0cb32f 100644
--- a/src/backend/backup/basebackup_copy.c
+++ b/src/backend/backup/basebackup_copy.c
@@ -357,6 +357,8 @@ SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli)
 	 */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "tli", INT8OID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
+
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
 
@@ -388,6 +390,7 @@ SendTablespaceList(List *tablespaces)
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "spcoid", OIDOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "spclocation", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "size", INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* send RowDescription */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 43de42ce39e..75e97fb394a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -481,6 +481,8 @@ ConstructTupleDescriptor(Relation heapRelation,
 		populate_compact_attribute(indexTupDesc, i);
 	}
 
+	TupleDescFinalize(indexTupDesc);
+
 	return indexTupDesc;
 }
 
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 9a4791c573e..fa353a0dd37 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1230,6 +1230,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "qual",
 						   PG_NODE_TREEOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 		funcctx->user_fctx = table_infos;
 
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
index c78dcea98c1..078a1cf5127 100644
--- a/src/backend/catalog/toasting.c
+++ b/src/backend/catalog/toasting.c
@@ -229,6 +229,12 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
 	TupleDescAttr(tupdesc, 1)->attcompression = InvalidCompressionMethod;
 	TupleDescAttr(tupdesc, 2)->attcompression = InvalidCompressionMethod;
 
+	populate_compact_attribute(tupdesc, 0);
+	populate_compact_attribute(tupdesc, 1);
+	populate_compact_attribute(tupdesc, 2);
+
+	TupleDescFinalize(tupdesc);
+
 	/*
 	 * Toast tables for regular relations go in pg_toast; those for temp
 	 * relations go into the per-backend temp-toast-table namespace.
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b7bb111688c..7abd9ed272f 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -281,6 +281,7 @@ ExplainResultDesc(ExplainStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "QUERY PLAN",
 					   result_type, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index a516b037dea..6a8f162b640 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -2423,6 +2423,7 @@ CallStmtResultDesc(CallStmt *stmt)
 							   -1,
 							   0);
 		}
+		TupleDescFinalize(tupdesc);
 	}
 
 	return tupdesc;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index e1b808bbb60..551667650ba 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1808,6 +1808,7 @@ pg_get_sequence_data(PG_FUNCTION_ARGS)
 					   BOOLOID, -1, 0);
 	TupleDescInitEntry(resultTupleDesc, (AttrNumber) 3, "page_lsn",
 					   LSNOID, -1, 0);
+	TupleDescFinalize(resultTupleDesc);
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
 	seqrel = try_relation_open(relid, AccessShareLock);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f976c0e5c7e..ae7c9db9fa9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -1029,6 +1029,8 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
 		}
 	}
 
+	TupleDescFinalize(descriptor);
+
 	/*
 	 * For relations with table AM and partitioned tables, select access
 	 * method to use: an explicitly indicated one, or (in the case of a
@@ -1448,6 +1450,8 @@ BuildDescForRelation(const List *columns)
 		populate_compact_attribute(desc, attnum - 1);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
diff --git a/src/backend/commands/wait.c b/src/backend/commands/wait.c
index 1290df10c6f..8e920a72372 100644
--- a/src/backend/commands/wait.c
+++ b/src/backend/commands/wait.c
@@ -338,5 +338,6 @@ WaitStmtResultDesc(WaitStmt *stmt)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "status",
 					   TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index a0b111dc0e4..b481e50acfb 100644
--- a/src/backend/executor/execSRF.c
+++ b/src/backend/executor/execSRF.c
@@ -272,6 +272,7 @@ ExecMakeTableFunctionResult(SetExprState *setexpr,
 									   funcrettype,
 									   -1,
 									   0);
+					TupleDescFinalize(tupdesc);
 					rsinfo.setDesc = tupdesc;
 				}
 				MemoryContextSwitchTo(oldcontext);
@@ -776,6 +777,7 @@ init_sexpr(Oid foid, Oid input_collation, Expr *node,
 							   funcrettype,
 							   -1,
 							   0);
+			TupleDescFinalize(tupdesc);
 			sexpr->funcResultDesc = tupdesc;
 			sexpr->funcReturnsTuple = false;
 		}
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index b768eae9e53..e6ab51e6404 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -2173,6 +2173,8 @@ ExecTypeFromTLInternal(List *targetList, bool skipjunk)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
@@ -2207,6 +2209,8 @@ ExecTypeFromExprList(List *exprList)
 		cur_resno++;
 	}
 
+	TupleDescFinalize(typeInfo);
+
 	return typeInfo;
 }
 
diff --git a/src/backend/executor/nodeFunctionscan.c b/src/backend/executor/nodeFunctionscan.c
index 63e605e1f81..feb82d64967 100644
--- a/src/backend/executor/nodeFunctionscan.c
+++ b/src/backend/executor/nodeFunctionscan.c
@@ -414,6 +414,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 				TupleDescInitEntryCollation(tupdesc,
 											(AttrNumber) 1,
 											exprCollation(funcexpr));
+				TupleDescFinalize(tupdesc);
 			}
 			else
 			{
@@ -485,6 +486,7 @@ ExecInitFunctionScan(FunctionScan *node, EState *estate, int eflags)
 							   0);
 		}
 
+		TupleDescFinalize(scan_tupdesc);
 		Assert(attno == natts);
 	}
 
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 3ec8d8de011..0ad767d827b 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1891,6 +1891,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 			TupleDescInitEntryCollation(tupdesc,
 										(AttrNumber) 1,
 										exprCollation(funcexpr));
+			TupleDescFinalize(tupdesc);
 		}
 		else if (functypclass == TYPEFUNC_RECORD)
 		{
@@ -1948,6 +1949,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 
 				i++;
 			}
+			TupleDescFinalize(tupdesc);
 
 			/*
 			 * Ensure that the coldeflist defines a legal set of names (no
@@ -2016,7 +2018,7 @@ addRangeTableEntryForFunction(ParseState *pstate,
 							   0);
 			/* no need to set collation */
 		}
-
+		TupleDescFinalize(tupdesc);
 		Assert(natts == totalatts);
 	}
 	else
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index b5a2f915b67..5fd17f3d8d0 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -1570,6 +1570,8 @@ expandRecordVariable(ParseState *pstate, Var *var, int levelsup)
 		}
 		Assert(lname == NULL && lvar == NULL);	/* lists same length? */
 
+		TupleDescFinalize(tupleDesc);
+
 		return tupleDesc;
 	}
 
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 7c8639b32e9..9f04c9ed25d 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1073,6 +1073,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	for (coln = 0; coln < nRetTypes; coln++)
 		TupleDescInitEntry(walres->tupledesc, (AttrNumber) coln + 1,
 						   PQfname(pgres, coln), retTypes[coln], -1, 0);
+	TupleDescFinalize(walres->tupledesc);
 	attinmeta = TupleDescGetAttInMetadata(walres->tupledesc);
 
 	/* No point in doing more here if there were no tuples returned. */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index a0e6a3d200c..9e2f4a664b4 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -451,6 +451,7 @@ IdentifySystem(void)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "dbname",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -496,6 +497,7 @@ ReadReplicationSlot(ReadReplicationSlotCmd *cmd)
 	/* TimeLineID is unsigned, so int4 is not wide enough. */
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "restart_tli",
 							  INT8OID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	memset(nulls, true, READ_REPLICATION_SLOT_COLS * sizeof(bool));
 
@@ -598,6 +600,7 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd)
 	tupdesc = CreateTemplateTupleDesc(2);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, "filename", TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "content", TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	TLHistoryFileName(histfname, cmd->timeline);
 	TLHistoryFilePath(path, cmd->timeline);
@@ -1015,6 +1018,7 @@ StartReplication(StartReplicationCmd *cmd)
 								  INT8OID, -1, 0);
 		TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 2, "next_tli_startpos",
 								  TEXTOID, -1, 0);
+		TupleDescFinalize(tupdesc);
 
 		/* prepare for projection of tuple */
 		tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -1369,6 +1373,7 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 4, "output_plugin",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index 3a6905f9546..9d37053c81e 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1818,6 +1818,7 @@ aclexplode(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 4, "is_grantable",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/* allocate memory for user context */
diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index c083608b1d5..bfb949401d0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -454,6 +454,7 @@ pg_stat_file(PG_FUNCTION_ARGS)
 					   "creation", TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6,
 					   "isdir", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	memset(isnull, false, sizeof(isnull));
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index 9dadd6da672..4481c354fd6 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -146,6 +146,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 16, "waitstart",
 						   TIMESTAMPTZOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = BlessTupleDesc(tupdesc);
 
 		/*
diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index 3b6da8e36ac..fd8b8676470 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -233,6 +233,7 @@ ordered_set_startup(FunctionCallInfo fcinfo, bool use_tuples)
 								   -1,
 								   0);
 
+				TupleDescFinalize(newdesc);
 				FreeTupleDesc(qstate->tupdesc);
 				qstate->tupdesc = newdesc;
 			}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 73ca0bb0b7f..08ad27e57c2 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -769,6 +769,7 @@ pg_stat_get_backend_subxact(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "subxact_overflow",
 					   BOOLOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if ((local_beentry = pgstat_get_local_beentry_by_proc_number(procNumber)) != NULL)
@@ -1658,6 +1659,7 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters,
 	TupleDescInitEntry(tupdesc, (AttrNumber) 6, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Fill values and NULLs */
@@ -2085,6 +2087,7 @@ pg_stat_get_archiver(PG_FUNCTION_ARGS)
 	TupleDescInitEntry(tupdesc, (AttrNumber) 7, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
 
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	/* Get statistics about the archiver process */
@@ -2166,6 +2169,7 @@ pg_stat_get_replication_slot(PG_FUNCTION_ARGS)
 					   TIMESTAMPTZOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	namestrcpy(&slotname, text_to_cstring(slotname_text));
@@ -2253,6 +2257,7 @@ pg_stat_get_subscription_stats(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 13, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	if (!subentry)
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 94e0fed8309..7ca19a97882 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -651,6 +651,7 @@ tsvector_unnest(PG_FUNCTION_ARGS)
 						   TEXTARRAYOID, -1, 0);
 		if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 			elog(ERROR, "return type must be a row type");
+		TupleDescFinalize(tupdesc);
 		funcctx->tuple_desc = tupdesc;
 
 		funcctx->user_fctx = PG_GETARG_TSVECTOR_COPY(0);
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 6b634c9fff1..770edb34e08 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -729,6 +729,8 @@ RelationBuildTupleDesc(Relation relation)
 		pfree(constr);
 		relation->rd_att->constr = NULL;
 	}
+
+	TupleDescFinalize(relation->rd_att);
 }
 
 /*
@@ -1985,6 +1987,7 @@ formrdesc(const char *relationName, Oid relationReltype,
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
+	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
 	if (has_not_null)
@@ -3688,6 +3691,8 @@ RelationBuildLocalRelation(const char *relname,
 	for (i = 0; i < natts; i++)
 		TupleDescAttr(rel->rd_att, i)->attrelid = relid;
 
+	TupleDescFinalize(rel->rd_att);
+
 	rel->rd_rel->reltablespace = reltablespace;
 
 	if (mapped_relation)
@@ -4443,6 +4448,7 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 
 	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
 	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
+	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
 
@@ -6268,6 +6274,8 @@ load_relcache_init_file(bool shared)
 			populate_compact_attribute(rel->rd_att, i);
 		}
 
+		TupleDescFinalize(rel->rd_att);
+
 		/* next read the access method specific field */
 		if (fread(&len, 1, sizeof(len), fp) != sizeof(len))
 			goto read_failed;
diff --git a/src/backend/utils/fmgr/funcapi.c b/src/backend/utils/fmgr/funcapi.c
index 8a934ea8dca..516d02cfb82 100644
--- a/src/backend/utils/fmgr/funcapi.c
+++ b/src/backend/utils/fmgr/funcapi.c
@@ -340,6 +340,8 @@ get_expr_result_type(Node *expr,
 										exprCollation(col));
 			i++;
 		}
+		TupleDescFinalize(tupdesc);
+
 		if (resultTypeId)
 			*resultTypeId = rexpr->row_typeid;
 		if (resultTupleDesc)
@@ -1044,6 +1046,7 @@ resolve_polymorphic_tupdesc(TupleDesc tupdesc, oidvector *declared_args,
 		}
 	}
 
+	TupleDescFinalize(tupdesc);
 	return true;
 }
 
@@ -1853,6 +1856,8 @@ build_function_result_tupdesc_d(char prokind,
 						   0);
 	}
 
+	TupleDescFinalize(desc);
+
 	return desc;
 }
 
@@ -1970,6 +1975,7 @@ TypeGetTupleDesc(Oid typeoid, List *colaliases)
 						   typeoid,
 						   -1,
 						   0);
+		TupleDescFinalize(tupdesc);
 	}
 	else if (functypclass == TYPEFUNC_RECORD)
 	{
diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c
index 4f3e40bf470..b82f807e05e 100644
--- a/src/backend/utils/misc/guc_funcs.c
+++ b/src/backend/utils/misc/guc_funcs.c
@@ -444,6 +444,7 @@ GetPGVariableResultDesc(const char *name)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 1, varname,
 						   TEXTOID, -1, 0);
 	}
+	TupleDescFinalize(tupdesc);
 	return tupdesc;
 }
 
@@ -465,6 +466,7 @@ ShowGUCConfigOption(const char *name, DestReceiver *dest)
 	tupdesc = CreateTemplateTupleDesc(1);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 1, varname,
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -499,6 +501,7 @@ ShowAllGUCConfig(DestReceiver *dest)
 							  TEXTOID, -1, 0);
 	TupleDescInitBuiltinEntry(tupdesc, (AttrNumber) 3, "description",
 							  TEXTOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 
 	/* prepare for projection of tuples */
 	tstate = begin_tup_output_tupdesc(dest, tupdesc, &TTSOpsVirtual);
@@ -934,6 +937,8 @@ show_all_settings(PG_FUNCTION_ARGS)
 		TupleDescInitEntry(tupdesc, (AttrNumber) 17, "pending_restart",
 						   BOOLOID, -1, 0);
 
+		TupleDescFinalize(tupdesc);
+
 		/*
 		 * Generate attribute metadata needed later to produce tuples from raw
 		 * C strings
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index d46cdbf7a3c..595413dbbc5 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -195,6 +195,7 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
+#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 7d648c941c0..b2112c29fcf 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -1912,6 +1912,8 @@ build_row_from_vars(PLpgSQL_variable **vars, int numvars)
 		TupleDescInitEntryCollation(row->rowtupdesc, i + 1, typcoll);
 	}
 
+	TupleDescFinalize(row->rowtupdesc);
+
 	return row;
 }
 
diff --git a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
index 908bd18a7c7..fa1719bf3b5 100644
--- a/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_fixed_stats.c
@@ -205,6 +205,7 @@ test_custom_stats_fixed_report(PG_FUNCTION_ARGS)
 					   INT8OID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
 					   TIMESTAMPTZOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	BlessTupleDesc(tupdesc);
 
 	values[0] = Int64GetDatum(stats->numcalls);
diff --git a/src/test/modules/test_predtest/test_predtest.c b/src/test/modules/test_predtest/test_predtest.c
index 679a5de456d..48ca2a4ea70 100644
--- a/src/test/modules/test_predtest/test_predtest.c
+++ b/src/test/modules/test_predtest/test_predtest.c
@@ -230,6 +230,7 @@ test_predtest(PG_FUNCTION_ARGS)
 					   "s_r_holds", BOOLOID, -1, 0);
 	TupleDescInitEntry(tupdesc, (AttrNumber) 8,
 					   "w_r_holds", BOOLOID, -1, 0);
+	TupleDescFinalize(tupdesc);
 	tupdesc = BlessTupleDesc(tupdesc);
 
 	values[0] = BoolGetDatum(strong_implied_by);
-- 
2.51.0



  [text/plain] v8-0002-Precalculate-CompactAttribute-s-attcacheoff.patch (50.1K, 3-v8-0002-Precalculate-CompactAttribute-s-attcacheoff.patch)
  download | inline diff:
From 6e464b41393dbb7fdeff853285e9d80051b2c12b Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 31 Dec 2024 09:19:24 +1300
Subject: [PATCH v8 2/4] Precalculate CompactAttribute's attcacheoff

This allows code to be removed from the tuple deform routines which
shrinks down the code a little, which can make it run more quickly.
This also makes a dedicated deformer loop to deform the portion of the
tuple which has a known offset, which makes deforming much faster when
a leading set of the table's columns are non-NULL values and fixed-width
types.
---
 src/backend/access/common/heaptuple.c  | 334 +++++++++-------------
 src/backend/access/common/indextuple.c | 367 ++++++++++---------------
 src/backend/access/common/tupdesc.c    |  37 +++
 src/backend/access/spgist/spgutils.c   |   3 -
 src/backend/executor/execTuples.c      | 282 ++++++++-----------
 src/backend/jit/llvm/llvmjit_deform.c  |   6 -
 src/backend/utils/cache/relcache.c     |  12 -
 src/include/access/htup_details.h      |  19 +-
 src/include/access/itup.h              |  20 +-
 src/include/access/tupdesc.h           |  10 +-
 src/include/access/tupmacs.h           |  84 ++++++
 src/include/executor/tuptable.h        |   9 +-
 12 files changed, 541 insertions(+), 642 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 11bec20e82e..42cce3dcdfe 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -497,20 +497,8 @@ heap_attisnull(HeapTuple tup, int attnum, TupleDesc tupleDesc)
 /* ----------------
  *		nocachegetattr
  *
- *		This only gets called from fastgetattr(), in cases where we
- *		can't use a cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
+ *		This only gets called from fastgetattr(), in cases where the
+ *		attcacheoff is not set.
  *
  *		NOTE: if you need to change this code, see also heap_deform_tuple.
  *		Also see nocache_index_getattr, which is the same code for index
@@ -522,194 +510,104 @@ nocachegetattr(HeapTuple tup,
 			   int attnum,
 			   TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	HeapTupleHeader td = tup->t_data;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = td->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = HeapTupleHasNulls(tup);
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * then we can use a slightly cheaper method of offset calculation, as we
+	 * just need to add the attlen to the aligned offset when skipping over
+	 * columns.  When the tuple contains variable-width types, we must use
+	 * att_addlength_pointer(), which does a bit more branching and is
+	 * slightly less efficient.
+	 */
 	attnum--;
 
-	if (!HeapTupleNoNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if any preceding bits are null...
-		 */
-		int			byte = attnum >> 3;
-		int			finalbit = attnum & 0x07;
-
-		/* check for nulls "before" final bit of last byte */
-		if ((~bp[byte]) & ((1 << finalbit) - 1))
-			slow = true;
-		else
-		{
-			/* check for nulls in any "earlier" bytes */
-			int			i;
+	if (hasnulls)
+		firstnullattr = first_null_attr(bp, attnum);
+	else
+		firstnullattr = attnum;
 
-			for (i = 0; i < byte; i++)
-			{
-				if (bp[i] != 0xFF)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+	if (tupleDesc->firstNonCachedOffAttr > 0)
+	{
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
+	}
+	else
+	{
+		startAttr = 0;
+		off = 0;
 	}
 
 	tp = (char *) td + td->t_hoff;
 
-	if (!slow)
+	if (hasnulls)
 	{
-		CompactAttribute *att;
+		for (int i = startAttr; i < attnum; i++)
+		{
+			CompactAttribute *att;
 
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
+			if (att_isnull(i, bp))
+				continue;
 
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (HeapTupleHasVarWidth(tup))
-		{
-			int			j;
+			att = TupleDescCompactAttr(tupleDesc, i);
 
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, att->attlen, tp + off);
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
-
-	if (!slow)
+	else if (!HeapTupleHasVarWidth(tup))
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
-
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
-
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
-
-		for (; j < natts; j++)
+		for (int i = startAttr; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
-
-			if (att->attlen <= 0)
-				break;
+			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
 			off = att_nominal_alignby(off, att->attalignby);
-
-			att->attcacheoff = off;
-
 			off += att->attlen;
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_nominal_alignby(off, cattr->attalignby);
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
-
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		for (int i = startAttr; i < attnum; i++)
 		{
 			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
 
-			if (HeapTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
-
-				if (usecache)
-					att->attcacheoff = off;
-			}
-
-			if (i == attnum)
-				break;
-
+			off = att_pointer_alignby(off,
+									  att->attalignby,
+									  att->attlen,
+									  tp + off);
 			off = att_addlength_pointer(off, att->attlen, tp + off);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
 		}
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /* ----------------
@@ -1347,6 +1245,7 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 				  Datum *values, bool *isnull)
 {
 	HeapTupleHeader tup = tuple->t_data;
+	CompactAttribute *cattr;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
 	int			tdesc_natts = tupleDesc->natts;
 	int			natts;			/* number of atts to extract */
@@ -1354,70 +1253,91 @@ heap_deform_tuple(HeapTuple tuple, TupleDesc tupleDesc,
 	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
 	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	natts = HeapTupleHeaderGetNatts(tup);
 
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
+
 	/*
 	 * In inheritance situations, it is possible that the given tuple actually
 	 * has more fields than the caller is expecting.  Don't run off the end of
 	 * the caller's arrays.
 	 */
 	natts = Min(natts, tdesc_natts);
+	cacheoffattrs = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
 
 	tp = (char *) tup + tup->t_hoff;
+	attnum = 0;
 
-	off = 0;
+	if (cacheoffattrs > 0)
+	{
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
+		do
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
 
-	for (attnum = 0; attnum < natts; attnum++)
+			values[attnum] = fetch_att(tp + cattr->attcacheoff,
+									   cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+	else
+		off = 0;
+
+	for (; attnum < firstnullattr; attnum++)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index d7c8c53fd8d..084e0937a60 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -223,18 +223,6 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
  *
  *		This gets called from index_getattr() macro, and only in cases
  *		where we can't use cacheoffset and the value is not null.
- *
- *		This caches attribute offsets in the attribute descriptor.
- *
- *		An alternative way to speed things up would be to cache offsets
- *		with the tuple, but that seems more difficult unless you take
- *		the storage hit of actually putting those offsets into the
- *		tuple you send to disk.  Yuck.
- *
- *		This scheme will be slightly slower than that, but should
- *		perform well for queries which hit large #'s of tuples.  After
- *		you cache the offsets once, examining all the other tuples using
- *		the same attribute descriptor will go much quicker. -cim 5/4/91
  * ----------------
  */
 Datum
@@ -242,205 +230,129 @@ nocache_index_getattr(IndexTuple tup,
 					  int attnum,
 					  TupleDesc tupleDesc)
 {
+	CompactAttribute *cattr;
 	char	   *tp;				/* ptr to data part of tuple */
 	bits8	   *bp = NULL;		/* ptr to null bitmap in tuple */
-	bool		slow = false;	/* do we have to walk attrs? */
 	int			data_off;		/* tuple data offset */
 	int			off;			/* current offset within data */
+	int			startAttr;
+	int			firstnullattr;
+	bool		hasnulls = IndexTupleHasNulls(tup);
+	int			i;
 
-	/* ----------------
-	 *	 Three cases:
-	 *
-	 *	 1: No nulls and no variable-width attributes.
-	 *	 2: Has a null or a var-width AFTER att.
-	 *	 3: Has nulls or var-widths BEFORE att.
-	 * ----------------
-	 */
-
-	data_off = IndexInfoFindDataOffset(tup->t_info);
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	attnum--;
 
-	if (IndexTupleHasNulls(tup))
-	{
-		/*
-		 * there's a null somewhere in the tuple
-		 *
-		 * check to see if desired att is null
-		 */
+	/*
+	 * If there are no NULLs before the required attnum, then we can start at
+	 * the highest attribute with a known offset, or the first attribute if
+	 * none have a cached offset.  If the tuple has no variable width types,
+	 * which is common with indexes, then we can use a slightly cheaper method
+	 * of offset calculation, as we just need to add the attlen to the aligned
+	 * offset when skipping over columns.  When the tuple contains
+	 * variable-width types, we must use att_addlength_pointer(), which does a
+	 * bit more branching and is slightly less efficient.
+	 */
+	data_off = IndexInfoFindDataOffset(tup->t_info);
+	tp = (char *) tup + data_off;
 
-		/* XXX "knows" t_bits are just after fixed tuple header! */
+	/*
+	 * Find the first NULL column, or if there's none set the first NULL to
+	 * attnum so that we can forego NULL checking all the way to attnum.
+	 */
+	if (hasnulls)
+	{
 		bp = (bits8 *) ((char *) tup + sizeof(IndexTupleData));
-
-		/*
-		 * Now check to see if any preceding bits are null...
-		 */
-		{
-			int			byte = attnum >> 3;
-			int			finalbit = attnum & 0x07;
-
-			/* check for nulls "before" final bit of last byte */
-			if ((~bp[byte]) & ((1 << finalbit) - 1))
-				slow = true;
-			else
-			{
-				/* check for nulls in any "earlier" bytes */
-				int			i;
-
-				for (i = 0; i < byte; i++)
-				{
-					if (bp[i] != 0xFF)
-					{
-						slow = true;
-						break;
-					}
-				}
-			}
-		}
+		firstnullattr = first_null_attr(bp, attnum);
 	}
+	else
+		firstnullattr = attnum;
 
-	tp = (char *) tup + data_off;
-
-	if (!slow)
+	if (tupleDesc->firstNonCachedOffAttr > 0)
 	{
-		CompactAttribute *att;
-
-		/*
-		 * If we get here, there are no nulls up to and including the target
-		 * attribute.  If we have a cached offset, we can use it.
-		 */
-		att = TupleDescCompactAttr(tupleDesc, attnum);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, tp + att->attcacheoff);
-
-		/*
-		 * Otherwise, check for non-fixed-length attrs up to and including
-		 * target.  If there aren't any, it's safe to cheaply initialize the
-		 * cached offsets for these attrs.
-		 */
-		if (IndexTupleHasVarwidths(tup))
-		{
-			int			j;
-
-			for (j = 0; j <= attnum; j++)
-			{
-				if (TupleDescCompactAttr(tupleDesc, j)->attlen <= 0)
-				{
-					slow = true;
-					break;
-				}
-			}
-		}
+		startAttr = Min(tupleDesc->firstNonCachedOffAttr - 1, firstnullattr);
+		off = TupleDescCompactAttr(tupleDesc, startAttr)->attcacheoff;
 	}
-
-	if (!slow)
+	else
 	{
-		int			natts = tupleDesc->natts;
-		int			j = 1;
-
-		/*
-		 * If we get here, we have a tuple with no nulls or var-widths up to
-		 * and including the target attribute, so we can use the cached offset
-		 * ... only we don't have it yet, or we'd not have got here.  Since
-		 * it's cheap to compute offsets for fixed-width columns, we take the
-		 * opportunity to initialize the cached offsets for *all* the leading
-		 * fixed-width columns, in hope of avoiding future visits to this
-		 * routine.
-		 */
-		TupleDescCompactAttr(tupleDesc, 0)->attcacheoff = 0;
+		startAttr = 0;
+		off = 0;
+	}
 
-		/* we might have set some offsets in the slow path previously */
-		while (j < natts && TupleDescCompactAttr(tupleDesc, j)->attcacheoff > 0)
-			j++;
+	/* Handle tuples with var-width attributes */
+	if (IndexTupleHasVarwidths(tup))
+	{
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
+		{
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-		off = TupleDescCompactAttr(tupleDesc, j - 1)->attcacheoff +
-			TupleDescCompactAttr(tupleDesc, j - 1)->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		}
 
-		for (; j < natts; j++)
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, j);
+			Assert(hasnulls);
 
-			if (att->attlen <= 0)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			att->attcacheoff = off;
-
-			off += att->attlen;
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off = att_addlength_pointer(off, cattr->attlen, tp + off);
 		}
-
-		Assert(j > attnum);
-
-		off = TupleDescCompactAttr(tupleDesc, attnum)->attcacheoff;
 	}
 	else
 	{
-		bool		usecache = true;
-		int			i;
+		/* Handle tuples with only fixed-width attributes */
 
-		/*
-		 * Now we know that we have to walk the tuple CAREFULLY.  But we still
-		 * might be able to cache some offsets for next time.
-		 *
-		 * Note - This loop is a little tricky.  For each non-null attribute,
-		 * we have to first account for alignment padding before the attr,
-		 * then advance over the attr based on its length.  Nulls have no
-		 * storage and no alignment padding either.  We can use/set
-		 * attcacheoff until we reach either a null or a var-width attribute.
-		 */
-		off = 0;
-		for (i = 0;; i++)		/* loop exit is at "break" */
+		/* Calculate the offset up until the first NULL */
+		for (i = startAttr; i < firstnullattr; i++)
 		{
-			CompactAttribute *att = TupleDescCompactAttr(tupleDesc, i);
-
-			if (IndexTupleHasNulls(tup) && att_isnull(i, bp))
-			{
-				usecache = false;
-				continue;		/* this cannot be the target att */
-			}
-
-			/* If we know the next offset, we can skip the rest */
-			if (usecache && att->attcacheoff >= 0)
-				off = att->attcacheoff;
-			else if (att->attlen == -1)
-			{
-				/*
-				 * We can only cache the offset for a varlena attribute if the
-				 * offset is already suitably aligned, so that there would be
-				 * no pad bytes in any case: then the offset will be valid for
-				 * either an aligned or unaligned value.
-				 */
-				if (usecache &&
-					off == att_nominal_alignby(off, att->attalignby))
-					att->attcacheoff = off;
-				else
-				{
-					off = att_pointer_alignby(off, att->attalignby, -1,
-											  tp + off);
-					usecache = false;
-				}
-			}
-			else
-			{
-				/* not varlena, so safe to use att_nominal_alignby */
-				off = att_nominal_alignby(off, att->attalignby);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
+
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
+		}
 
-				if (usecache)
-					att->attcacheoff = off;
-			}
+		/* Calculate the offset for any remaining columns. */
+		for (; i < attnum; i++)
+		{
+			Assert(hasnulls);
 
-			if (i == attnum)
-				break;
+			if (att_isnull(i, bp))
+				continue;
 
-			off = att_addlength_pointer(off, att->attlen, tp + off);
+			cattr = TupleDescCompactAttr(tupleDesc, i);
 
-			if (usecache && att->attlen <= 0)
-				usecache = false;
+			Assert(cattr->attlen > 0);
+			off = att_pointer_alignby(off,
+									  cattr->attalignby,
+									  cattr->attlen,
+									  tp + off);
+			off += cattr->attlen;
 		}
 	}
 
-	return fetchatt(TupleDescCompactAttr(tupleDesc, attnum), tp + off);
+	cattr = TupleDescCompactAttr(tupleDesc, attnum);
+	off = att_pointer_alignby(off, cattr->attalignby,
+							  cattr->attlen, tp + off);
+	return fetchatt(cattr, tp + off);
 }
 
 /*
@@ -480,63 +392,86 @@ index_deform_tuple_internal(TupleDesc tupleDescriptor,
 							Datum *values, bool *isnull,
 							char *tp, bits8 *bp, int hasnulls)
 {
+	CompactAttribute *cattr;
 	int			natts = tupleDescriptor->natts; /* number of atts to extract */
-	int			attnum;
+	int			attnum = 0;
 	int			off = 0;		/* offset in tuple data */
-	bool		slow = false;	/* can we use/set attcacheoff? */
+	int			cacheoffattrs;
+	int			firstnullattr;
 
 	/* Assert to protect callers who allocate fixed-size arrays */
 	Assert(natts <= INDEX_MAX_KEYS);
 
-	for (attnum = 0; attnum < natts; attnum++)
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDescriptor->firstNonCachedOffAttr >= 0);
+
+	cacheoffattrs = Min(tupleDescriptor->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		firstnullattr = first_null_attr(bp, natts);
+		cacheoffattrs = Min(cacheoffattrs, firstnullattr);
+	}
+	else
+		firstnullattr = natts;
+
+	if (cacheoffattrs > 0)
 	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDescriptor, attnum);
+#ifdef USE_ASSERT_CHECKING
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		off = 0;
+#endif
 
-		if (hasnulls && att_isnull(attnum, bp))
+		do
 		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			slow = true;		/* can't use attcacheoff anymore */
-			continue;
-		}
+			cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			off = att_nominal_alignby(off, cattr->attalignby);
+			Assert(off == cattr->attcacheoff);
+			off += cattr->attlen;
+#endif
+
+			values[attnum] = fetch_att(tp + cattr->attcacheoff, cattr->attbyval,
+									   cattr->attlen);
+			isnull[attnum] = false;
+		} while (++attnum < cacheoffattrs);
+
+		off = cattr->attcacheoff + cattr->attlen;
+	}
+
+	for (; attnum < firstnullattr; attnum++)
+	{
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
 		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (!slow && thisatt->attcacheoff >= 0)
-			off = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow &&
-				off == att_nominal_alignby(off, thisatt->attalignby))
-				thisatt->attcacheoff = off;
-			else
-			{
-				off = att_pointer_alignby(off, thisatt->attalignby, -1,
-										  tp + off);
-				slow = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			off = att_nominal_alignby(off, thisatt->attalignby);
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+	}
+
+	for (; attnum < natts; attnum++)
+	{
+		Assert(hasnulls);
 
-			if (!slow)
-				thisatt->attcacheoff = off;
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
 		}
 
-		values[attnum] = fetchatt(thisatt, tp + off);
+		cattr = TupleDescCompactAttr(tupleDescriptor, attnum);
+		off = att_pointer_alignby(off, cattr->attalignby, cattr->attlen,
+								  tp + off);
 
-		off = att_addlength_pointer(off, thisatt->attlen, tp + off);
+		isnull[attnum] = false;
+		values[attnum] = fetchatt(cattr, tp + off);
 
-		if (thisatt->attlen <= 0)
-			slow = true;		/* can't use attcacheoff anymore */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 }
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index e98de806a77..25364db630a 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -214,6 +214,9 @@ CreateTemplateTupleDesc(int natts)
 	desc->tdtypmod = -1;
 	desc->tdrefcount = -1;		/* assume not reference-counted */
 
+	/* This will be set to the correct value by TupleDescFinalize() */
+	desc->firstNonCachedOffAttr = -1;
+
 	return desc;
 }
 
@@ -474,6 +477,9 @@ TupleDescCopy(TupleDesc dst, TupleDesc src)
  *		descriptor to another.
  *
  * !!! Constraints and defaults are not copied !!!
+ *
+ * The caller must take care of calling TupleDescFinalize() on once all
+ * TupleDesc changes have been made.
  */
 void
 TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
@@ -506,6 +512,37 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	populate_compact_attribute(dst, dstAttno - 1);
 }
 
+/*
+ * TupleDescFinalize
+ *		Finalize the given TupleDesc.  This must be called after the
+ *		attributes arrays have been populated or adjusted by any code.
+ *
+ * Must be called after populate_compact_attribute() and before
+ * BlessTupleDesc().
+ */
+void
+TupleDescFinalize(TupleDesc tupdesc)
+{
+	int			firstNonCachedOffAttr = 0;
+	int			offp = 0;
+
+	for (int i = 0; i < tupdesc->natts; i++)
+	{
+		CompactAttribute *cattr = TupleDescCompactAttr(tupdesc, i);
+
+		if (cattr->attlen <= 0)
+			break;
+
+		offp = att_nominal_alignby(offp, cattr->attalignby);
+		cattr->attcacheoff = offp;
+
+		offp += cattr->attlen;
+		firstNonCachedOffAttr = i + 1;
+	}
+
+	tupdesc->firstNonCachedOffAttr = firstNonCachedOffAttr;
+}
+
 /*
  * Free a TupleDesc including all substructure
  */
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index b246e8127db..a4694bd8065 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -335,9 +335,6 @@ getSpGistTupleDesc(Relation index, SpGistTypeDesc *keyType)
 		/* We shouldn't need to bother with making these valid: */
 		att->attcompression = InvalidCompressionMethod;
 		att->attcollation = InvalidOid;
-		/* In case we changed typlen, we'd better reset following offsets */
-		for (int i = spgFirstIncludeColumn; i < outTupDesc->natts; i++)
-			TupleDescCompactAttr(outTupDesc, i)->attcacheoff = -1;
 
 		populate_compact_attribute(outTupDesc, spgKeyColumn);
 		TupleDescFinalize(outTupDesc);
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index e6ab51e6404..36d0aaed2fb 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -992,118 +992,6 @@ tts_buffer_heap_store_tuple(TupleTableSlot *slot, HeapTuple tuple,
 	}
 }
 
-/*
- * slot_deform_heap_tuple_internal
- *		An always inline helper function for use in slot_deform_heap_tuple to
- *		allow the compiler to emit specialized versions of this function for
- *		various combinations of "slow" and "hasnulls".  For example, if a
- *		given tuple has no nulls, then we needn't check "hasnulls" for every
- *		attribute that we're deforming.  The caller can just call this
- *		function with hasnulls set to constant-false and have the compiler
- *		remove the constant-false branches and emit more optimal code.
- *
- * Returns the next attnum to deform, which can be equal to natts when the
- * function manages to deform all requested attributes.  *offp is an input and
- * output parameter which is the byte offset within the tuple to start deforming
- * from which, on return, gets set to the offset where the next attribute
- * should be deformed from.  *slowp is set to true when subsequent deforming
- * of this tuple must use a version of this function with "slow" passed as
- * true.
- *
- * Callers cannot assume when we return "attnum" (i.e. all requested
- * attributes have been deformed) that slow mode isn't required for any
- * additional deforming as the final attribute may have caused a switch to
- * slow mode.
- */
-static pg_attribute_always_inline int
-slot_deform_heap_tuple_internal(TupleTableSlot *slot, HeapTuple tuple,
-								int attnum, int natts, bool slow,
-								bool hasnulls, uint32 *offp, bool *slowp)
-{
-	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
-	Datum	   *values = slot->tts_values;
-	bool	   *isnull = slot->tts_isnull;
-	HeapTupleHeader tup = tuple->t_data;
-	char	   *tp;				/* ptr to tuple data */
-	bits8	   *bp = tup->t_bits;	/* ptr to null bitmap in tuple */
-	bool		slownext = false;
-
-	tp = (char *) tup + tup->t_hoff;
-
-	for (; attnum < natts; attnum++)
-	{
-		CompactAttribute *thisatt = TupleDescCompactAttr(tupleDesc, attnum);
-
-		if (hasnulls && att_isnull(attnum, bp))
-		{
-			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
-			if (!slow)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-			else
-				continue;
-		}
-
-		isnull[attnum] = false;
-
-		/* calculate the offset of this attribute */
-		if (!slow && thisatt->attcacheoff >= 0)
-			*offp = thisatt->attcacheoff;
-		else if (thisatt->attlen == -1)
-		{
-			/*
-			 * We can only cache the offset for a varlena attribute if the
-			 * offset is already suitably aligned, so that there would be no
-			 * pad bytes in any case: then the offset will be valid for either
-			 * an aligned or unaligned value.
-			 */
-			if (!slow && *offp == att_nominal_alignby(*offp, thisatt->attalignby))
-				thisatt->attcacheoff = *offp;
-			else
-			{
-				*offp = att_pointer_alignby(*offp,
-											thisatt->attalignby,
-											-1,
-											tp + *offp);
-
-				if (!slow)
-					slownext = true;
-			}
-		}
-		else
-		{
-			/* not varlena, so safe to use att_nominal_alignby */
-			*offp = att_nominal_alignby(*offp, thisatt->attalignby);
-
-			if (!slow)
-				thisatt->attcacheoff = *offp;
-		}
-
-		values[attnum] = fetchatt(thisatt, tp + *offp);
-
-		*offp = att_addlength_pointer(*offp, thisatt->attlen, tp + *offp);
-
-		/* check if we need to switch to slow mode */
-		if (!slow)
-		{
-			/*
-			 * We're unable to deform any further if the above code set
-			 * 'slownext', or if this isn't a fixed-width attribute.
-			 */
-			if (slownext || thisatt->attlen <= 0)
-			{
-				*slowp = true;
-				return attnum + 1;
-			}
-		}
-	}
-
-	return natts;
-}
-
 /*
  * slot_deform_heap_tuple
  *		Given a TupleTableSlot, extract data from the slot's physical tuple
@@ -1122,78 +1010,138 @@ static pg_attribute_always_inline void
 slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 					   int natts)
 {
+	CompactAttribute *cattr;
+	TupleDesc	tupleDesc = slot->tts_tupleDescriptor;
 	bool		hasnulls = HeapTupleHasNulls(tuple);
+	HeapTupleHeader tup = tuple->t_data;
+	bits8	   *bp;				/* ptr to null bitmap in tuple */
 	int			attnum;
+	int			firstNonCacheOffsetAttr;
+	int			firstNullAttr;
+	Datum	   *values;
+	bool	   *isnull;
+	char	   *tp;				/* ptr to tuple data */
 	uint32		off;			/* offset in tuple data */
-	bool		slow;			/* can we use/set attcacheoff? */
+
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupleDesc->firstNonCachedOffAttr >= 0);
 
 	/* We can only fetch as many attributes as the tuple has. */
-	natts = Min(HeapTupleHeaderGetNatts(tuple->t_data), natts);
+	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
+	attnum = slot->tts_nvalid;
+	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
+
+	if (hasnulls)
+	{
+		bp = tup->t_bits;
+		firstNullAttr = first_null_attr(bp, natts);
+		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
+	}
+	else
+	{
+		bp = NULL;
+		firstNullAttr = natts;
+	}
+
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
+	tp = (char *) tup + tup->t_hoff;
 
 	/*
-	 * Check whether the first call for this tuple, and initialize or restore
-	 * loop state.
+	 * Handle the portion of the tuple that we have cached the offset for up
+	 * to the first NULL attribute.  The offset is effectively fixed for these
+	 * so we can use the CompactAttribute's attcacheoff.
 	 */
-	attnum = slot->tts_nvalid;
-	if (attnum == 0)
+	if (attnum < firstNonCacheOffsetAttr)
 	{
-		/* Start from the first attribute */
-		off = 0;
-		slow = false;
+#ifdef USE_ASSERT_CHECKING
+		int			offcheck;
+
+		/* In Assert enabled builds, verify attcacheoff is correct */
+		if (attnum == 0)
+			offcheck = 0;
+		else
+			offcheck = *offp;
+#endif
+		do
+		{
+			isnull[attnum] = false;
+			cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+#ifdef USE_ASSERT_CHECKING
+			offcheck = att_nominal_alignby(offcheck, cattr->attalignby);
+			Assert(offcheck == cattr->attcacheoff);
+			offcheck += cattr->attlen;
+#endif
+
+			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+		} while (++attnum < firstNonCacheOffsetAttr);
+
+		/*
+		 * Point the offset after the end of the last attribute with a cached
+		 * offset.  We expect the final cached offset attribute to have a
+		 * fixed width, so just add the attlen to the attcacheoff
+		 */
+		Assert(cattr->attlen > 0);
+		off = cattr->attcacheoff + cattr->attlen;
 	}
 	else
 	{
 		/* Restore state from previous execution */
 		off = *offp;
-		slow = TTS_SLOW(slot);
+
+		/* We expect *offp to be set to 0 when attnum == 0 */
+		Assert(off == 0 || attnum > 0);
 	}
 
 	/*
-	 * If 'slow' isn't set, try deforming using deforming code that does not
-	 * contain any of the extra checks required for non-fixed offset
-	 * deforming.  During deforming, if or when we find a NULL or a variable
-	 * length attribute, we'll switch to a deforming method which includes the
-	 * extra code required for non-fixed offset deforming, a.k.a slow mode.
-	 * Because this is performance critical, we inline
-	 * slot_deform_heap_tuple_internal passing the 'slow' and 'hasnull'
-	 * parameters as constants to allow the compiler to emit specialized code
-	 * with the known-const false comparisons and subsequent branches removed.
+	 * Handle any portion of the tuple that doesn't have a fixed offset up
+	 * until the first NULL attribute.  This loops only differs from the one
+	 * after it by the NULL checks.
 	 */
-	if (!slow)
+	for (; attnum < firstNullAttr; attnum++)
 	{
-		/* Tuple without any NULLs? We can skip doing any NULL checking */
-		if (!hasnulls)
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 false, /* hasnulls */
-													 &off,
-													 &slow);
-		else
-			attnum = slot_deform_heap_tuple_internal(slot,
-													 tuple,
-													 attnum,
-													 natts,
-													 false, /* slow */
-													 true,	/* hasnulls */
-													 &off,
-													 &slow);
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
-	/* If there's still work to do then we must be in slow mode */
-	if (attnum < natts)
+	/*
+	 * Now handle any remaining tuples, this time include NULL checks as we're
+	 * now at the first NULL attribute.
+	 */
+	for (; attnum < natts; attnum++)
 	{
-		/* XXX is it worth adding a separate call when hasnulls is false? */
-		attnum = slot_deform_heap_tuple_internal(slot,
-												 tuple,
-												 attnum,
-												 natts,
-												 true,	/* slow */
-												 hasnulls,
-												 &off,
-												 &slow);
+		if (att_isnull(attnum, bp))
+		{
+			values[attnum] = (Datum) 0;
+			isnull[attnum] = true;
+			continue;
+		}
+
+		isnull[attnum] = false;
+		cattr = TupleDescCompactAttr(tupleDesc, attnum);
+
+		/* align the offset for this attribute */
+		off = att_pointer_alignby(off,
+								  cattr->attalignby,
+								  cattr->attlen,
+								  tp + off);
+
+		values[attnum] = fetchatt(cattr, tp + off);
+
+		/* move the offset beyond this attribute */
+		off = att_addlength_pointer(off, cattr->attlen, tp + off);
 	}
 
 	/*
@@ -1201,10 +1149,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	slot->tts_nvalid = attnum;
 	*offp = off;
-	if (slow)
-		slot->tts_flags |= TTS_FLAG_SLOW;
-	else
-		slot->tts_flags &= ~TTS_FLAG_SLOW;
 }
 
 const TupleTableSlotOps TTSOpsVirtual = {
@@ -2259,10 +2203,16 @@ ExecTypeSetColNames(TupleDesc typeInfo, List *namesList)
  * This happens "for free" if the tupdesc came from a relcache entry, but
  * not if we have manufactured a tupdesc for a transient RECORD datatype.
  * In that case we have to notify typcache.c of the existence of the type.
+ *
+ * TupleDescFinalize() must be called on the TupleDesc before calling this
+ * function.
  */
 TupleDesc
 BlessTupleDesc(TupleDesc tupdesc)
 {
+	/* Did someone forget to call TupleDescFinalize()? */
+	Assert(tupdesc->firstNonCachedOffAttr >= 0);
+
 	if (tupdesc->tdtypeid == RECORDOID &&
 		tupdesc->tdtypmod < 0)
 		assign_record_type_typmod(tupdesc);
diff --git a/src/backend/jit/llvm/llvmjit_deform.c b/src/backend/jit/llvm/llvmjit_deform.c
index 3eb087eb56b..12521e3e46a 100644
--- a/src/backend/jit/llvm/llvmjit_deform.c
+++ b/src/backend/jit/llvm/llvmjit_deform.c
@@ -62,7 +62,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	LLVMValueRef v_tts_values;
 	LLVMValueRef v_tts_nulls;
 	LLVMValueRef v_slotoffp;
-	LLVMValueRef v_flagsp;
 	LLVMValueRef v_nvalidp;
 	LLVMValueRef v_nvalid;
 	LLVMValueRef v_maxatt;
@@ -178,7 +177,6 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 	v_tts_nulls =
 		l_load_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_ISNULL,
 						  "tts_ISNULL");
-	v_flagsp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_FLAGS, "");
 	v_nvalidp = l_struct_gep(b, StructTupleTableSlot, v_slot, FIELDNO_TUPLETABLESLOT_NVALID, "");
 
 	if (ops == &TTSOpsHeapTuple || ops == &TTSOpsBufferHeapTuple)
@@ -747,14 +745,10 @@ slot_compile_deform(LLVMJitContext *context, TupleDesc desc,
 
 	{
 		LLVMValueRef v_off = l_load(b, TypeSizeT, v_offp, "");
-		LLVMValueRef v_flags;
 
 		LLVMBuildStore(b, l_int16_const(lc, natts), v_nvalidp);
 		v_off = LLVMBuildTrunc(b, v_off, LLVMInt32TypeInContext(lc), "");
 		LLVMBuildStore(b, v_off, v_slotoffp);
-		v_flags = l_load(b, LLVMInt16TypeInContext(lc), v_flagsp, "tts_flags");
-		v_flags = LLVMBuildOr(b, v_flags, l_int16_const(lc, TTS_FLAG_SLOW), "");
-		LLVMBuildStore(b, v_flags, v_flagsp);
 		LLVMBuildRetVoid(b);
 	}
 
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 770edb34e08..998be24ac41 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -666,14 +666,6 @@ RelationBuildTupleDesc(Relation relation)
 		elog(ERROR, "pg_attribute catalog is missing %d attribute(s) for relation OID %u",
 			 need, RelationGetRelid(relation));
 
-	/*
-	 * We can easily set the attcacheoff value for the first attribute: it
-	 * must be zero.  This eliminates the need for special cases for attnum=1
-	 * that used to exist in fastgetattr() and index_getattr().
-	 */
-	if (RelationGetNumberOfAttributes(relation) > 0)
-		TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
-
 	/*
 	 * Set up constraint/default info
 	 */
@@ -1985,8 +1977,6 @@ formrdesc(const char *relationName, Oid relationReltype,
 		populate_compact_attribute(relation->rd_att, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(relation->rd_att, 0)->attcacheoff = 0;
 	TupleDescFinalize(relation->rd_att);
 
 	/* mark not-null status */
@@ -4446,8 +4436,6 @@ BuildHardcodedDescriptor(int natts, const FormData_pg_attribute *attrs)
 		populate_compact_attribute(result, i);
 	}
 
-	/* initialize first attribute's attcacheoff, cf RelationBuildTupleDesc */
-	TupleDescCompactAttr(result, 0)->attcacheoff = 0;
 	TupleDescFinalize(result);
 
 	/* Note: we don't bother to set up a TupleConstr entry */
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d406825ff22..94b4279b7f1 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -865,20 +865,17 @@ extern MinimalTuple minimal_expand_tuple(HeapTuple sourceTuple, TupleDesc tupleD
 static inline Datum
 fastgetattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
-	Assert(attnum > 0);
+	CompactAttribute *att = TupleDescCompactAttr(tupleDesc, attnum - 1);
 
+	Assert(attnum > 0);
 	*isnull = false;
-	if (HeapTupleNoNulls(tup))
-	{
-		CompactAttribute *att;
 
-		att = TupleDescCompactAttr(tupleDesc, attnum - 1);
-		if (att->attcacheoff >= 0)
-			return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
-							att->attcacheoff);
-		else
-			return nocachegetattr(tup, attnum, tupleDesc);
-	}
+	if (att->attcacheoff >= 0 && !HeapTupleHasNulls(tup))
+		return fetchatt(att, (char *) tup->t_data + tup->t_data->t_hoff +
+						att->attcacheoff);
+
+	if (HeapTupleNoNulls(tup))
+		return nocachegetattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, tup->t_data->t_bits))
diff --git a/src/include/access/itup.h b/src/include/access/itup.h
index 57e4daafb0d..e4bb27b7e58 100644
--- a/src/include/access/itup.h
+++ b/src/include/access/itup.h
@@ -131,24 +131,20 @@ IndexInfoFindDataOffset(unsigned short t_info)
 static inline Datum
 index_getattr(IndexTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 {
+	CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+
 	Assert(isnull);
 	Assert(attnum > 0);
 
 	*isnull = false;
 
-	if (!IndexTupleHasNulls(tup))
-	{
-		CompactAttribute *attr = TupleDescCompactAttr(tupleDesc, attnum - 1);
+	if (attr->attcacheoff >= 0 && !IndexTupleHasNulls(tup))
+		return fetchatt(attr,
+						(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
+						attr->attcacheoff);
 
-		if (attr->attcacheoff >= 0)
-		{
-			return fetchatt(attr,
-							(char *) tup + IndexInfoFindDataOffset(tup->t_info) +
-							attr->attcacheoff);
-		}
-		else
-			return nocache_index_getattr(tup, attnum, tupleDesc);
-	}
+	if (!IndexTupleHasNulls(tup))
+		return nocache_index_getattr(tup, attnum, tupleDesc);
 	else
 	{
 		if (att_isnull(attnum - 1, (bits8 *) tup + sizeof(IndexTupleData)))
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 595413dbbc5..99d9017d1a6 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -131,6 +131,12 @@ typedef struct CompactAttribute
  * Any code making changes manually to and fields in the FormData_pg_attribute
  * array must subsequently call populate_compact_attribute() to flush the
  * changes out to the corresponding 'compact_attrs' element.
+ *
+ * firstNonCachedOffAttr stores the index into the compact_attrs array for the
+ * first attribute that we don't have a known attcacheoff for.
+ *
+ * Once a TupleDesc has been populated, before it is used for any purpose
+ * TupleDescFinalize() must be called on it.
  */
 typedef struct TupleDescData
 {
@@ -138,6 +144,8 @@ typedef struct TupleDescData
 	Oid			tdtypeid;		/* composite type ID for tuple type */
 	int32		tdtypmod;		/* typmod for tuple type */
 	int			tdrefcount;		/* reference count, or -1 if not counting */
+	int			firstNonCachedOffAttr;	/* index of the first att without an
+										 * attcacheoff */
 	TupleConstr *constr;		/* constraints, or NULL if none */
 	/* compact_attrs[N] is the compact metadata of Attribute Number N+1 */
 	CompactAttribute compact_attrs[FLEXIBLE_ARRAY_MEMBER];
@@ -195,7 +203,6 @@ extern TupleDesc CreateTupleDescTruncatedCopy(TupleDesc tupdesc, int natts);
 
 extern TupleDesc CreateTupleDescCopyConstr(TupleDesc tupdesc);
 
-#define TupleDescFinalize(d) ((void) 0)
 #define TupleDescSize(src) \
 	(offsetof(struct TupleDescData, compact_attrs) + \
 	 (src)->natts * sizeof(CompactAttribute) + \
@@ -206,6 +213,7 @@ extern void TupleDescCopy(TupleDesc dst, TupleDesc src);
 extern void TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 							   TupleDesc src, AttrNumber srcAttno);
 
+extern void TupleDescFinalize(TupleDesc tupdesc);
 extern void FreeTupleDesc(TupleDesc tupdesc);
 
 extern void IncrTupleDescRefCount(TupleDesc tupdesc);
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index 3e5530658c9..ce7a88df611 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -15,6 +15,7 @@
 #define TUPMACS_H
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
+#include "port/pg_bitutils.h"
 
 
 /*
@@ -69,6 +70,89 @@ fetch_att(const void *T, bool attbyval, int attlen)
 	else
 		return PointerGetDatum(T);
 }
+
+#ifndef HAVE__BUILTIN_CTZ
+/*
+ * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
+ * if all bits are 1 bits.
+ */
+static const uint8 pg_rightmost_zero_pos[256] = {
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 7,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 6,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 5,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4,
+	0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 8
+};
+#endif
+
+/*
+ * first_null_attr
+ *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
+ *		first NULL attribute.  Returns natts if no NULLs were found.
+ */
+static inline int
+first_null_attr(const bits8 *bits, int natts)
+{
+	int			lastByte = natts >> 3;
+	int			bytenum;
+	int			res;
+
+#ifdef USE_ASSERT_CHECKING
+	int			firstnull_check = natts;
+
+	/* Do it the slow way and check we get the same answer. */
+	for (int i = 0; i < natts; i++)
+	{
+		if (att_isnull(i, bits))
+		{
+			firstnull_check = i;
+			break;
+		}
+	}
+#endif
+
+	/* Process all bytes up to just before the byte for the natts index */
+	for (bytenum = 0; bytenum < lastByte; bytenum++)
+	{
+		/* break if there's any NULL attrs (a 0 bit) */
+		if (bits[bytenum] != 0xFF)
+			break;
+	}
+
+	res = bytenum << 3;
+
+#ifdef HAVE__BUILTIN_CTZ
+	/*
+	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
+	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
+	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
+	 */
+	res += __builtin_ctz(~(uint32) bits[bytenum]);
+#else
+	res += pg_rightmost_zero_pos[bits[bytenum]];
+#endif
+
+	/*
+	 * Since we did no masking to mask out bits beyond natts, we may have
+	 * found a bit higher than natts, so we must cap to natts
+	 */
+	res = Min(res, natts);
+
+	Assert(res == firstnull_check);
+
+	return res;
+}
 #endif							/* FRONTEND */
 
 /*
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index a2dfd707e78..363c5f33697 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -84,9 +84,6 @@
  * tts_values/tts_isnull are allocated either when the slot is created (when
  * the descriptor is provided), or when a descriptor is assigned to the slot;
  * they are of length equal to the descriptor's natts.
- *
- * The TTS_FLAG_SLOW flag is saved state for
- * slot_deform_heap_tuple, and should not be touched by any other code.
  *----------
  */
 
@@ -98,12 +95,8 @@
 #define			TTS_FLAG_SHOULDFREE		(1 << 2)
 #define TTS_SHOULDFREE(slot) (((slot)->tts_flags & TTS_FLAG_SHOULDFREE) != 0)
 
-/* saved state for slot_deform_heap_tuple */
-#define			TTS_FLAG_SLOW		(1 << 3)
-#define TTS_SLOW(slot) (((slot)->tts_flags & TTS_FLAG_SLOW) != 0)
-
 /* fixed tuple descriptor */
-#define			TTS_FLAG_FIXED		(1 << 4)
+#define			TTS_FLAG_FIXED		(1 << 4)	/* XXX change to #3? */
 #define TTS_FIXED(slot) (((slot)->tts_flags & TTS_FLAG_FIXED) != 0)
 
 struct TupleTableSlotOps;
-- 
2.51.0



  [text/plain] v8-0003-Introduce-deform_bench-test-module.patch (7.3K, 4-v8-0003-Introduce-deform_bench-test-module.patch)
  download | inline diff:
From 98e27bc74fbc3f3da9048430ea9aa2ffa156aa78 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Tue, 27 Jan 2026 15:08:09 +1300
Subject: [PATCH v8 3/4] Introduce deform_bench test module

For benchmaring tuple deformation.
---
 src/test/modules/deform_bench/.gitignore      |   4 +
 src/test/modules/deform_bench/Makefile        |  21 ++++
 .../deform_bench/deform_bench--1.0.sql        |   8 ++
 src/test/modules/deform_bench/deform_bench.c  | 106 ++++++++++++++++++
 .../modules/deform_bench/deform_bench.control |   4 +
 src/test/modules/deform_bench/meson.build     |  22 ++++
 src/test/modules/meson.build                  |   1 +
 7 files changed, 166 insertions(+)
 create mode 100644 src/test/modules/deform_bench/.gitignore
 create mode 100644 src/test/modules/deform_bench/Makefile
 create mode 100644 src/test/modules/deform_bench/deform_bench--1.0.sql
 create mode 100644 src/test/modules/deform_bench/deform_bench.c
 create mode 100644 src/test/modules/deform_bench/deform_bench.control
 create mode 100644 src/test/modules/deform_bench/meson.build

diff --git a/src/test/modules/deform_bench/.gitignore b/src/test/modules/deform_bench/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/deform_bench/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/deform_bench/Makefile b/src/test/modules/deform_bench/Makefile
new file mode 100644
index 00000000000..b5fc0f7a583
--- /dev/null
+++ b/src/test/modules/deform_bench/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/deform_bench/Makefile
+
+MODULE_big = deform_bench
+OBJS = deform_bench.o
+
+EXTENSION = deform_bench
+DATA = deform_bench--1.0.sql
+PGFILEDESC = "deform_bench - tuple deform benchmarking"
+
+REGRESS = deform_bench
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/deform_bench
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/src/test/modules/deform_bench/deform_bench--1.0.sql b/src/test/modules/deform_bench/deform_bench--1.0.sql
new file mode 100644
index 00000000000..492b71dba3b
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* deform_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION deform_bench" to load this file. \quit
+
+CREATE FUNCTION deform_bench(tableoid Oid, attnum int[]) RETURNS FLOAT
+AS 'MODULE_PATHNAME', 'deform_bench'
+LANGUAGE C VOLATILE STRICT;
diff --git a/src/test/modules/deform_bench/deform_bench.c b/src/test/modules/deform_bench/deform_bench.c
new file mode 100644
index 00000000000..525162eb59c
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.c
@@ -0,0 +1,106 @@
+/*-------------------------------------------------------------------------
+ *
+ * deform_bench.c
+ *
+ * for benchmarking tuple deformation routines
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <time.h>
+#include <sys/time.h>
+
+#include "access/heapam.h"
+#include "access/relscan.h"
+#include "catalog/pg_am_d.h"
+#include "catalog/pg_type_d.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/arrayaccess.h"
+#include "utils/builtins.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(deform_bench);
+
+Datum
+deform_bench(PG_FUNCTION_ARGS)
+{
+	Oid			tableoid = PG_GETARG_OID(0);
+	ArrayType  *array = PG_GETARG_ARRAYTYPE_P(1);
+	TableScanDesc scan;
+	Relation	rel;
+	TupleDesc	tupdesc;
+	TupleTableSlot *slot;
+	Datum	   *elem_datums = NULL;
+	bool	   *elem_nulls = NULL;
+	int			elem_count;
+	int		   *attnums;
+	clock_t		start,
+				end;
+
+	rel = relation_open(tableoid, AccessShareLock);
+
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only heap AM is supported")));
+
+	tupdesc = RelationGetDescr(rel);
+	slot = MakeTupleTableSlot(tupdesc, &TTSOpsBufferHeapTuple);
+	scan = table_beginscan_strat(rel, GetActiveSnapshot(), 0, NULL, true, false);
+
+	/*
+	 * The array is used to allow callers to define how many atts to deform.
+	 * e.g: '{1,10}'::int[] would deform attnum=1, then in a 2nd pass deform
+	 * the remainder up to attnum=10.  Passing an element as NULL means all
+	 * attnums.  This allows simulation of incremental deformation.  Generally
+	 * if you're passing an array with more than 1 element, then the array
+	 * should be in ascending order.  Doing something like '{10,1}' would mean
+	 * we've already deformed 10 attributes and on the 2nd pass there's
+	 * nothing to do since attnum=1 was already deformed in the first pass.
+	 *
+	 * You'll get an ERROR if you pass a number higher than the number of
+	 * attributes in the table.
+	 */
+	deconstruct_array(array,
+					  INT4OID,
+					  sizeof(int32),
+					  true,
+					  'i',
+					  &elem_datums,
+					  &elem_nulls,
+					  &elem_count);
+
+	attnums = palloc_array(int, elem_count);
+
+	for (int i = 0; i < elem_count; i++)
+	{
+		/* Make a NULL element mean all attributes */
+		if (elem_nulls[i])
+			attnums[i] = tupdesc->natts;
+		else
+			attnums[i] = DatumGetInt32(elem_datums[i]);
+	}
+
+	start = clock();
+
+	while (heap_getnextslot(scan, ForwardScanDirection, slot))
+	{
+		CHECK_FOR_INTERRUPTS();
+
+		/* Deform in stages according to the attnums array */
+		for (int i = 0; i < elem_count; i++)
+			slot_getsomeattrs_int(slot, attnums[i]);
+	}
+
+	ExecDropSingleTupleTableSlot(slot);
+	table_endscan(scan);
+	relation_close(rel, AccessShareLock);
+
+	end = clock();
+
+	/* Returns the number of milliseconds to run the test */
+	PG_RETURN_FLOAT8((double) (end - start) / (CLOCKS_PER_SEC / 1000));
+}
diff --git a/src/test/modules/deform_bench/deform_bench.control b/src/test/modules/deform_bench/deform_bench.control
new file mode 100644
index 00000000000..a2023f9d738
--- /dev/null
+++ b/src/test/modules/deform_bench/deform_bench.control
@@ -0,0 +1,4 @@
+# deform_bench extension
+comment = 'functions for benchmarking tuple deformation'
+default_version = '1.0'
+module_pathname = '$libdir/deform_bench'
diff --git a/src/test/modules/deform_bench/meson.build b/src/test/modules/deform_bench/meson.build
new file mode 100644
index 00000000000..82049585244
--- /dev/null
+++ b/src/test/modules/deform_bench/meson.build
@@ -0,0 +1,22 @@
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+deform_bench_sources = files(
+  'deform_bench.c',
+)
+
+if host_system == 'windows'
+  deform_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'deform_bench',
+    '--FILEDESC', 'deform_bench - benchmarking tuple deformation',])
+endif
+
+deform_bench = shared_module('deform_bench',
+  deform_bench_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += deform_bench
+
+test_install_data += files(
+  'deform_bench--1.0.sql',
+  'deform_bench.control',
+)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..ef2b0af4581 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -2,6 +2,7 @@
 
 subdir('brin')
 subdir('commit_ts')
+subdir('deform_bench')
 subdir('delay_execution')
 subdir('dummy_index_am')
 subdir('dummy_seclabel')
-- 
2.51.0



  [text/plain] v8-0004-Various-experimental-changes.patch (11.4K, 5-v8-0004-Various-experimental-changes.patch)
  download | inline diff:
From 8160dd33825abfcae2a14e78ed85a4c8e94e0274 Mon Sep 17 00:00:00 2001
From: David Rowley <[email protected]>
Date: Fri, 30 Jan 2026 23:18:45 +1300
Subject: [PATCH v8 4/4] Various experimental changes

---
 src/backend/access/common/tupdesc.c |   6 ++
 src/backend/executor/execTuples.c   |  63 ++++++------
 src/include/access/tupmacs.h        | 149 ++++++++++++++++++++++++++--
 src/include/executor/tuptable.h     |   4 +-
 4 files changed, 183 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index 25364db630a..ca393af67c9 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -105,6 +105,12 @@ populate_compact_attribute_internal(Form_pg_attribute src,
 			elog(ERROR, "invalid attalign value: %c", src->attalign);
 			break;
 	}
+
+	/* Check for unsupported byval attlens */
+	if (src->attbyval && src->attlen != sizeof(char) &&
+		src->attlen != sizeof(int16) && src->attlen != sizeof(int32) &&
+		src->attlen != sizeof(int64))
+		elog(ERROR, "unsupported byval length: %d", src->attlen);
 }
 
 /*
diff --git a/src/backend/executor/execTuples.c b/src/backend/executor/execTuples.c
index 36d0aaed2fb..5e20a05a830 100644
--- a/src/backend/executor/execTuples.c
+++ b/src/backend/executor/execTuples.c
@@ -1029,23 +1029,34 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	/* We can only fetch as many attributes as the tuple has. */
 	natts = Min(HeapTupleHeaderGetNatts(tup), natts);
 	attnum = slot->tts_nvalid;
+	values = slot->tts_values;
+	isnull = slot->tts_isnull;
 	firstNonCacheOffsetAttr = Min(tupleDesc->firstNonCachedOffAttr, natts);
 
 	if (hasnulls)
 	{
+		tp = (char *) tup +
+			MAXALIGN(offsetof(HeapTupleHeaderData, t_bits) +
+					 BITMAPLEN(HeapTupleHeaderGetNatts(tup)));
+		Assert(tp == (char *) tup + tup->t_hoff);
 		bp = tup->t_bits;
 		firstNullAttr = first_null_attr(bp, natts);
+		populate_isnull_array(bp, natts, isnull);
 		firstNonCacheOffsetAttr = Min(firstNonCacheOffsetAttr, firstNullAttr);
 	}
 	else
 	{
+		uint64	   *isnull64 = (uint64 *) isnull;
+		Size		asize = (natts + 7) >> 3;
+
+		tp = (char *) tup + MAXALIGN(offsetof(HeapTupleHeaderData, t_bits));
 		bp = NULL;
 		firstNullAttr = natts;
-	}
 
-	values = slot->tts_values;
-	isnull = slot->tts_isnull;
-	tp = (char *) tup + tup->t_hoff;
+		/* No nulls, set all isnull elements to false */
+		for (int i = 0; i < asize; i++)
+			isnull64[i] = 0;
+	}
 
 	/*
 	 * Handle the portion of the tuple that we have cached the offset for up
@@ -1065,7 +1076,6 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 #endif
 		do
 		{
-			isnull[attnum] = false;
 			cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
 #ifdef USE_ASSERT_CHECKING
@@ -1074,7 +1084,9 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 			offcheck += cattr->attlen;
 #endif
 
-			values[attnum] = fetchatt(cattr, tp + cattr->attcacheoff);
+			values[attnum] = fetch_att_noerr(tp + cattr->attcacheoff,
+											 cattr->attbyval,
+											 cattr->attlen);
 		} while (++attnum < firstNonCacheOffsetAttr);
 
 		/*
@@ -1101,19 +1113,14 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < firstNullAttr; attnum++)
 	{
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1122,26 +1129,20 @@ slot_deform_heap_tuple(TupleTableSlot *slot, HeapTuple tuple, uint32 *offp,
 	 */
 	for (; attnum < natts; attnum++)
 	{
-		if (att_isnull(attnum, bp))
+		if (isnull[attnum])
 		{
 			values[attnum] = (Datum) 0;
-			isnull[attnum] = true;
 			continue;
 		}
 
-		isnull[attnum] = false;
 		cattr = TupleDescCompactAttr(tupleDesc, attnum);
 
-		/* align the offset for this attribute */
-		off = att_pointer_alignby(off,
-								  cattr->attalignby,
-								  cattr->attlen,
-								  tp + off);
-
-		values[attnum] = fetchatt(cattr, tp + off);
-
-		/* move the offset beyond this attribute */
-		off = att_addlength_pointer(off, cattr->attlen, tp + off);
+		/* align 'off', fetch the datum, and increment off beyond the datum */
+		values[attnum] = align_fetch_then_add(tp,
+											  &off,
+											  cattr->attbyval,
+											  cattr->attlen,
+											  cattr->attalignby);
 	}
 
 	/*
@@ -1452,7 +1453,7 @@ ExecSetSlotDescriptor(TupleTableSlot *slot, /* slot to change */
 	slot->tts_values = (Datum *)
 		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(Datum));
 	slot->tts_isnull = (bool *)
-		MemoryContextAlloc(slot->tts_mcxt, tupdesc->natts * sizeof(bool));
+		MemoryContextAlloc(slot->tts_mcxt, MAXALIGN(tupdesc->natts * sizeof(bool)));
 }
 
 /* --------------------------------
diff --git a/src/include/access/tupmacs.h b/src/include/access/tupmacs.h
index ce7a88df611..d53784899fb 100644
--- a/src/include/access/tupmacs.h
+++ b/src/include/access/tupmacs.h
@@ -16,7 +16,7 @@
 
 #include "catalog/pg_type_d.h"	/* for TYPALIGN macros */
 #include "port/pg_bitutils.h"
-
+#include "varatt.h"
 
 /*
  * Check a tuple's null bitmap to determine whether the attribute is null.
@@ -29,6 +29,49 @@ att_isnull(int ATT, const bits8 *BITS)
 	return !(BITS[ATT >> 3] & (1 << (ATT & 0x07)));
 }
 
+/*
+ * populate_isnull_array
+ *		Transform a tuple's null bitmap into a boolean array.
+ *
+ * Caller must ensure that the isnull array is sized so it contains
+ * at least as many elements as there are bits in the 'bits' array.
+ * This is required because we always round 'natts' up to the next multiple
+ * of 8.
+ */
+static inline void
+populate_isnull_array(const bits8 *bits, int natts, bool *isnull)
+{
+	int			nbytes = (natts + 7) >> 3;
+
+	/*
+	 * Multiplying a NULL bitmap byte by this value results in the lowest bit
+	 * in each byte being set the same as each bit of the bitmap.  We perform
+	 * this as 2 32-bit operations rather than a single 64-bit operation as
+	 * multiplying by the required value to do this in 64-bits would result in
+	 * overflowing a uint64 in some cases.
+	 */
+#define SPREAD_BITS_MULTIPLIER_32 0x204081U
+
+	for (int i = 0; i < nbytes; i++, isnull += 8)
+	{
+		uint64		isnull_8;
+		bits8		nullbyte = ~bits[i];
+
+		/* convert the lower 4 bits of null bitmap word into 32 bit int */
+		isnull_8 = (nullbyte & 0xf) * SPREAD_BITS_MULTIPLIER_32;
+
+		/*
+		 * convert the upper 4 bits of null bitmap word into 32 bit int, shift
+		 * into the upper 32 bit
+		 */
+		isnull_8 |= ((uint64) ((nullbyte >> 4) * SPREAD_BITS_MULTIPLIER_32)) << 32;
+
+		/* mask out all other bits apart from the lowest bit of each byte */
+		isnull_8 &= UINT64CONST(0x0101010101010101);
+		memcpy(isnull, &isnull_8, sizeof(uint64));
+	}
+}
+
 #ifndef FRONTEND
 /*
  * Given an attbyval and an attlen from either a Form_pg_attribute or
@@ -71,6 +114,100 @@ fetch_att(const void *T, bool attbyval, int attlen)
 		return PointerGetDatum(T);
 }
 
+/*
+ * Same, but no error checking for invalid attlens for byval types.  This
+ * is safe to use when attlen comes from CompactAttribute as we validate the
+ * length when populating that struct.
+ */
+static inline Datum
+fetch_att_noerr(const void *T, bool attbyval, int attlen)
+{
+	if (attbyval)
+	{
+		switch (attlen)
+		{
+			case sizeof(char):
+				return CharGetDatum(*((const char *) T));
+			case sizeof(int16):
+				return Int16GetDatum(*((const int16 *) T));
+			case sizeof(int32):
+				return Int32GetDatum(*((const int32 *) T));
+			default:
+				Assert(attlen == sizeof(int64));
+				return Int64GetDatum(*((const int64 *) T));
+		}
+	}
+	else
+		return PointerGetDatum(T);
+}
+
+
+/*
+ * align_fetch_then_add
+ *		Applies all the functionality of att_pointer_alignby(), fetch_att()
+ *		and att_addlength_pointer() resulting in *off pointer to the perhaps
+ *		unaligned number of bytes into 'tupptr', ready to deform the next
+ *		attribute.
+ *
+ * tupptr: pointer to the beginning of the tuple, after the header and any
+ * NULL bitmask.
+ * off: offset in bytes for reading tuple data, possibly unaligned.
+ * attbyval, attlen, attalignby are values from CompactAttribute.
+ */
+static inline Datum
+align_fetch_then_add(const char *tupptr, uint32 *off, bool attbyval, int attlen,
+					 uint8 attalignby)
+{
+	Datum		res;
+
+	if (attlen > 0)
+	{
+		const char *offset_ptr;
+
+		*off = TYPEALIGN(attalignby, *off);
+		offset_ptr = tupptr + *off;
+		*off += attlen;
+		if (attbyval)
+		{
+			switch (attlen)
+			{
+				case sizeof(char):
+					return CharGetDatum(*((const char *) offset_ptr));
+				case sizeof(int16):
+					return Int16GetDatum(*((const int16 *) offset_ptr));
+				case sizeof(int32):
+					return Int32GetDatum(*((const int32 *) offset_ptr));
+				default:
+
+					/*
+					 * populate_compact_attribute_internal() should have
+					 * checked
+					 */
+					Assert(attlen == sizeof(int64));
+					return Int64GetDatum(*((const int64 *) offset_ptr));
+			}
+		}
+		return PointerGetDatum(offset_ptr);
+	}
+	else if (attlen == -1)
+	{
+		if (!VARATT_IS_SHORT(tupptr + *off))
+			*off = TYPEALIGN(attalignby, *off);
+
+		res = PointerGetDatum(tupptr + *off);
+		*off += VARSIZE_ANY(DatumGetPointer(res));
+		return res;
+	}
+	else
+	{
+		Assert(attlen == -2);
+		*off = TYPEALIGN(attalignby, *off);
+		res = PointerGetDatum(tupptr + *off);
+		*off += strlen(tupptr + *off) + 1;
+		return res;
+	}
+}
+
 #ifndef HAVE__BUILTIN_CTZ
 /*
  * For returning the 0-based position of the right-most 0 bit of a uint8, or 8
@@ -100,6 +237,9 @@ static const uint8 pg_rightmost_zero_pos[256] = {
  * first_null_attr
  *		Inspect a NULL bitmask from a tuple and return the 0-based attnum of the
  *		first NULL attribute.  Returns natts if no NULLs were found.
+ *
+ * We expect that 'bits' contains at least one 0 bit somewhere in the mask,
+ * not necessarily < natts.
  */
 static inline int
 first_null_attr(const bits8 *bits, int natts)
@@ -133,12 +273,7 @@ first_null_attr(const bits8 *bits, int natts)
 	res = bytenum << 3;
 
 #ifdef HAVE__BUILTIN_CTZ
-	/*
-	 * Promote to 32-bit before doing bit-wise NOT.  This means we'll convert
-	 * 0xff into 0xffffff00 rather than 0x0, which is undefined with
-	 * __builtin_ctz.  That'll mean we correctly get 8 for 0xff
-	 */
-	res += __builtin_ctz(~(uint32) bits[bytenum]);
+	res += __builtin_ctz(~bits[bytenum]);
 #else
 	res += pg_rightmost_zero_pos[bits[bytenum]];
 #endif
diff --git a/src/include/executor/tuptable.h b/src/include/executor/tuptable.h
index 363c5f33697..180fccc999f 100644
--- a/src/include/executor/tuptable.h
+++ b/src/include/executor/tuptable.h
@@ -116,7 +116,9 @@ typedef struct TupleTableSlot
 #define FIELDNO_TUPLETABLESLOT_VALUES 5
 	Datum	   *tts_values;		/* current per-attribute values */
 #define FIELDNO_TUPLETABLESLOT_ISNULL 6
-	bool	   *tts_isnull;		/* current per-attribute isnull flags */
+	bool	   *tts_isnull;		/* current per-attribute isnull flags.  Array
+								 * size must always be rounded up to the next
+								 * 8 elements. */
 	MemoryContext tts_mcxt;		/* slot itself is in this context */
 	ItemPointerData tts_tid;	/* stored tuple's tid */
 	Oid			tts_tableOid;	/* table oid of tuple */
-- 
2.51.0



  [image/gif] m2_on_v8.gif (105.4K, 6-m2_on_v8.gif)
  download | view image

  [image/gif] amd3990x_clang_with_v8.gif (78.6K, 7-amd3990x_clang_with_v8.gif)
  download | view image

  [image/gif] amd7945hx_clang_with_v8.gif (102.4K, 8-amd7945hx_clang_with_v8.gif)
  download | view image

view thread (19+ messages)

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: More speedups for tuple deformation
  In-Reply-To: <CAApHDvpuEbhvH1ViCZRz5vks+_bGbEnPoEdZYAZXK76_isb_+Q@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox