public inbox for [email protected]  
help / color / mirror / Atom feed
[PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade
3+ messages / 2 participants
[nested] [flat]

* [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade
@ 2026-03-20 17:47 Jimmy Angelakos <[email protected]>
  2026-03-29 18:34 ` Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Andrew Dunstan <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Jimmy Angelakos @ 2026-03-20 17:47 UTC (permalink / raw)
  To: pgsql-hackers

Hi All,

I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is my
first code patch, so any feedback on the approach will be appreciated!

The problem:
============
pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the schema
from $oldcluster to $newcluster. Because this excludes all table data, it
leaves out data in extension config tables registered with
pg_extension_config_dump().

In $newcluster, binary_upgrade_create_empty_extension() creates the
extensions without populating any table data. The extensions' CREATE
EXTENSION scripts never get executed so any INSERTs are skipped. As a
consequence, if any CREATE TABLE statement in $newcluster requires
validation against these empty config tables, the upgrade fails. As an
example,
PostGIS registers config table spatial_ref_sys to hold ~8500 spatial
reference system definitions (SRIDs). When a table has, e.g. a geometry
column that specifies an SRID, this gets validated during the CREATE TABLE:

CREATE TABLE points (id int, location geometry(Point, 27700));
ERROR:  Cannot find SRID (27700) in spatial_ref_sys

This will happen for any SRID-constrained column, which will prevent many
real-world PostGIS deployments from being able to pg_upgrade. To summarise
the problem, our ordering is wrong here because extension configuration
data must be present before user tables that depend on it get created, but
--schema-only strips this data.

The patch:
==========
We are adding a new dump object type DO_EXTENSION_DATA that dumps extension
config table data in SECTION_PRE_DATA during --binary-upgrade ONLY. This
restores the needed data between extension creation and user object
creation, allowing the DDL to succeed.

Four files are modified in bin/pg_dump:

pg_dump.h:
Add DO_EXTENSION_DATA to the DumpableObjectType enum, between DO_EXTENSION
and DO_TYPE

pg_dump_sort.c:
Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE

pg_dump.c:
1. Add makeExtensionDataInfo() to create a TableDataInfo with objType =
DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during
--binary-upgrade ONLY. As it depends on the table def, the COPY will be
emitted after the CREATE TABLE.
2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with
description "EXTENSION DATA" using dumpTableData_copy(). This allows the
config table data to go into the schema-only dump.
3. In processExtensionTables(), when dopt->binary_upgrade is true, call
makeExtensionDataInfo() instead of makeTableDataInfo(). Additionally, skip
extcondition filter because we need to dump all rows here.
4. Include DO_EXTENSION_DATA in pre-data boundary in
addBoundaryDependencies()

pg_backup_archiver.c:
Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar to
BLOB, BLOB METADATA, etc. to include extension config table data in
--schema-only dumps during --binary-upgrade ONLY.

What ends up happening:
=======================
The inserted rows are basically scaffolding to allow the upgrade, and do
not persist. The pg_upgrade sequence goes like:
1. pg_dump includes $oldcluster extension config data in schema-only dump
2. pg_restore replays the dump into $newcluster and "EXTENSION DATA"
entries populate tables like spatial_ref_sys with COPY. Subsequent CREATE
TABLEs with e.g. SRID-constrained columns pass validation.
3. pg_upgrade transfers all data files from $oldcluster to $newcluster,
making spatial_ref_sys byte-for-byte identical to its previous state.

This patch:
1. Does NOT affect normal pg_dumps (without --binary-upgrade).
DO_EXTENSION_DATA objects are not created in this case.
2. Leaves binary_upgrade_create_empty_extension() unchanged.
3. Is not PostGIS-specific, and should solve this class of problem for any
extension that registers config tables that will be needed for DDL
validation.
4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster PostGIS
3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on PG19-devel.

Thanks in advance for your review! Please find attached the patch for HEAD.
I believe this should be easily backpatchable to (at least) PG15, and will
be happy to work on backports.

Best regards,
Jimmy


Attachments:

  [text/x-patch] 0001-pg_dump-Restore-extension-config-table-data-before-u.patch (9.8K, 3-0001-pg_dump-Restore-extension-config-table-data-before-u.patch)
  download | inline diff:
From 29019b8638b28a487b66c7ad01893a2de40ff79d Mon Sep 17 00:00:00 2001
From: Jimmy Angelakos <[email protected]>
Date: Fri, 20 Mar 2026 16:04:45 +0000
Subject: [PATCH] pg_dump: Restore extension config table data before user
 objects during binary upgrade

pg_upgrade uses pg_dump --schema-only --binary-upgrade, which excludes
all table data including extension configuration tables registered via
pg_extension_config_dump(). Since binary_upgrade_create_empty_extension()
does not populate these tables, any user table whose CREATE TABLE
triggers validation against config data will fail.

For example, PostGIS tables with SRID-constrained geometry/geography
columns fail because spatial_ref_sys is empty during schema restore.

Fix by introducing a new dump object type DO_EXTENSION_DATA that dumps
extension config table data into SECTION_PRE_DATA during binary upgrade.
This puts the data restore between extension creation and user object
creation, allowing DDL-time validation to succeed. The data is
scaffolding: it is overwritten when pg_upgrade transfers the old
cluster's data files to the new cluster.

This is not PostGIS-specific and applies to any extension that registers
config tables via pg_extension_config_dump() where that data is needed
for DDL-time validation.
---
 src/bin/pg_dump/pg_backup_archiver.c |   2 +
 src/bin/pg_dump/pg_dump.c            | 109 ++++++++++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h            |   1 +
 src/bin/pg_dump/pg_dump_sort.c       |   7 ++
 4 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 271a2c3e481..c9f9e574f16 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -3309,6 +3309,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		 */
 		if (strcmp(te->desc, "SEQUENCE SET") == 0 ||
 			strcmp(te->desc, "BLOB") == 0 ||
+			strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			strcmp(te->desc, "BLOB METADATA") == 0 ||
 			(strcmp(te->desc, "ACL") == 0 &&
 			 strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
@@ -3350,6 +3351,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
 			!(ropt->binary_upgrade &&
 			  (strcmp(te->desc, "BLOB") == 0 ||
+			   strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			   strcmp(te->desc, "BLOB METADATA") == 0 ||
 			   (strcmp(te->desc, "ACL") == 0 &&
 				strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ad09677c336..2ffb8eb863f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -264,6 +264,7 @@ static void prohibit_crossdb_refs(PGconn *conn, const char *dbname,
 static NamespaceInfo *findNamespace(Oid nsoid);
 static void dumpTableData(Archive *fout, const TableDataInfo *tdinfo);
 static void refreshMatViewData(Archive *fout, const TableDataInfo *tdinfo);
+static void dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo);
 static const char *getRoleName(const char *roleoid_str);
 static void collectRoleNames(Archive *fout);
 static void getAdditionalACLs(Archive *fout);
@@ -352,6 +353,7 @@ static void addConstrChildIdxDeps(DumpableObject *dobj, const IndxInfo *refidx);
 static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo);
 static void getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, char relkind);
 static void makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
+static void makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
 static void buildMatViewRefreshDependencies(Archive *fout);
 static void getTableDataFKConstraints(void);
 static void determineNotNullFlags(Archive *fout, PGresult *res, int r,
@@ -2961,6 +2963,48 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo)
 	destroyPQExpBuffer(clistBuf);
 }
 
+/*
+ * dumpExtensionData -
+ *	  dump extension configuration table data for binary upgrade
+ *
+ * Entry goes into SECTION_PRE_DATA so the data is available before
+ * user tables that may need it for validation.
+ */
+static void
+dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo)
+{
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	PQExpBuffer copyBuf = createPQExpBuffer();
+	PQExpBuffer clistBuf = createPQExpBuffer();
+
+	/* Check that we have per-column details about this table */
+	Assert(tbinfo->interesting);
+
+	/* Build COPY statement */
+	printfPQExpBuffer(copyBuf, "COPY %s ",
+					  fmtQualifiedDumpable(tbinfo));
+	appendPQExpBuffer(copyBuf, "%s FROM stdin;\n",
+					  fmtCopyColumnList(tbinfo, clistBuf));
+
+	if (tdinfo->dobj.dump & DUMP_COMPONENT_DATA)
+	{
+		ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tbinfo->dobj.name,
+								  .namespace = tbinfo->dobj.namespace->dobj.name,
+								  .owner = tbinfo->rolname,
+								  .description = "EXTENSION DATA",
+								  .section = SECTION_PRE_DATA,
+								  .copyStmt = copyBuf->data,
+								  .deps = &(tbinfo->dobj.dumpId),
+								  .nDeps = 1,
+								  .dumpFn = dumpTableData_copy,
+								  .dumpArg = tdinfo));
+	}
+
+	destroyPQExpBuffer(copyBuf);
+	destroyPQExpBuffer(clistBuf);
+}
+
 /*
  * refreshMatViewData -
  *	  load or refresh the contents of a single materialized view
@@ -3105,6 +3149,48 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
 	tbinfo->interesting = true;
 }
 
+/*
+ * makeExtensionDataInfo --- create TableDataInfo for extension config table
+ *
+ * This is used during binary upgrades to ensure extension configuration
+ * table data is dumped early (before user tables that may depend on it).
+ * For example, PostGIS's spatial_ref_sys must be populated before any
+ * table with geography(Point, 4283) can be created due to SRID validation.
+ */
+static void
+makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
+{
+	TableDataInfo *tdinfo;
+
+	/* Already have a data object? */
+	if (tbinfo->dataObj != NULL)
+		return;
+
+	/*
+	 * Caller ensures that this is only called for RELKIND_RELATION.
+	 */
+
+	/* OK, create the data object */
+	tdinfo = (TableDataInfo *) pg_malloc(sizeof(TableDataInfo));
+
+	tdinfo->dobj.objType = DO_EXTENSION_DATA;
+
+	tdinfo->dobj.catId.tableoid = 0;
+	tdinfo->dobj.catId.oid = tbinfo->dobj.catId.oid;
+	AssignDumpId(&tdinfo->dobj);
+	tdinfo->dobj.name = tbinfo->dobj.name;
+	tdinfo->dobj.namespace = tbinfo->dobj.namespace;
+	tdinfo->tdtable = tbinfo;
+	tdinfo->filtercond = NULL;
+	addObjectDependency(&tdinfo->dobj, tbinfo->dobj.dumpId);
+
+	/* Mark that this object contains data */
+	tdinfo->dobj.components |= DUMP_COMPONENT_DATA;
+
+	tbinfo->dataObj = tdinfo;
+	tbinfo->interesting = true;
+}
+
 /*
  * The refresh for a materialized view must be dependent on the refresh for
  * any materialized view that this one is dependent on.
@@ -11828,6 +11914,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_EXTENSION:
 			dumpExtension(fout, (const ExtensionInfo *) dobj);
 			break;
+		case DO_EXTENSION_DATA:
+			dumpExtensionData(fout, (const TableDataInfo *) dobj);
+			break;
 		case DO_TYPE:
 			dumpType(fout, (const TypeInfo *) dobj);
 			break;
@@ -20380,10 +20469,25 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
 
 				if (dumpobj)
 				{
-					makeTableDataInfo(dopt, configtbl);
+					/*
+					 * For binary upgrades, dump extension config table data
+					 * before user tables are created so it's available for
+					 * validation (e.g. PostGIS SRIDs).
+					 */
+					if (dopt->binary_upgrade &&
+						configtbl->relkind == RELKIND_RELATION)
+						makeExtensionDataInfo(dopt, configtbl);
+					else
+						makeTableDataInfo(dopt, configtbl);
 					if (configtbl->dataObj != NULL)
 					{
-						if (strlen(extconditionarray[j]) > 0)
+						/*
+						 * For binary upgrade (DO_EXTENSION_DATA), don't apply
+						 * the filter condition - we need ALL data since the
+						 * extension won't populate built-in data in binary
+						 * upgrade mode.
+						 */
+						if (strlen(extconditionarray[j]) > 0 && !dopt->binary_upgrade)
 							configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]);
 					}
 				}
@@ -20661,6 +20765,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 		{
 			case DO_NAMESPACE:
 			case DO_EXTENSION:
+			case DO_EXTENSION_DATA:
 			case DO_TYPE:
 			case DO_SHELL_TYPE:
 			case DO_FUNC:
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2b9c01b2c0a..185396a01a4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -40,6 +40,7 @@ typedef enum
 	/* When modifying this enum, update priority tables in pg_dump_sort.c! */
 	DO_NAMESPACE,
 	DO_EXTENSION,
+	DO_EXTENSION_DATA,			/* extension config table data for binary upgrade */
 	DO_TYPE,
 	DO_SHELL_TYPE,
 	DO_FUNC,
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 03e5c1c1116..3cced9c27be 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -58,6 +58,7 @@ enum dbObjectTypePriorities
 	PRIO_COLLATION,
 	PRIO_TRANSFORM,
 	PRIO_EXTENSION,
+	PRIO_EXTENSION_DATA,		/* ext config data: used for binary upgrade */
 	PRIO_TYPE,					/* used for DO_TYPE and DO_SHELL_TYPE */
 	PRIO_CAST,
 	PRIO_FUNC,
@@ -106,6 +107,7 @@ static const int dbObjectTypePriority[] =
 {
 	[DO_NAMESPACE] = PRIO_NAMESPACE,
 	[DO_EXTENSION] = PRIO_EXTENSION,
+	[DO_EXTENSION_DATA] = PRIO_EXTENSION_DATA,
 	[DO_TYPE] = PRIO_TYPE,
 	[DO_SHELL_TYPE] = PRIO_TYPE,
 	[DO_FUNC] = PRIO_FUNC,
@@ -1525,6 +1527,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "EXTENSION %s  (ID %d OID %u)",
 					 obj->name, obj->dumpId, obj->catId.oid);
 			return;
+		case DO_EXTENSION_DATA:
+			snprintf(buf, bufsize,
+					 "EXTENSION DATA %s  (ID %d OID %u)",
+					 obj->name, obj->dumpId, obj->catId.oid);
+			return;
 		case DO_TYPE:
 			snprintf(buf, bufsize,
 					 "TYPE %s  (ID %d OID %u)",
-- 
2.51.0



^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade
  2026-03-20 17:47 [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Jimmy Angelakos <[email protected]>
@ 2026-03-29 18:34 ` Andrew Dunstan <[email protected]>
  2026-04-06 19:33   ` Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Jimmy Angelakos <[email protected]>
  0 siblings, 1 reply; 3+ messages in thread

From: Andrew Dunstan @ 2026-03-29 18:34 UTC (permalink / raw)
  To: Jimmy Angelakos <[email protected]>; pgsql-hackers


On 2026-03-20 Fr 1:47 PM, Jimmy Angelakos wrote:
> Hi All,
>
> I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is 
> my first code patch, so any feedback on the approach will be appreciated!
>
> The problem:
> ============
> pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the 
> schema from $oldcluster to $newcluster. Because this excludes all 
> table data, it leaves out data in extension config tables registered 
> with pg_extension_config_dump().
>
> In $newcluster, binary_upgrade_create_empty_extension() creates the 
> extensions without populating any table data. The extensions' CREATE 
> EXTENSION scripts never get executed so any INSERTs are skipped. As a 
> consequence, if any CREATE TABLE statement in $newcluster requires 
> validation against these empty config tables, the upgrade fails. As an 
> example,
> PostGIS registers config table spatial_ref_sys to hold ~8500 spatial 
> reference system definitions (SRIDs). When a table has, e.g. a 
> geometry column that specifies an SRID, this gets validated during the 
> CREATE TABLE:
>
> CREATE TABLE points (id int, location geometry(Point, 27700));
> ERROR:  Cannot find SRID (27700) in spatial_ref_sys
>
> This will happen for any SRID-constrained column, which will prevent 
> many real-world PostGIS deployments from being able to pg_upgrade. To 
> summarise the problem, our ordering is wrong here because extension 
> configuration data must be present before user tables that depend on 
> it get created, but --schema-only strips this data.
>
> The patch:
> ==========
> We are adding a new dump object type DO_EXTENSION_DATA that dumps 
> extension config table data in SECTION_PRE_DATA during 
> --binary-upgrade ONLY. This restores the needed data between extension 
> creation and user object creation, allowing the DDL to succeed.
>
> Four files are modified in bin/pg_dump:
>
> pg_dump.h:
> Add DO_EXTENSION_DATA to the DumpableObjectType enum, between 
> DO_EXTENSION and DO_TYPE
>
> pg_dump_sort.c:
> Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE
>
> pg_dump.c:
> 1. Add makeExtensionDataInfo() to create a TableDataInfo with objType 
> = DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during 
> --binary-upgrade ONLY. As it depends on the table def, the COPY will 
> be emitted after the CREATE TABLE.
> 2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with 
> description "EXTENSION DATA" using dumpTableData_copy(). This allows 
> the config table data to go into the schema-only dump.
> 3. In processExtensionTables(), when dopt->binary_upgrade is true, 
> call makeExtensionDataInfo() instead of makeTableDataInfo(). 
> Additionally, skip extcondition filter because we need to dump all 
> rows here.
> 4. Include DO_EXTENSION_DATA in pre-data boundary in 
> addBoundaryDependencies()
>
> pg_backup_archiver.c:
> Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar 
> to BLOB, BLOB METADATA, etc. to include extension config table data in 
> --schema-only dumps during --binary-upgrade ONLY.
>
> What ends up happening:
> =======================
> The inserted rows are basically scaffolding to allow the upgrade, and 
> do not persist. The pg_upgrade sequence goes like:
> 1. pg_dump includes $oldcluster extension config data in schema-only dump
> 2. pg_restore replays the dump into $newcluster and "EXTENSION DATA" 
> entries populate tables like spatial_ref_sys with COPY. Subsequent 
> CREATE TABLEs with e.g. SRID-constrained columns pass validation.
> 3. pg_upgrade transfers all data files from $oldcluster to 
> $newcluster, making spatial_ref_sys byte-for-byte identical to its 
> previous state.
>
> This patch:
> 1. Does NOT affect normal pg_dumps (without --binary-upgrade). 
> DO_EXTENSION_DATA objects are not created in this case.
> 2. Leaves binary_upgrade_create_empty_extension() unchanged.
> 3. Is not PostGIS-specific, and should solve this class of problem for 
> any extension that registers config tables that will be needed for DDL 
> validation.
> 4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster 
> PostGIS 3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on 
> PG19-devel.
>
> Thanks in advance for your review! Please find attached the patch for 
> HEAD. I believe this should be easily backpatchable to (at least) 
> PG15, and will be happy to work on backports.


Hi, Jimmy.

First, as you probably know, we don't backpatch features, and I think 
this comes into that category. Unfortunately, we're about to close 
release 19 for features, so this would need to wait till release 20.

The patch didn't include any tests. It will need them (probably in 
src/test/modules/test_pg_dump)

There appears to be a lot of code duplication between 
dumpExtensionData() and dumpTableData(). It might be better to refactor 
that, perhaps by supplying an extra flag to dumpTableData().

Do make sure to add a Commitfest entry for this is you haven't already 
done so.


cheers


andrew







>
> Best regards,
> Jimmy

--
Andrew Dunstan
EDB: https://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 3+ messages in thread

* Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade
  2026-03-20 17:47 [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Jimmy Angelakos <[email protected]>
  2026-03-29 18:34 ` Re: [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Andrew Dunstan <[email protected]>
@ 2026-04-06 19:33   ` Jimmy Angelakos <[email protected]>
  0 siblings, 0 replies; 3+ messages in thread

From: Jimmy Angelakos @ 2026-04-06 19:33 UTC (permalink / raw)
  To: Andrew Dunstan <[email protected]>; +Cc: pgsql-hackers

Hi Andrew,

Thanks for your review!

My opinion is that this is a bugfix rather than a feature: Rather than
adding new capability it's fixing pg_upgrade's behaviour, because it
currently fails in the described scenario (SRID-constrained columns). The
new code path isn't user facing and it only fires during pg_upgrade's
internal use of pg_dump --binary-upgrade.

However, I will defer to the committers' judgement on whether this should
be included in PG19 and backpatched.

To address your feedback, please find attached v2 which:
1. Removes dumpExtensionData() and adds the handling for EXTENSION DATA
object type to dumpTableData()
2. Adds test in test_pg_dump: we insert a row into the dumpable extension
table, and we expect that the COPY appears in --binary-upgrade dumps.

I have also added a commitfest entry.

Thanks again,
Jimmy


On Sun, Mar 29, 2026 at 7:34 PM Andrew Dunstan <[email protected]> wrote:

>
> On 2026-03-20 Fr 1:47 PM, Jimmy Angelakos wrote:
> > Hi All,
> >
> > I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is
> > my first code patch, so any feedback on the approach will be appreciated!
> >
> > The problem:
> > ============
> > pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the
> > schema from $oldcluster to $newcluster. Because this excludes all
> > table data, it leaves out data in extension config tables registered
> > with pg_extension_config_dump().
> >
> > In $newcluster, binary_upgrade_create_empty_extension() creates the
> > extensions without populating any table data. The extensions' CREATE
> > EXTENSION scripts never get executed so any INSERTs are skipped. As a
> > consequence, if any CREATE TABLE statement in $newcluster requires
> > validation against these empty config tables, the upgrade fails. As an
> > example,
> > PostGIS registers config table spatial_ref_sys to hold ~8500 spatial
> > reference system definitions (SRIDs). When a table has, e.g. a
> > geometry column that specifies an SRID, this gets validated during the
> > CREATE TABLE:
> >
> > CREATE TABLE points (id int, location geometry(Point, 27700));
> > ERROR:  Cannot find SRID (27700) in spatial_ref_sys
> >
> > This will happen for any SRID-constrained column, which will prevent
> > many real-world PostGIS deployments from being able to pg_upgrade. To
> > summarise the problem, our ordering is wrong here because extension
> > configuration data must be present before user tables that depend on
> > it get created, but --schema-only strips this data.
> >
> > The patch:
> > ==========
> > We are adding a new dump object type DO_EXTENSION_DATA that dumps
> > extension config table data in SECTION_PRE_DATA during
> > --binary-upgrade ONLY. This restores the needed data between extension
> > creation and user object creation, allowing the DDL to succeed.
> >
> > Four files are modified in bin/pg_dump:
> >
> > pg_dump.h:
> > Add DO_EXTENSION_DATA to the DumpableObjectType enum, between
> > DO_EXTENSION and DO_TYPE
> >
> > pg_dump_sort.c:
> > Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE
> >
> > pg_dump.c:
> > 1. Add makeExtensionDataInfo() to create a TableDataInfo with objType
> > = DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during
> > --binary-upgrade ONLY. As it depends on the table def, the COPY will
> > be emitted after the CREATE TABLE.
> > 2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with
> > description "EXTENSION DATA" using dumpTableData_copy(). This allows
> > the config table data to go into the schema-only dump.
> > 3. In processExtensionTables(), when dopt->binary_upgrade is true,
> > call makeExtensionDataInfo() instead of makeTableDataInfo().
> > Additionally, skip extcondition filter because we need to dump all
> > rows here.
> > 4. Include DO_EXTENSION_DATA in pre-data boundary in
> > addBoundaryDependencies()
> >
> > pg_backup_archiver.c:
> > Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar
> > to BLOB, BLOB METADATA, etc. to include extension config table data in
> > --schema-only dumps during --binary-upgrade ONLY.
> >
> > What ends up happening:
> > =======================
> > The inserted rows are basically scaffolding to allow the upgrade, and
> > do not persist. The pg_upgrade sequence goes like:
> > 1. pg_dump includes $oldcluster extension config data in schema-only dump
> > 2. pg_restore replays the dump into $newcluster and "EXTENSION DATA"
> > entries populate tables like spatial_ref_sys with COPY. Subsequent
> > CREATE TABLEs with e.g. SRID-constrained columns pass validation.
> > 3. pg_upgrade transfers all data files from $oldcluster to
> > $newcluster, making spatial_ref_sys byte-for-byte identical to its
> > previous state.
> >
> > This patch:
> > 1. Does NOT affect normal pg_dumps (without --binary-upgrade).
> > DO_EXTENSION_DATA objects are not created in this case.
> > 2. Leaves binary_upgrade_create_empty_extension() unchanged.
> > 3. Is not PostGIS-specific, and should solve this class of problem for
> > any extension that registers config tables that will be needed for DDL
> > validation.
> > 4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster
> > PostGIS 3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on
> > PG19-devel.
> >
> > Thanks in advance for your review! Please find attached the patch for
> > HEAD. I believe this should be easily backpatchable to (at least)
> > PG15, and will be happy to work on backports.
>
>
> Hi, Jimmy.
>
> First, as you probably know, we don't backpatch features, and I think
> this comes into that category. Unfortunately, we're about to close
> release 19 for features, so this would need to wait till release 20.
>
> The patch didn't include any tests. It will need them (probably in
> src/test/modules/test_pg_dump)
>
> There appears to be a lot of code duplication between
> dumpExtensionData() and dumpTableData(). It might be better to refactor
> that, perhaps by supplying an extra flag to dumpTableData().
>
> Do make sure to add a Commitfest entry for this is you haven't already
> done so.
>
>
> cheers
>
>
> andrew
>
>
>
>
>
>
>
> >
> > Best regards,
> > Jimmy
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

-- 

Jimmy Angelakos

Staff Software Engineer

[email protected]

pgEdge.com <http://pgedge.com/;


Attachments:

  [image/png] noname (41.7K, 3-noname)
  download | view image

  [text/x-patch] v2-0001-pg_dump-Restore-extension-config-table-data-befor.patch (10.2K, 4-v2-0001-pg_dump-Restore-extension-config-table-data-befor.patch)
  download | inline diff:
From 46ab846425e8ae62f3cc4dec6b88799c217a20f8 Mon Sep 17 00:00:00 2001
From: Jimmy Angelakos <[email protected]>
Date: Fri, 20 Mar 2026 16:04:45 +0000
Subject: [PATCH v2] pg_dump: Restore extension config table data before user
 objects during binary upgrade

pg_upgrade uses pg_dump --schema-only --binary-upgrade, which excludes
all table data including extension configuration tables registered via
pg_extension_config_dump(). Since binary_upgrade_create_empty_extension()
does not populate these tables, any user table whose CREATE TABLE
triggers validation against config data will fail.

For example, PostGIS tables with SRID-constrained geometry/geography
columns fail because spatial_ref_sys is empty during schema restore.

Fix by introducing a new dump object type DO_EXTENSION_DATA that dumps
extension config table data into SECTION_PRE_DATA during binary upgrade.
This puts the data restore between extension creation and user object
creation, allowing DDL-time validation to succeed. The data is
scaffolding: it is overwritten when pg_upgrade transfers the old
cluster's data files to the new cluster.

This is not PostGIS-specific and applies to any extension that registers
config tables via pg_extension_config_dump() where that data is needed
for DDL-time validation.
---
 src/bin/pg_dump/pg_backup_archiver.c          |  2 +
 src/bin/pg_dump/pg_dump.c                     | 82 ++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h                     |  1 +
 src/bin/pg_dump/pg_dump_sort.c                |  7 ++
 src/test/modules/test_pg_dump/t/001_base.pl   |  1 -
 .../test_pg_dump/test_pg_dump--1.0.sql        |  1 +
 6 files changed, 89 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index fecf6f2d1ce..8b2acbe4936 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -3310,6 +3310,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		 */
 		if (strcmp(te->desc, "SEQUENCE SET") == 0 ||
 			strcmp(te->desc, "BLOB") == 0 ||
+			strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			strcmp(te->desc, "BLOB METADATA") == 0 ||
 			(strcmp(te->desc, "ACL") == 0 &&
 			 strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
@@ -3351,6 +3352,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
 			!(ropt->binary_upgrade &&
 			  (strcmp(te->desc, "BLOB") == 0 ||
+			   strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			   strcmp(te->desc, "BLOB METADATA") == 0 ||
 			   (strcmp(te->desc, "ACL") == 0 &&
 				strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index d34240073bb..e3cd4c8e8d4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -352,6 +352,7 @@ static void addConstrChildIdxDeps(DumpableObject *dobj, const IndxInfo *refidx);
 static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo);
 static void getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, char relkind);
 static void makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
+static void makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
 static void buildMatViewRefreshDependencies(Archive *fout);
 static void getTableDataFKConstraints(void);
 static void determineNotNullFlags(Archive *fout, PGresult *res, int r,
@@ -2864,6 +2865,8 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo)
 	char	   *tdDefn = NULL;
 	char	   *copyStmt;
 	const char *copyFrom;
+	const char *description = "TABLE DATA";
+	teSection	section = SECTION_DATA;
 
 	/* We had better have loaded per-column details about this table */
 	Assert(tbinfo->interesting);
@@ -2910,6 +2913,16 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo)
 		copyStmt = NULL;
 	}
 
+	/*
+	 * Extension config table data goes into SECTION_PRE_DATA so it is
+	 * available before user tables that may need it for validation.
+	 */
+	if (tdinfo->dobj.objType == DO_EXTENSION_DATA)
+	{
+		description = "EXTENSION DATA";
+		section = SECTION_PRE_DATA;
+	}
+
 	/*
 	 * Note: although the TableDataInfo is a full DumpableObject, we treat its
 	 * dependency on its table as "special" and pass it to ArchiveEntry now.
@@ -2923,8 +2936,8 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo)
 						  ARCHIVE_OPTS(.tag = tbinfo->dobj.name,
 									   .namespace = tbinfo->dobj.namespace->dobj.name,
 									   .owner = tbinfo->rolname,
-									   .description = "TABLE DATA",
-									   .section = SECTION_DATA,
+									   .description = description,
+									   .section = section,
 									   .createStmt = tdDefn,
 									   .copyStmt = copyStmt,
 									   .deps = &(tbinfo->dobj.dumpId),
@@ -3105,6 +3118,48 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
 	tbinfo->interesting = true;
 }
 
+/*
+ * makeExtensionDataInfo --- create TableDataInfo for extension config table
+ *
+ * This is used during binary upgrades to ensure extension configuration
+ * table data is dumped early (before user tables that may depend on it).
+ * For example, PostGIS's spatial_ref_sys must be populated before any
+ * table with geometry(Point, 27700) can be created due to SRID validation.
+ */
+static void
+makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
+{
+	TableDataInfo *tdinfo;
+
+	/* Already have a data object? */
+	if (tbinfo->dataObj != NULL)
+		return;
+
+	/*
+	 * Caller ensures that this is only called for RELKIND_RELATION.
+	 */
+
+	/* OK, create the data object */
+	tdinfo = (TableDataInfo *) pg_malloc(sizeof(TableDataInfo));
+
+	tdinfo->dobj.objType = DO_EXTENSION_DATA;
+
+	tdinfo->dobj.catId.tableoid = 0;
+	tdinfo->dobj.catId.oid = tbinfo->dobj.catId.oid;
+	AssignDumpId(&tdinfo->dobj);
+	tdinfo->dobj.name = tbinfo->dobj.name;
+	tdinfo->dobj.namespace = tbinfo->dobj.namespace;
+	tdinfo->tdtable = tbinfo;
+	tdinfo->filtercond = NULL;
+	addObjectDependency(&tdinfo->dobj, tbinfo->dobj.dumpId);
+
+	/* Mark that this object contains data */
+	tdinfo->dobj.components |= DUMP_COMPONENT_DATA;
+
+	tbinfo->dataObj = tdinfo;
+	tbinfo->interesting = true;
+}
+
 /*
  * The refresh for a materialized view must be dependent on the refresh for
  * any materialized view that this one is dependent on.
@@ -11838,6 +11893,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_EXTENSION:
 			dumpExtension(fout, (const ExtensionInfo *) dobj);
 			break;
+		case DO_EXTENSION_DATA:
+			dumpTableData(fout, (const TableDataInfo *) dobj);
+			break;
 		case DO_TYPE:
 			dumpType(fout, (const TypeInfo *) dobj);
 			break;
@@ -20393,10 +20451,25 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
 
 				if (dumpobj)
 				{
-					makeTableDataInfo(dopt, configtbl);
+					/*
+					 * For binary upgrades, dump extension config table data
+					 * before user tables are created so it's available for
+					 * validation (e.g. PostGIS SRIDs).
+					 */
+					if (dopt->binary_upgrade &&
+						configtbl->relkind == RELKIND_RELATION)
+						makeExtensionDataInfo(dopt, configtbl);
+					else
+						makeTableDataInfo(dopt, configtbl);
 					if (configtbl->dataObj != NULL)
 					{
-						if (strlen(extconditionarray[j]) > 0)
+						/*
+						 * For binary upgrade (DO_EXTENSION_DATA), don't apply
+						 * the filter condition - we need ALL data since the
+						 * extension won't populate built-in data in binary
+						 * upgrade mode.
+						 */
+						if (strlen(extconditionarray[j]) > 0 && !dopt->binary_upgrade)
 							configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]);
 					}
 				}
@@ -20674,6 +20747,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 		{
 			case DO_NAMESPACE:
 			case DO_EXTENSION:
+			case DO_EXTENSION_DATA:
 			case DO_TYPE:
 			case DO_SHELL_TYPE:
 			case DO_FUNC:
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 5a6726d8b12..1e8bb961e8d 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -40,6 +40,7 @@ typedef enum
 	/* When modifying this enum, update priority tables in pg_dump_sort.c! */
 	DO_NAMESPACE,
 	DO_EXTENSION,
+	DO_EXTENSION_DATA,			/* extension config table data for binary upgrade */
 	DO_TYPE,
 	DO_SHELL_TYPE,
 	DO_FUNC,
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 03e5c1c1116..3cced9c27be 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -58,6 +58,7 @@ enum dbObjectTypePriorities
 	PRIO_COLLATION,
 	PRIO_TRANSFORM,
 	PRIO_EXTENSION,
+	PRIO_EXTENSION_DATA,		/* ext config data: used for binary upgrade */
 	PRIO_TYPE,					/* used for DO_TYPE and DO_SHELL_TYPE */
 	PRIO_CAST,
 	PRIO_FUNC,
@@ -106,6 +107,7 @@ static const int dbObjectTypePriority[] =
 {
 	[DO_NAMESPACE] = PRIO_NAMESPACE,
 	[DO_EXTENSION] = PRIO_EXTENSION,
+	[DO_EXTENSION_DATA] = PRIO_EXTENSION_DATA,
 	[DO_TYPE] = PRIO_TYPE,
 	[DO_SHELL_TYPE] = PRIO_TYPE,
 	[DO_FUNC] = PRIO_FUNC,
@@ -1525,6 +1527,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "EXTENSION %s  (ID %d OID %u)",
 					 obj->name, obj->dumpId, obj->catId.oid);
 			return;
+		case DO_EXTENSION_DATA:
+			snprintf(buf, bufsize,
+					 "EXTENSION DATA %s  (ID %d OID %u)",
+					 obj->name, obj->dumpId, obj->catId.oid);
+			return;
 		case DO_TYPE:
 			snprintf(buf, bufsize,
 					 "TYPE %s  (ID %d OID %u)",
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index 3d65ce4497a..db1cb22a13b 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -515,7 +515,6 @@ my %tests = (
 			extension_schema => 1,
 		},
 		unlike => {
-			binary_upgrade => 1,
 			exclude_table => 1,
 			exclude_extension => 1,
 			exclude_extension_filter => 1,
diff --git a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
index 1c68e146d91..134743e3943 100644
--- a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
+++ b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
@@ -18,6 +18,7 @@ CREATE TABLE regress_table_dumpable (
 	col1 int check (col1 > 0)
 );
 SELECT pg_catalog.pg_extension_config_dump('regress_table_dumpable', '');
+INSERT INTO regress_table_dumpable VALUES (27700);
 GRANT SELECT ON regress_table_dumpable TO public;
 
 CREATE SCHEMA regress_pg_dump_schema;
-- 
2.51.0



^ permalink  raw  reply  [nested|flat] 3+ messages in thread


end of thread, other threads:[~2026-04-06 19:33 UTC | newest]

Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-03-20 17:47 [PATCH] pg_dump: Restore extension config table data before user objects during pg_upgrade Jimmy Angelakos <[email protected]>
2026-03-29 18:34 ` Andrew Dunstan <[email protected]>
2026-04-06 19:33   ` Jimmy Angelakos <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox