public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Return pg_control from pg_backup_stop().
13+ messages / 4 participants
[nested] [flat]

* Re: Return pg_control from pg_backup_stop().
@ 2026-02-20 03:10 David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: David Steele @ 2026-02-20 03:10 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: pgsql-hackers

On 8/7/25 05:30, David Steele wrote:
> On 1/24/25 13:43, David Steele wrote:
>>
>> Rebased and improved a comment and an error.
> Rebased to fix breakage caused by the split of func.sgml in 4e23c9e.


Rebased to implement simplification added by "Simplify creation of 
built-in functions with default arguments" (759b03b2).

Regards,
-David






^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-02-20 05:47 ` David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: David Steele @ 2026-02-20 05:47 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: pgsql-hackers

On 2/20/26 10:10, David Steele wrote:
> On 8/7/25 05:30, David Steele wrote:
>> On 1/24/25 13:43, David Steele wrote:
>>>
>>> Rebased and improved a comment and an error.
>> Rebased to fix breakage caused by the split of func.sgml in 4e23c9e.
> 
> 
> Rebased to implement simplification added by "Simplify creation of 
> built-in functions with default arguments" (759b03b2).

With the patches this time!

Regards,
-David
From f4043472a7acb4c74566ac17026d0e4b6245aa58 Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 20 Feb 2026 02:59:52 +0000
Subject: Add pg_control flag to prevent recovery without backup_label.

Harden recovery by adding a flag to pg_control to indicate that backup_label is
required. This prevents the user from deleting backup_label resulting in an
inconsistent recovery.

Another advantage is that the copy of pg_control used by pg_basebackup is
guaranteed not to be torn.

This functionality is limited to pg_basebackup (or any software comfortable
with modifying pg_control).

Control and catalog version bumps are required.
---
 doc/src/sgml/func/func-info.sgml          |  5 +++++
 src/backend/access/transam/xlog.c         | 23 +++++++++++++++++++++++
 src/backend/access/transam/xlogrecovery.c | 10 +++++++++-
 src/backend/backup/basebackup.c           | 15 ++++++---------
 src/backend/utils/misc/pg_controldata.c   |  7 +++++--
 src/bin/pg_controldata/pg_controldata.c   |  2 ++
 src/bin/pg_resetwal/pg_resetwal.c         |  1 +
 src/bin/pg_rewind/pg_rewind.c             |  1 +
 src/include/access/xlog.h                 |  1 +
 src/include/catalog/pg_control.h          |  4 ++++
 src/include/catalog/pg_proc.dat           |  6 +++---
 src/test/recovery/t/002_archiving.pl      | 20 ++++++++++++++++++++
 12 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index 294f45e82a3..2e78d28221c 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -3646,6 +3646,11 @@ acl      | {postgres=arwdDxtm/postgres,foo=r/postgres}
        <entry><type>boolean</type></entry>
       </row>
 
+      <row>
+       <entry><structfield>backup_label_required</structfield></entry>
+       <entry><type>boolean</type></entry>
+      </row>
+
      </tbody>
     </tgroup>
    </table>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13cce9b49f1..babb76b4710 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9586,6 +9586,29 @@ do_pg_abort_backup(int code, Datum arg)
 	}
 }
 
+/*
+ * Create a consistent copy of control data to be used for backup and update it
+ * to require a backup label for recovery. Also recalculate the CRC.
+ */
+void
+backup_control_file(uint8_t *controlFile)
+{
+	ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+	memset(controlFile + sizeof(ControlFileData), 0,
+		   PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+	LWLockAcquire(ControlFileLock, LW_SHARED);
+	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+	LWLockRelease(ControlFileLock);
+
+	controlData->backupLabelRequired = true;
+
+	INIT_CRC32C(controlData->crc);
+	COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+	FIN_CRC32C(controlData->crc);
+}
+
 /*
  * Register a handler that will warn about unterminated backups at end of
  * session, unless this has already been done.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c0c2744d45b..18a0db611e6 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -710,7 +710,14 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
-		/* No backup_label file has been found if we are here. */
+		/*
+		 * No backup_label file has been found if we are here. Error if the
+		 * control file requires backup_label.
+		 */
+		if (ControlFile->backupLabelRequired)
+			ereport(FATAL,
+					(errmsg("could not find backup_label required for recovery"),
+					 errhint("backup_label must be present for recovery to proceed")));
 
 		/*
 		 * If tablespace_map file is present without backup_label file, there
@@ -995,6 +1002,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			ControlFile->backupStartPoint = checkPoint.redo;
 			ControlFile->backupEndRequired = backupEndRequired;
+			ControlFile->backupLabelRequired = false;
 
 			if (backupFromStandby)
 			{
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2d74c648335..f54c8793090 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -23,6 +23,7 @@
 #include "backup/basebackup_incremental.h"
 #include "backup/basebackup_sink.h"
 #include "backup/basebackup_target.h"
+#include "catalog/pg_control.h"
 #include "catalog/pg_tablespace_d.h"
 #include "commands/defrem.h"
 #include "common/compression.h"
@@ -331,9 +332,9 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 
 			if (ti->path == NULL)
 			{
-				struct stat statbuf;
 				bool		sendtblspclinks = true;
 				char	   *backup_label;
+				uint8_t controlFile[PG_CONTROL_FILE_SIZE];
 
 				bbsink_begin_archive(sink, "base.tar");
 
@@ -356,14 +357,10 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 						sendtblspclinks, &manifest, InvalidOid, ib);
 
 				/* ... and pg_control after everything else. */
-				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
-					ereport(ERROR,
-							(errcode_for_file_access(),
-							 errmsg("could not stat file \"%s\": %m",
-									XLOG_CONTROL_FILE)));
-				sendFile(sink, XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf,
-						 false, InvalidOid, InvalidOid,
-						 InvalidRelFileNumber, 0, &manifest, 0, NULL, 0);
+				backup_control_file(controlFile);
+				sendFileWithContent(sink, XLOG_CONTROL_FILE,
+									(char *)controlFile, PG_CONTROL_FILE_SIZE,
+									&manifest);
 			}
 			else
 			{
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index c6d9cbb1577..c2c19eb77df 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -162,8 +162,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
 Datum
 pg_control_recovery(PG_FUNCTION_ARGS)
 {
-	Datum		values[5];
-	bool		nulls[5];
+	Datum		values[6];
+	bool		nulls[6];
 	TupleDesc	tupdesc;
 	HeapTuple	htup;
 	ControlFileData *ControlFile;
@@ -195,6 +195,9 @@ pg_control_recovery(PG_FUNCTION_ARGS)
 	values[4] = BoolGetDatum(ControlFile->backupEndRequired);
 	nulls[4] = false;
 
+	values[5] = BoolGetDatum(ControlFile->backupLabelRequired);
+	nulls[5] = false;
+
 	htup = heap_form_tuple(tupdesc, values, nulls);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index a4060309ae0..5919dd58fed 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -301,6 +301,8 @@ main(int argc, char *argv[])
 		   LSN_FORMAT_ARGS(ControlFile->backupEndPoint));
 	printf(_("End-of-backup record required:        %s\n"),
 		   ControlFile->backupEndRequired ? _("yes") : _("no"));
+	printf(_("Backup label required:                %s\n"),
+		   ControlFile->backupLabelRequired ? _("yes") : _("no"));
 	printf(_("wal_level setting:                    %s\n"),
 		   wal_level_str(ControlFile->wal_level));
 	printf(_("wal_log_hints setting:                %s\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 85dc43d4cdb..5eddebe27f2 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -918,6 +918,7 @@ RewriteControlFile(void)
 	ControlFile.backupStartPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndRequired = false;
+	ControlFile.backupLabelRequired = false;
 
 	/*
 	 * Force the defaults for max_* settings. The values don't really matter
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index d0aafd7e7a6..d5b49b2a26b 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -736,6 +736,7 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
 	ControlFile_new.minRecoveryPoint = endrec;
 	ControlFile_new.minRecoveryPointTLI = endtli;
 	ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
+	ControlFile_new.backupLabelRequired = true;
 	if (!dry_run)
 		update_controlfile(datadir_target, &ControlFile_new, do_sync);
 }
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index fdfb572467b..84ec5aabdfd 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -311,6 +311,7 @@ extern void do_pg_backup_start(const char *backupidstr, bool fast,
 							   StringInfo tblspcmapfile);
 extern void do_pg_backup_stop(BackupState *state, bool waitforarchive);
 extern void do_pg_abort_backup(int code, Datum arg);
+extern void backup_control_file(uint8_t *controlFile);
 extern void register_persistent_abort_backup_handler(void);
 extern SessionBackupState get_backup_status(void);
 
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 7503db1af51..26618162a7b 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -166,12 +166,16 @@ typedef struct ControlFileData
 	 * If backupEndRequired is true, we know for sure that we're restoring
 	 * from a backup, and must see a backup-end record before we can safely
 	 * start up.
+	 *
+	 * If backupLabelRequired is true, then a backup_label file must be
+	 * present in order for recovery to proceed.
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
+	bool		backupLabelRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index dac40992cbc..3c368433615 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12437,9 +12437,9 @@
 { oid => '3443',
   descr => 'pg_controldata recovery state information as a function',
   proname => 'pg_control_recovery', provolatile => 'v', prorettype => 'record',
-  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool}',
-  proargmodes => '{o,o,o,o,o}',
-  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required}',
+  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o}',
+  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required,backup_label_required}',
   prosrc => 'pg_control_recovery' },
 
 { oid => '3444',
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index aa40f58e6d6..9963d13473e 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -41,6 +41,26 @@ $node_standby->append_conf(
 archive_cleanup_command = 'echo archive_cleanup_done > $archive_cleanup_command_file'
 recovery_end_command = 'echo recovery_ended_done > $recovery_end_command_file'
 ));
+
+# Rename backup_label to verify that recovery will not start without it
+rename("$data_dir/backup_label", "$data_dir/backup_label.tmp")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_standby->data_dir, '-l',
+		$node_standby->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_standby->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+# Restore backup_label so recovery proceeds normally
+rename("$data_dir/backup_label.tmp", "$data_dir/backup_label")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
 $node_standby->start;
 
 # Create some content on primary
-- 
2.34.1


From e654c44aac81c2aa5b70b03a62067b0caedcae1c Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 20 Feb 2026 02:59:52 +0000
Subject: Return pg_control from pg_backup_stop().

Harden recovery by returning a copy of pg_control from pg_backup_stop() that has
a flag set to prevent recovery if the backup_label file is missing. Instead of
backup software copying pg_control from PGDATA, it stores an updated version
that is returned from pg_backup_stop(). This is better for the following
reasons:

* The user can no longer remove backup_label and get what looks like a
successful recovery (while almost certainly causing corruption). If backup_label
is removed the cluster will not start. The user may try pg_resetwal, but that
tool makes it pretty clear that corruption will result from its use.

* We don't need to worry about backup software seeing a torn copy of pg_control,
since Postgres can safely read it out of memory and provide a valid copy via
pg_backup_stop(). This solves torn reads without needing to write pg_control via
a temp file, which may affect performance on a standby.

* For backup from standby, we no longer need to instruct the backup software to
copy pg_control last. In fact the backup software should not copy pg_control from
PGDATA at all.

These changes have no impact on current backup software and they are free to use
the pg_control available from pg_stop_backup() or continue to use pg_control from
PGDATA. Of course they will miss the benefits of getting a consistent copy of
pg_control and the backup_label checking, but will be no worse off than before.

Catalog version bump is required.
---
 doc/src/sgml/backup.sgml                    | 18 +++++-
 src/backend/access/transam/xlogfuncs.c      | 20 ++++--
 src/include/catalog/pg_proc.dat             |  4 +-
 src/test/recovery/t/042_low_level_backup.pl | 67 ++++++++++++++++++++-
 4 files changed, 97 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 168444eccc5..67974b7b1b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1027,9 +1027,12 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
      values. The second of these fields should be written to a file named
      <filename>backup_label</filename> in the root directory of the backup. The
      third field should be written to a file named
-     <filename>tablespace_map</filename> unless the field is empty. These files are
+     <filename>tablespace_map</filename> unless the field is empty. The fourth
+     field should be written into a file named
+     <filename>global/pg_control</filename> (overwriting the existing file when
+     present). These files are
      vital to the backup working and must be written byte for byte without
-     modification, which may require opening the file in binary mode.
+     modification, which will require opening the file in binary mode.
     </para>
    </listitem>
    <listitem>
@@ -1101,7 +1104,16 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
    </para>
 
    <para>
-    You should, however, omit from the backup the files within the
+    You should exclude <filename>global/pg_control</filename> from your backup
+    and put the contents of the <parameter>controlfile</parameter> column
+    returned from <function>pg_backup_stop</function> in your backup at
+    <filename>global/pg_control</filename>. This version of pg_control contains
+    safeguards against recovery without backup_label present and is guaranteed
+    not to be torn.
+   </para>
+
+   <para>
+    You should also omit from the backup the files within the
     cluster's <filename>pg_wal/</filename> subdirectory.  This
     slight adjustment is worthwhile because it reduces the risk
     of mistakes when restoring.  This is easy to arrange if
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..e7e7c0b18e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -113,9 +113,11 @@ pg_backup_start(PG_FUNCTION_ARGS)
  *
  * The backup_label contains the user-supplied label string (typically this
  * would be used to tell where the backup dump will be stored), the starting
- * time, starting WAL location for the dump and so on.  It is the caller's
- * responsibility to write the backup_label and tablespace_map files in the
- * data folder that will be restored from this backup.
+ * time, starting WAL location for the dump and so on.  The pg_control file
+ * contains a consistent copy of pg_control that also has a safeguard against
+ * being used without backup_label.  It is the caller's responsibility to write
+ * the backup_label, pg_control, and tablespace_map files in the data folder
+ * that will be restored from this backup.
  *
  * Permission checking for this function is managed through the normal
  * GRANT system.
@@ -123,12 +125,14 @@ pg_backup_start(PG_FUNCTION_ARGS)
 Datum
 pg_backup_stop(PG_FUNCTION_ARGS)
 {
-#define PG_BACKUP_STOP_V2_COLS 3
+#define PG_BACKUP_STOP_V2_COLS 4
 	TupleDesc	tupdesc;
 	Datum		values[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		nulls[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		waitforarchive = PG_GETARG_BOOL(0);
 	char	   *backup_label;
+	bytea	   *pg_control_bytea;
+	uint8_t		pg_control[PG_CONTROL_FILE_SIZE];
 	SessionBackupState status = get_backup_status();
 
 	/* Initialize attributes information in the tuple descriptor */
@@ -150,9 +154,17 @@ pg_backup_stop(PG_FUNCTION_ARGS)
 	/* Build the contents of backup_label */
 	backup_label = build_backup_content(backup_state, false);
 
+	/* Build the contents of pg_control */
+	backup_control_file(pg_control);
+
+	pg_control_bytea = (bytea *) palloc(PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	SET_VARSIZE(pg_control_bytea, PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	memcpy(VARDATA(pg_control_bytea), pg_control, PG_CONTROL_FILE_SIZE);
+
 	values[0] = LSNGetDatum(backup_state->stoppoint);
 	values[1] = CStringGetTextDatum(backup_label);
 	values[2] = CStringGetTextDatum(tablespace_map->data);
+	values[3] = PointerGetDatum(pg_control_bytea);
 
 	/* Deallocate backup-related variables */
 	pfree(backup_label);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3c368433615..c89967613a5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6741,8 +6741,8 @@
 { oid => '2739', descr => 'finish taking an online backup',
   proname => 'pg_backup_stop', provolatile => 'v', proparallel => 'r',
   prorettype => 'record', proargtypes => 'bool',
-  proallargtypes => '{bool,pg_lsn,text,text}', proargmodes => '{i,o,o,o}',
-  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile}',
+  proallargtypes => '{bool,pg_lsn,text,text,bytea}', proargmodes => '{i,o,o,o,o}',
+  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile,controlfile}',
   proargdefaults => '{true}',
   prosrc => 'pg_backup_stop' },
 { oid => '3436', descr => 'promote standby server',
diff --git a/src/test/recovery/t/042_low_level_backup.pl b/src/test/recovery/t/042_low_level_backup.pl
index df4ae029fe6..34c6f4461af 100644
--- a/src/test/recovery/t/042_low_level_backup.pl
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -13,6 +13,42 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Decode hex to binary
+sub decode_hex
+{
+	my ($encoded) = @_;
+	my $decoded;
+
+	$encoded =~ s/^\s+|\s+$//g;
+
+	for (my $idx = 0; $idx < length($encoded); $idx += 2)
+	{
+		$decoded .= pack('C', hex(substr($encoded, $idx, 2)));
+	}
+
+	return $decoded;
+}
+
+# Get backup_label/pg_control from pg_stop_backup()
+sub stop_backup_result
+{
+	my ($psql) = @_;
+
+	my $encoded = $psql->query_safe(
+		"select encode(labelfile::bytea, 'hex') || ',' || " .
+		"       encode(controlfile, 'hex')" .
+		"  from pg_backup_stop()");
+
+	my @result;
+
+    foreach my $column (split(',', $encoded))
+	{
+		push(@result, decode_hex($column));
+	}
+
+	return @result;
+}
+
 # Start primary node with archiving.
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(has_archiving => 1, allows_streaming => 1);
@@ -80,8 +116,7 @@ my $stop_segment_name = $node_primary->safe_psql('postgres',
 	'SELECT pg_walfile_name(pg_current_wal_lsn())');
 
 # Stop backup and get backup_label, the last segment is archived.
-my $backup_label =
-  $psql->query_safe("select labelfile from pg_backup_stop()");
+(my $backup_label, my $pg_control) = stop_backup_result($psql);
 
 $psql->quit;
 
@@ -118,10 +153,36 @@ ok( $node_replica->log_contains(
 $node_replica->teardown_node;
 $node_replica->clean_node;
 
+# Save only pg_control into the backup to demonstrate that pg_control returned
+# from pg_stop_backup() will only perform recovery when backup_label is present.
+open(my $fh, ">", "$backup_dir/global/pg_control")
+  or die "could not open pg_control";
+binmode($fh);
+syswrite($fh, $pg_control);
+close($fh);
+
+$node_replica = PostgreSQL::Test::Cluster->new('replica_fail2');
+$node_replica->init_from_backup($node_primary, $backup_name,
+	has_restoring => 1);
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_replica->data_dir, '-l',
+		$node_replica->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_replica->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+$node_replica->teardown_node;
+$node_replica->clean_node;
+
 # Save backup_label into the backup directory and recover using the primary's
 # archive.  This time recovery will succeed and the canary table will be
 # present.
-open my $fh, ">>", "$backup_dir/backup_label"
+open $fh, ">>", "$backup_dir/backup_label"
   or die "could not open backup_label";
 # Binary mode is required for Windows, as the backup_label parsing is not
 # able to cope with CRLFs.
-- 
2.34.1



Attachments:

  [text/plain] pgcontrol-flag-v7-01-basebackup.patch (11.5K, 2-pgcontrol-flag-v7-01-basebackup.patch)
  download | inline diff:
From f4043472a7acb4c74566ac17026d0e4b6245aa58 Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 20 Feb 2026 02:59:52 +0000
Subject: Add pg_control flag to prevent recovery without backup_label.

Harden recovery by adding a flag to pg_control to indicate that backup_label is
required. This prevents the user from deleting backup_label resulting in an
inconsistent recovery.

Another advantage is that the copy of pg_control used by pg_basebackup is
guaranteed not to be torn.

This functionality is limited to pg_basebackup (or any software comfortable
with modifying pg_control).

Control and catalog version bumps are required.
---
 doc/src/sgml/func/func-info.sgml          |  5 +++++
 src/backend/access/transam/xlog.c         | 23 +++++++++++++++++++++++
 src/backend/access/transam/xlogrecovery.c | 10 +++++++++-
 src/backend/backup/basebackup.c           | 15 ++++++---------
 src/backend/utils/misc/pg_controldata.c   |  7 +++++--
 src/bin/pg_controldata/pg_controldata.c   |  2 ++
 src/bin/pg_resetwal/pg_resetwal.c         |  1 +
 src/bin/pg_rewind/pg_rewind.c             |  1 +
 src/include/access/xlog.h                 |  1 +
 src/include/catalog/pg_control.h          |  4 ++++
 src/include/catalog/pg_proc.dat           |  6 +++---
 src/test/recovery/t/002_archiving.pl      | 20 ++++++++++++++++++++
 12 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index 294f45e82a3..2e78d28221c 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -3646,6 +3646,11 @@ acl      | {postgres=arwdDxtm/postgres,foo=r/postgres}
        <entry><type>boolean</type></entry>
       </row>
 
+      <row>
+       <entry><structfield>backup_label_required</structfield></entry>
+       <entry><type>boolean</type></entry>
+      </row>
+
      </tbody>
     </tgroup>
    </table>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13cce9b49f1..babb76b4710 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9586,6 +9586,29 @@ do_pg_abort_backup(int code, Datum arg)
 	}
 }
 
+/*
+ * Create a consistent copy of control data to be used for backup and update it
+ * to require a backup label for recovery. Also recalculate the CRC.
+ */
+void
+backup_control_file(uint8_t *controlFile)
+{
+	ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+	memset(controlFile + sizeof(ControlFileData), 0,
+		   PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+	LWLockAcquire(ControlFileLock, LW_SHARED);
+	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+	LWLockRelease(ControlFileLock);
+
+	controlData->backupLabelRequired = true;
+
+	INIT_CRC32C(controlData->crc);
+	COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+	FIN_CRC32C(controlData->crc);
+}
+
 /*
  * Register a handler that will warn about unterminated backups at end of
  * session, unless this has already been done.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c0c2744d45b..18a0db611e6 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -710,7 +710,14 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
-		/* No backup_label file has been found if we are here. */
+		/*
+		 * No backup_label file has been found if we are here. Error if the
+		 * control file requires backup_label.
+		 */
+		if (ControlFile->backupLabelRequired)
+			ereport(FATAL,
+					(errmsg("could not find backup_label required for recovery"),
+					 errhint("backup_label must be present for recovery to proceed")));
 
 		/*
 		 * If tablespace_map file is present without backup_label file, there
@@ -995,6 +1002,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			ControlFile->backupStartPoint = checkPoint.redo;
 			ControlFile->backupEndRequired = backupEndRequired;
+			ControlFile->backupLabelRequired = false;
 
 			if (backupFromStandby)
 			{
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2d74c648335..f54c8793090 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -23,6 +23,7 @@
 #include "backup/basebackup_incremental.h"
 #include "backup/basebackup_sink.h"
 #include "backup/basebackup_target.h"
+#include "catalog/pg_control.h"
 #include "catalog/pg_tablespace_d.h"
 #include "commands/defrem.h"
 #include "common/compression.h"
@@ -331,9 +332,9 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 
 			if (ti->path == NULL)
 			{
-				struct stat statbuf;
 				bool		sendtblspclinks = true;
 				char	   *backup_label;
+				uint8_t controlFile[PG_CONTROL_FILE_SIZE];
 
 				bbsink_begin_archive(sink, "base.tar");
 
@@ -356,14 +357,10 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 						sendtblspclinks, &manifest, InvalidOid, ib);
 
 				/* ... and pg_control after everything else. */
-				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
-					ereport(ERROR,
-							(errcode_for_file_access(),
-							 errmsg("could not stat file \"%s\": %m",
-									XLOG_CONTROL_FILE)));
-				sendFile(sink, XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf,
-						 false, InvalidOid, InvalidOid,
-						 InvalidRelFileNumber, 0, &manifest, 0, NULL, 0);
+				backup_control_file(controlFile);
+				sendFileWithContent(sink, XLOG_CONTROL_FILE,
+									(char *)controlFile, PG_CONTROL_FILE_SIZE,
+									&manifest);
 			}
 			else
 			{
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index c6d9cbb1577..c2c19eb77df 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -162,8 +162,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
 Datum
 pg_control_recovery(PG_FUNCTION_ARGS)
 {
-	Datum		values[5];
-	bool		nulls[5];
+	Datum		values[6];
+	bool		nulls[6];
 	TupleDesc	tupdesc;
 	HeapTuple	htup;
 	ControlFileData *ControlFile;
@@ -195,6 +195,9 @@ pg_control_recovery(PG_FUNCTION_ARGS)
 	values[4] = BoolGetDatum(ControlFile->backupEndRequired);
 	nulls[4] = false;
 
+	values[5] = BoolGetDatum(ControlFile->backupLabelRequired);
+	nulls[5] = false;
+
 	htup = heap_form_tuple(tupdesc, values, nulls);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index a4060309ae0..5919dd58fed 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -301,6 +301,8 @@ main(int argc, char *argv[])
 		   LSN_FORMAT_ARGS(ControlFile->backupEndPoint));
 	printf(_("End-of-backup record required:        %s\n"),
 		   ControlFile->backupEndRequired ? _("yes") : _("no"));
+	printf(_("Backup label required:                %s\n"),
+		   ControlFile->backupLabelRequired ? _("yes") : _("no"));
 	printf(_("wal_level setting:                    %s\n"),
 		   wal_level_str(ControlFile->wal_level));
 	printf(_("wal_log_hints setting:                %s\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 85dc43d4cdb..5eddebe27f2 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -918,6 +918,7 @@ RewriteControlFile(void)
 	ControlFile.backupStartPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndRequired = false;
+	ControlFile.backupLabelRequired = false;
 
 	/*
 	 * Force the defaults for max_* settings. The values don't really matter
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index d0aafd7e7a6..d5b49b2a26b 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -736,6 +736,7 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
 	ControlFile_new.minRecoveryPoint = endrec;
 	ControlFile_new.minRecoveryPointTLI = endtli;
 	ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
+	ControlFile_new.backupLabelRequired = true;
 	if (!dry_run)
 		update_controlfile(datadir_target, &ControlFile_new, do_sync);
 }
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index fdfb572467b..84ec5aabdfd 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -311,6 +311,7 @@ extern void do_pg_backup_start(const char *backupidstr, bool fast,
 							   StringInfo tblspcmapfile);
 extern void do_pg_backup_stop(BackupState *state, bool waitforarchive);
 extern void do_pg_abort_backup(int code, Datum arg);
+extern void backup_control_file(uint8_t *controlFile);
 extern void register_persistent_abort_backup_handler(void);
 extern SessionBackupState get_backup_status(void);
 
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 7503db1af51..26618162a7b 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -166,12 +166,16 @@ typedef struct ControlFileData
 	 * If backupEndRequired is true, we know for sure that we're restoring
 	 * from a backup, and must see a backup-end record before we can safely
 	 * start up.
+	 *
+	 * If backupLabelRequired is true, then a backup_label file must be
+	 * present in order for recovery to proceed.
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
+	bool		backupLabelRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index dac40992cbc..3c368433615 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12437,9 +12437,9 @@
 { oid => '3443',
   descr => 'pg_controldata recovery state information as a function',
   proname => 'pg_control_recovery', provolatile => 'v', prorettype => 'record',
-  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool}',
-  proargmodes => '{o,o,o,o,o}',
-  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required}',
+  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o}',
+  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required,backup_label_required}',
   prosrc => 'pg_control_recovery' },
 
 { oid => '3444',
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index aa40f58e6d6..9963d13473e 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -41,6 +41,26 @@ $node_standby->append_conf(
 archive_cleanup_command = 'echo archive_cleanup_done > $archive_cleanup_command_file'
 recovery_end_command = 'echo recovery_ended_done > $recovery_end_command_file'
 ));
+
+# Rename backup_label to verify that recovery will not start without it
+rename("$data_dir/backup_label", "$data_dir/backup_label.tmp")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_standby->data_dir, '-l',
+		$node_standby->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_standby->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+# Restore backup_label so recovery proceeds normally
+rename("$data_dir/backup_label.tmp", "$data_dir/backup_label")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
 $node_standby->start;
 
 # Create some content on primary
-- 
2.34.1



  [text/plain] pgcontrol-flag-v7-02-sql.patch (9.7K, 3-pgcontrol-flag-v7-02-sql.patch)
  download | inline diff:
From e654c44aac81c2aa5b70b03a62067b0caedcae1c Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 20 Feb 2026 02:59:52 +0000
Subject: Return pg_control from pg_backup_stop().

Harden recovery by returning a copy of pg_control from pg_backup_stop() that has
a flag set to prevent recovery if the backup_label file is missing. Instead of
backup software copying pg_control from PGDATA, it stores an updated version
that is returned from pg_backup_stop(). This is better for the following
reasons:

* The user can no longer remove backup_label and get what looks like a
successful recovery (while almost certainly causing corruption). If backup_label
is removed the cluster will not start. The user may try pg_resetwal, but that
tool makes it pretty clear that corruption will result from its use.

* We don't need to worry about backup software seeing a torn copy of pg_control,
since Postgres can safely read it out of memory and provide a valid copy via
pg_backup_stop(). This solves torn reads without needing to write pg_control via
a temp file, which may affect performance on a standby.

* For backup from standby, we no longer need to instruct the backup software to
copy pg_control last. In fact the backup software should not copy pg_control from
PGDATA at all.

These changes have no impact on current backup software and they are free to use
the pg_control available from pg_stop_backup() or continue to use pg_control from
PGDATA. Of course they will miss the benefits of getting a consistent copy of
pg_control and the backup_label checking, but will be no worse off than before.

Catalog version bump is required.
---
 doc/src/sgml/backup.sgml                    | 18 +++++-
 src/backend/access/transam/xlogfuncs.c      | 20 ++++--
 src/include/catalog/pg_proc.dat             |  4 +-
 src/test/recovery/t/042_low_level_backup.pl | 67 ++++++++++++++++++++-
 4 files changed, 97 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 168444eccc5..67974b7b1b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1027,9 +1027,12 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
      values. The second of these fields should be written to a file named
      <filename>backup_label</filename> in the root directory of the backup. The
      third field should be written to a file named
-     <filename>tablespace_map</filename> unless the field is empty. These files are
+     <filename>tablespace_map</filename> unless the field is empty. The fourth
+     field should be written into a file named
+     <filename>global/pg_control</filename> (overwriting the existing file when
+     present). These files are
      vital to the backup working and must be written byte for byte without
-     modification, which may require opening the file in binary mode.
+     modification, which will require opening the file in binary mode.
     </para>
    </listitem>
    <listitem>
@@ -1101,7 +1104,16 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
    </para>
 
    <para>
-    You should, however, omit from the backup the files within the
+    You should exclude <filename>global/pg_control</filename> from your backup
+    and put the contents of the <parameter>controlfile</parameter> column
+    returned from <function>pg_backup_stop</function> in your backup at
+    <filename>global/pg_control</filename>. This version of pg_control contains
+    safeguards against recovery without backup_label present and is guaranteed
+    not to be torn.
+   </para>
+
+   <para>
+    You should also omit from the backup the files within the
     cluster's <filename>pg_wal/</filename> subdirectory.  This
     slight adjustment is worthwhile because it reduces the risk
     of mistakes when restoring.  This is easy to arrange if
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..e7e7c0b18e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -113,9 +113,11 @@ pg_backup_start(PG_FUNCTION_ARGS)
  *
  * The backup_label contains the user-supplied label string (typically this
  * would be used to tell where the backup dump will be stored), the starting
- * time, starting WAL location for the dump and so on.  It is the caller's
- * responsibility to write the backup_label and tablespace_map files in the
- * data folder that will be restored from this backup.
+ * time, starting WAL location for the dump and so on.  The pg_control file
+ * contains a consistent copy of pg_control that also has a safeguard against
+ * being used without backup_label.  It is the caller's responsibility to write
+ * the backup_label, pg_control, and tablespace_map files in the data folder
+ * that will be restored from this backup.
  *
  * Permission checking for this function is managed through the normal
  * GRANT system.
@@ -123,12 +125,14 @@ pg_backup_start(PG_FUNCTION_ARGS)
 Datum
 pg_backup_stop(PG_FUNCTION_ARGS)
 {
-#define PG_BACKUP_STOP_V2_COLS 3
+#define PG_BACKUP_STOP_V2_COLS 4
 	TupleDesc	tupdesc;
 	Datum		values[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		nulls[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		waitforarchive = PG_GETARG_BOOL(0);
 	char	   *backup_label;
+	bytea	   *pg_control_bytea;
+	uint8_t		pg_control[PG_CONTROL_FILE_SIZE];
 	SessionBackupState status = get_backup_status();
 
 	/* Initialize attributes information in the tuple descriptor */
@@ -150,9 +154,17 @@ pg_backup_stop(PG_FUNCTION_ARGS)
 	/* Build the contents of backup_label */
 	backup_label = build_backup_content(backup_state, false);
 
+	/* Build the contents of pg_control */
+	backup_control_file(pg_control);
+
+	pg_control_bytea = (bytea *) palloc(PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	SET_VARSIZE(pg_control_bytea, PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	memcpy(VARDATA(pg_control_bytea), pg_control, PG_CONTROL_FILE_SIZE);
+
 	values[0] = LSNGetDatum(backup_state->stoppoint);
 	values[1] = CStringGetTextDatum(backup_label);
 	values[2] = CStringGetTextDatum(tablespace_map->data);
+	values[3] = PointerGetDatum(pg_control_bytea);
 
 	/* Deallocate backup-related variables */
 	pfree(backup_label);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3c368433615..c89967613a5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6741,8 +6741,8 @@
 { oid => '2739', descr => 'finish taking an online backup',
   proname => 'pg_backup_stop', provolatile => 'v', proparallel => 'r',
   prorettype => 'record', proargtypes => 'bool',
-  proallargtypes => '{bool,pg_lsn,text,text}', proargmodes => '{i,o,o,o}',
-  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile}',
+  proallargtypes => '{bool,pg_lsn,text,text,bytea}', proargmodes => '{i,o,o,o,o}',
+  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile,controlfile}',
   proargdefaults => '{true}',
   prosrc => 'pg_backup_stop' },
 { oid => '3436', descr => 'promote standby server',
diff --git a/src/test/recovery/t/042_low_level_backup.pl b/src/test/recovery/t/042_low_level_backup.pl
index df4ae029fe6..34c6f4461af 100644
--- a/src/test/recovery/t/042_low_level_backup.pl
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -13,6 +13,42 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Decode hex to binary
+sub decode_hex
+{
+	my ($encoded) = @_;
+	my $decoded;
+
+	$encoded =~ s/^\s+|\s+$//g;
+
+	for (my $idx = 0; $idx < length($encoded); $idx += 2)
+	{
+		$decoded .= pack('C', hex(substr($encoded, $idx, 2)));
+	}
+
+	return $decoded;
+}
+
+# Get backup_label/pg_control from pg_stop_backup()
+sub stop_backup_result
+{
+	my ($psql) = @_;
+
+	my $encoded = $psql->query_safe(
+		"select encode(labelfile::bytea, 'hex') || ',' || " .
+		"       encode(controlfile, 'hex')" .
+		"  from pg_backup_stop()");
+
+	my @result;
+
+    foreach my $column (split(',', $encoded))
+	{
+		push(@result, decode_hex($column));
+	}
+
+	return @result;
+}
+
 # Start primary node with archiving.
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(has_archiving => 1, allows_streaming => 1);
@@ -80,8 +116,7 @@ my $stop_segment_name = $node_primary->safe_psql('postgres',
 	'SELECT pg_walfile_name(pg_current_wal_lsn())');
 
 # Stop backup and get backup_label, the last segment is archived.
-my $backup_label =
-  $psql->query_safe("select labelfile from pg_backup_stop()");
+(my $backup_label, my $pg_control) = stop_backup_result($psql);
 
 $psql->quit;
 
@@ -118,10 +153,36 @@ ok( $node_replica->log_contains(
 $node_replica->teardown_node;
 $node_replica->clean_node;
 
+# Save only pg_control into the backup to demonstrate that pg_control returned
+# from pg_stop_backup() will only perform recovery when backup_label is present.
+open(my $fh, ">", "$backup_dir/global/pg_control")
+  or die "could not open pg_control";
+binmode($fh);
+syswrite($fh, $pg_control);
+close($fh);
+
+$node_replica = PostgreSQL::Test::Cluster->new('replica_fail2');
+$node_replica->init_from_backup($node_primary, $backup_name,
+	has_restoring => 1);
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_replica->data_dir, '-l',
+		$node_replica->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_replica->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+$node_replica->teardown_node;
+$node_replica->clean_node;
+
 # Save backup_label into the backup directory and recover using the primary's
 # archive.  This time recovery will succeed and the canary table will be
 # present.
-open my $fh, ">>", "$backup_dir/backup_label"
+open $fh, ">>", "$backup_dir/backup_label"
   or die "could not open backup_label";
 # Binary mode is required for Windows, as the backup_label parsing is not
 # able to cope with CRLFs.
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-03-06 01:27   ` David Steele <[email protected]>
  2026-03-17 05:51     ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: David Steele @ 2026-03-06 01:27 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: pgsql-hackers

On 2/20/26 12:47, David Steele wrote:
> On 2/20/26 10:10, David Steele wrote:
>> On 8/7/25 05:30, David Steele wrote:
>>> On 1/24/25 13:43, David Steele wrote:
>>>>
>>>> Rebased and improved a comment and an error.
>>> Rebased to fix breakage caused by the split of func.sgml in 4e23c9e.
>>
>>
>> Rebased to implement simplification added by "Simplify creation of 
>> built-in functions with default arguments" (759b03b2).

Rebased on "Simplify creation of built-in functions with non-default 
ACLs." (f95d73ed).

Regards,
-David
From a38d5e68b648e1d03832c3b917a60a2a5ed4c61b Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 6 Mar 2026 01:10:14 +0000
Subject: Add pg_control flag to prevent recovery without backup_label.

Harden recovery by adding a flag to pg_control to indicate that backup_label is
required. This prevents the user from deleting backup_label resulting in an
inconsistent recovery.

Another advantage is that the copy of pg_control used by pg_basebackup is
guaranteed not to be torn.

This functionality is limited to pg_basebackup (or any software comfortable
with modifying pg_control).

Control and catalog version bumps are required.
---
 doc/src/sgml/func/func-info.sgml          |  5 +++++
 src/backend/access/transam/xlog.c         | 23 +++++++++++++++++++++++
 src/backend/access/transam/xlogrecovery.c | 10 +++++++++-
 src/backend/backup/basebackup.c           | 15 ++++++---------
 src/backend/utils/misc/pg_controldata.c   |  7 +++++--
 src/bin/pg_controldata/pg_controldata.c   |  2 ++
 src/bin/pg_resetwal/pg_resetwal.c         |  1 +
 src/bin/pg_rewind/pg_rewind.c             |  1 +
 src/include/access/xlog.h                 |  1 +
 src/include/catalog/pg_control.h          |  4 ++++
 src/include/catalog/pg_proc.dat           |  6 +++---
 src/test/recovery/t/002_archiving.pl      | 20 ++++++++++++++++++++
 12 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index 294f45e82a3..2e78d28221c 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -3646,6 +3646,11 @@ acl      | {postgres=arwdDxtm/postgres,foo=r/postgres}
        <entry><type>boolean</type></entry>
       </row>
 
+      <row>
+       <entry><structfield>backup_label_required</structfield></entry>
+       <entry><type>boolean</type></entry>
+      </row>
+
      </tbody>
     </tgroup>
    </table>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 354ac645bdc..66f0d7e091d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9563,6 +9563,29 @@ do_pg_abort_backup(int code, Datum arg)
 	}
 }
 
+/*
+ * Create a consistent copy of control data to be used for backup and update it
+ * to require a backup label for recovery. Also recalculate the CRC.
+ */
+void
+backup_control_file(uint8_t *controlFile)
+{
+	ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+	memset(controlFile + sizeof(ControlFileData), 0,
+		   PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+	LWLockAcquire(ControlFileLock, LW_SHARED);
+	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+	LWLockRelease(ControlFileLock);
+
+	controlData->backupLabelRequired = true;
+
+	INIT_CRC32C(controlData->crc);
+	COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+	FIN_CRC32C(controlData->crc);
+}
+
 /*
  * Register a handler that will warn about unterminated backups at end of
  * session, unless this has already been done.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fbddd7e522c..71759ae67b3 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -646,7 +646,14 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
-		/* No backup_label file has been found if we are here. */
+		/*
+		 * No backup_label file has been found if we are here. Error if the
+		 * control file requires backup_label.
+		 */
+		if (ControlFile->backupLabelRequired)
+			ereport(FATAL,
+					(errmsg("could not find backup_label required for recovery"),
+					 errhint("backup_label must be present for recovery to proceed")));
 
 		/*
 		 * If tablespace_map file is present without backup_label file, there
@@ -931,6 +938,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			ControlFile->backupStartPoint = checkPoint.redo;
 			ControlFile->backupEndRequired = backupEndRequired;
+			ControlFile->backupLabelRequired = false;
 
 			if (backupFromStandby)
 			{
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2d74c648335..f54c8793090 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -23,6 +23,7 @@
 #include "backup/basebackup_incremental.h"
 #include "backup/basebackup_sink.h"
 #include "backup/basebackup_target.h"
+#include "catalog/pg_control.h"
 #include "catalog/pg_tablespace_d.h"
 #include "commands/defrem.h"
 #include "common/compression.h"
@@ -331,9 +332,9 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 
 			if (ti->path == NULL)
 			{
-				struct stat statbuf;
 				bool		sendtblspclinks = true;
 				char	   *backup_label;
+				uint8_t controlFile[PG_CONTROL_FILE_SIZE];
 
 				bbsink_begin_archive(sink, "base.tar");
 
@@ -356,14 +357,10 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 						sendtblspclinks, &manifest, InvalidOid, ib);
 
 				/* ... and pg_control after everything else. */
-				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
-					ereport(ERROR,
-							(errcode_for_file_access(),
-							 errmsg("could not stat file \"%s\": %m",
-									XLOG_CONTROL_FILE)));
-				sendFile(sink, XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf,
-						 false, InvalidOid, InvalidOid,
-						 InvalidRelFileNumber, 0, &manifest, 0, NULL, 0);
+				backup_control_file(controlFile);
+				sendFileWithContent(sink, XLOG_CONTROL_FILE,
+									(char *)controlFile, PG_CONTROL_FILE_SIZE,
+									&manifest);
 			}
 			else
 			{
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index c6d9cbb1577..c2c19eb77df 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -162,8 +162,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
 Datum
 pg_control_recovery(PG_FUNCTION_ARGS)
 {
-	Datum		values[5];
-	bool		nulls[5];
+	Datum		values[6];
+	bool		nulls[6];
 	TupleDesc	tupdesc;
 	HeapTuple	htup;
 	ControlFileData *ControlFile;
@@ -195,6 +195,9 @@ pg_control_recovery(PG_FUNCTION_ARGS)
 	values[4] = BoolGetDatum(ControlFile->backupEndRequired);
 	nulls[4] = false;
 
+	values[5] = BoolGetDatum(ControlFile->backupLabelRequired);
+	nulls[5] = false;
+
 	htup = heap_form_tuple(tupdesc, values, nulls);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index a4060309ae0..5919dd58fed 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -301,6 +301,8 @@ main(int argc, char *argv[])
 		   LSN_FORMAT_ARGS(ControlFile->backupEndPoint));
 	printf(_("End-of-backup record required:        %s\n"),
 		   ControlFile->backupEndRequired ? _("yes") : _("no"));
+	printf(_("Backup label required:                %s\n"),
+		   ControlFile->backupLabelRequired ? _("yes") : _("no"));
 	printf(_("wal_level setting:                    %s\n"),
 		   wal_level_str(ControlFile->wal_level));
 	printf(_("wal_log_hints setting:                %s\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index ab766c34d4b..768a4ed2f18 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -918,6 +918,7 @@ RewriteControlFile(void)
 	ControlFile.backupStartPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndRequired = false;
+	ControlFile.backupLabelRequired = false;
 
 	/*
 	 * Force the defaults for max_* settings. The values don't really matter
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index 9d745d4b25b..509b9e80a21 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -736,6 +736,7 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
 	ControlFile_new.minRecoveryPoint = endrec;
 	ControlFile_new.minRecoveryPointTLI = endtli;
 	ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
+	ControlFile_new.backupLabelRequired = true;
 	if (!dry_run)
 		update_controlfile(datadir_target, &ControlFile_new, do_sync);
 }
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index fdfb572467b..84ec5aabdfd 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -311,6 +311,7 @@ extern void do_pg_backup_start(const char *backupidstr, bool fast,
 							   StringInfo tblspcmapfile);
 extern void do_pg_backup_stop(BackupState *state, bool waitforarchive);
 extern void do_pg_abort_backup(int code, Datum arg);
+extern void backup_control_file(uint8_t *controlFile);
 extern void register_persistent_abort_backup_handler(void);
 extern SessionBackupState get_backup_status(void);
 
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 7503db1af51..26618162a7b 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -166,12 +166,16 @@ typedef struct ControlFileData
 	 * If backupEndRequired is true, we know for sure that we're restoring
 	 * from a backup, and must see a backup-end record before we can safely
 	 * start up.
+	 *
+	 * If backupLabelRequired is true, then a backup_label file must be
+	 * present in order for recovery to proceed.
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
+	bool		backupLabelRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4950bff2804..ece9b4f08c9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12494,9 +12494,9 @@
 { oid => '3443',
   descr => 'pg_controldata recovery state information as a function',
   proname => 'pg_control_recovery', provolatile => 'v', prorettype => 'record',
-  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool}',
-  proargmodes => '{o,o,o,o,o}',
-  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required}',
+  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o}',
+  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required,backup_label_required}',
   prosrc => 'pg_control_recovery' },
 
 { oid => '3444',
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index aa40f58e6d6..9963d13473e 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -41,6 +41,26 @@ $node_standby->append_conf(
 archive_cleanup_command = 'echo archive_cleanup_done > $archive_cleanup_command_file'
 recovery_end_command = 'echo recovery_ended_done > $recovery_end_command_file'
 ));
+
+# Rename backup_label to verify that recovery will not start without it
+rename("$data_dir/backup_label", "$data_dir/backup_label.tmp")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_standby->data_dir, '-l',
+		$node_standby->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_standby->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+# Restore backup_label so recovery proceeds normally
+rename("$data_dir/backup_label.tmp", "$data_dir/backup_label")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
 $node_standby->start;
 
 # Create some content on primary
-- 
2.34.1


From 1de5c13947884e33f744177d7f21147826a180e1 Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 6 Mar 2026 01:10:14 +0000
Subject: Return pg_control from pg_backup_stop().

Harden recovery by returning a copy of pg_control from pg_backup_stop() that has
a flag set to prevent recovery if the backup_label file is missing. Instead of
backup software copying pg_control from PGDATA, it stores an updated version
that is returned from pg_backup_stop(). This is better for the following
reasons:

* The user can no longer remove backup_label and get what looks like a
successful recovery (while almost certainly causing corruption). If backup_label
is removed the cluster will not start. The user may try pg_resetwal, but that
tool makes it pretty clear that corruption will result from its use.

* We don't need to worry about backup software seeing a torn copy of pg_control,
since Postgres can safely read it out of memory and provide a valid copy via
pg_backup_stop(). This solves torn reads without needing to write pg_control via
a temp file, which may affect performance on a standby.

* For backup from standby, we no longer need to instruct the backup software to
copy pg_control last. In fact the backup software should not copy pg_control from
PGDATA at all.

These changes have no impact on current backup software and they are free to use
the pg_control available from pg_stop_backup() or continue to use pg_control from
PGDATA. Of course they will miss the benefits of getting a consistent copy of
pg_control and the backup_label checking, but will be no worse off than before.

Catalog version bump is required.
---
 doc/src/sgml/backup.sgml                    | 18 +++++-
 src/backend/access/transam/xlogfuncs.c      | 20 ++++--
 src/include/catalog/pg_proc.dat             |  4 +-
 src/test/recovery/t/042_low_level_backup.pl | 67 ++++++++++++++++++++-
 4 files changed, 97 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 168444eccc5..67974b7b1b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1027,9 +1027,12 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
      values. The second of these fields should be written to a file named
      <filename>backup_label</filename> in the root directory of the backup. The
      third field should be written to a file named
-     <filename>tablespace_map</filename> unless the field is empty. These files are
+     <filename>tablespace_map</filename> unless the field is empty. The fourth
+     field should be written into a file named
+     <filename>global/pg_control</filename> (overwriting the existing file when
+     present). These files are
      vital to the backup working and must be written byte for byte without
-     modification, which may require opening the file in binary mode.
+     modification, which will require opening the file in binary mode.
     </para>
    </listitem>
    <listitem>
@@ -1101,7 +1104,16 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
    </para>
 
    <para>
-    You should, however, omit from the backup the files within the
+    You should exclude <filename>global/pg_control</filename> from your backup
+    and put the contents of the <parameter>controlfile</parameter> column
+    returned from <function>pg_backup_stop</function> in your backup at
+    <filename>global/pg_control</filename>. This version of pg_control contains
+    safeguards against recovery without backup_label present and is guaranteed
+    not to be torn.
+   </para>
+
+   <para>
+    You should also omit from the backup the files within the
     cluster's <filename>pg_wal/</filename> subdirectory.  This
     slight adjustment is worthwhile because it reduces the risk
     of mistakes when restoring.  This is easy to arrange if
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..e7e7c0b18e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -113,9 +113,11 @@ pg_backup_start(PG_FUNCTION_ARGS)
  *
  * The backup_label contains the user-supplied label string (typically this
  * would be used to tell where the backup dump will be stored), the starting
- * time, starting WAL location for the dump and so on.  It is the caller's
- * responsibility to write the backup_label and tablespace_map files in the
- * data folder that will be restored from this backup.
+ * time, starting WAL location for the dump and so on.  The pg_control file
+ * contains a consistent copy of pg_control that also has a safeguard against
+ * being used without backup_label.  It is the caller's responsibility to write
+ * the backup_label, pg_control, and tablespace_map files in the data folder
+ * that will be restored from this backup.
  *
  * Permission checking for this function is managed through the normal
  * GRANT system.
@@ -123,12 +125,14 @@ pg_backup_start(PG_FUNCTION_ARGS)
 Datum
 pg_backup_stop(PG_FUNCTION_ARGS)
 {
-#define PG_BACKUP_STOP_V2_COLS 3
+#define PG_BACKUP_STOP_V2_COLS 4
 	TupleDesc	tupdesc;
 	Datum		values[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		nulls[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		waitforarchive = PG_GETARG_BOOL(0);
 	char	   *backup_label;
+	bytea	   *pg_control_bytea;
+	uint8_t		pg_control[PG_CONTROL_FILE_SIZE];
 	SessionBackupState status = get_backup_status();
 
 	/* Initialize attributes information in the tuple descriptor */
@@ -150,9 +154,17 @@ pg_backup_stop(PG_FUNCTION_ARGS)
 	/* Build the contents of backup_label */
 	backup_label = build_backup_content(backup_state, false);
 
+	/* Build the contents of pg_control */
+	backup_control_file(pg_control);
+
+	pg_control_bytea = (bytea *) palloc(PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	SET_VARSIZE(pg_control_bytea, PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	memcpy(VARDATA(pg_control_bytea), pg_control, PG_CONTROL_FILE_SIZE);
+
 	values[0] = LSNGetDatum(backup_state->stoppoint);
 	values[1] = CStringGetTextDatum(backup_label);
 	values[2] = CStringGetTextDatum(tablespace_map->data);
+	values[3] = PointerGetDatum(pg_control_bytea);
 
 	/* Deallocate backup-related variables */
 	pfree(backup_label);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ece9b4f08c9..9a85bc987df 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6757,8 +6757,8 @@
 { oid => '2739', descr => 'finish taking an online backup',
   proname => 'pg_backup_stop', provolatile => 'v', proparallel => 'r',
   prorettype => 'record', proargtypes => 'bool',
-  proallargtypes => '{bool,pg_lsn,text,text}', proargmodes => '{i,o,o,o}',
-  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile}',
+  proallargtypes => '{bool,pg_lsn,text,text,bytea}', proargmodes => '{i,o,o,o,o}',
+  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile,controlfile}',
   proargdefaults => '{true}',
   prosrc => 'pg_backup_stop',
   proacl => '{POSTGRES=X}' },
diff --git a/src/test/recovery/t/042_low_level_backup.pl b/src/test/recovery/t/042_low_level_backup.pl
index df4ae029fe6..34c6f4461af 100644
--- a/src/test/recovery/t/042_low_level_backup.pl
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -13,6 +13,42 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Decode hex to binary
+sub decode_hex
+{
+	my ($encoded) = @_;
+	my $decoded;
+
+	$encoded =~ s/^\s+|\s+$//g;
+
+	for (my $idx = 0; $idx < length($encoded); $idx += 2)
+	{
+		$decoded .= pack('C', hex(substr($encoded, $idx, 2)));
+	}
+
+	return $decoded;
+}
+
+# Get backup_label/pg_control from pg_stop_backup()
+sub stop_backup_result
+{
+	my ($psql) = @_;
+
+	my $encoded = $psql->query_safe(
+		"select encode(labelfile::bytea, 'hex') || ',' || " .
+		"       encode(controlfile, 'hex')" .
+		"  from pg_backup_stop()");
+
+	my @result;
+
+    foreach my $column (split(',', $encoded))
+	{
+		push(@result, decode_hex($column));
+	}
+
+	return @result;
+}
+
 # Start primary node with archiving.
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(has_archiving => 1, allows_streaming => 1);
@@ -80,8 +116,7 @@ my $stop_segment_name = $node_primary->safe_psql('postgres',
 	'SELECT pg_walfile_name(pg_current_wal_lsn())');
 
 # Stop backup and get backup_label, the last segment is archived.
-my $backup_label =
-  $psql->query_safe("select labelfile from pg_backup_stop()");
+(my $backup_label, my $pg_control) = stop_backup_result($psql);
 
 $psql->quit;
 
@@ -118,10 +153,36 @@ ok( $node_replica->log_contains(
 $node_replica->teardown_node;
 $node_replica->clean_node;
 
+# Save only pg_control into the backup to demonstrate that pg_control returned
+# from pg_stop_backup() will only perform recovery when backup_label is present.
+open(my $fh, ">", "$backup_dir/global/pg_control")
+  or die "could not open pg_control";
+binmode($fh);
+syswrite($fh, $pg_control);
+close($fh);
+
+$node_replica = PostgreSQL::Test::Cluster->new('replica_fail2');
+$node_replica->init_from_backup($node_primary, $backup_name,
+	has_restoring => 1);
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_replica->data_dir, '-l',
+		$node_replica->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_replica->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+$node_replica->teardown_node;
+$node_replica->clean_node;
+
 # Save backup_label into the backup directory and recover using the primary's
 # archive.  This time recovery will succeed and the canary table will be
 # present.
-open my $fh, ">>", "$backup_dir/backup_label"
+open $fh, ">>", "$backup_dir/backup_label"
   or die "could not open backup_label";
 # Binary mode is required for Windows, as the backup_label parsing is not
 # able to cope with CRLFs.
-- 
2.34.1



Attachments:

  [text/plain] pgcontrol-flag-v8-01-basebackup.patch (11.5K, 2-pgcontrol-flag-v8-01-basebackup.patch)
  download | inline diff:
From a38d5e68b648e1d03832c3b917a60a2a5ed4c61b Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 6 Mar 2026 01:10:14 +0000
Subject: Add pg_control flag to prevent recovery without backup_label.

Harden recovery by adding a flag to pg_control to indicate that backup_label is
required. This prevents the user from deleting backup_label resulting in an
inconsistent recovery.

Another advantage is that the copy of pg_control used by pg_basebackup is
guaranteed not to be torn.

This functionality is limited to pg_basebackup (or any software comfortable
with modifying pg_control).

Control and catalog version bumps are required.
---
 doc/src/sgml/func/func-info.sgml          |  5 +++++
 src/backend/access/transam/xlog.c         | 23 +++++++++++++++++++++++
 src/backend/access/transam/xlogrecovery.c | 10 +++++++++-
 src/backend/backup/basebackup.c           | 15 ++++++---------
 src/backend/utils/misc/pg_controldata.c   |  7 +++++--
 src/bin/pg_controldata/pg_controldata.c   |  2 ++
 src/bin/pg_resetwal/pg_resetwal.c         |  1 +
 src/bin/pg_rewind/pg_rewind.c             |  1 +
 src/include/access/xlog.h                 |  1 +
 src/include/catalog/pg_control.h          |  4 ++++
 src/include/catalog/pg_proc.dat           |  6 +++---
 src/test/recovery/t/002_archiving.pl      | 20 ++++++++++++++++++++
 12 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/func/func-info.sgml b/doc/src/sgml/func/func-info.sgml
index 294f45e82a3..2e78d28221c 100644
--- a/doc/src/sgml/func/func-info.sgml
+++ b/doc/src/sgml/func/func-info.sgml
@@ -3646,6 +3646,11 @@ acl      | {postgres=arwdDxtm/postgres,foo=r/postgres}
        <entry><type>boolean</type></entry>
       </row>
 
+      <row>
+       <entry><structfield>backup_label_required</structfield></entry>
+       <entry><type>boolean</type></entry>
+      </row>
+
      </tbody>
     </tgroup>
    </table>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 354ac645bdc..66f0d7e091d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9563,6 +9563,29 @@ do_pg_abort_backup(int code, Datum arg)
 	}
 }
 
+/*
+ * Create a consistent copy of control data to be used for backup and update it
+ * to require a backup label for recovery. Also recalculate the CRC.
+ */
+void
+backup_control_file(uint8_t *controlFile)
+{
+	ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+	memset(controlFile + sizeof(ControlFileData), 0,
+		   PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+	LWLockAcquire(ControlFileLock, LW_SHARED);
+	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+	LWLockRelease(ControlFileLock);
+
+	controlData->backupLabelRequired = true;
+
+	INIT_CRC32C(controlData->crc);
+	COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+	FIN_CRC32C(controlData->crc);
+}
+
 /*
  * Register a handler that will warn about unterminated backups at end of
  * session, unless this has already been done.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index fbddd7e522c..71759ae67b3 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -646,7 +646,14 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 	}
 	else
 	{
-		/* No backup_label file has been found if we are here. */
+		/*
+		 * No backup_label file has been found if we are here. Error if the
+		 * control file requires backup_label.
+		 */
+		if (ControlFile->backupLabelRequired)
+			ereport(FATAL,
+					(errmsg("could not find backup_label required for recovery"),
+					 errhint("backup_label must be present for recovery to proceed")));
 
 		/*
 		 * If tablespace_map file is present without backup_label file, there
@@ -931,6 +938,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		{
 			ControlFile->backupStartPoint = checkPoint.redo;
 			ControlFile->backupEndRequired = backupEndRequired;
+			ControlFile->backupLabelRequired = false;
 
 			if (backupFromStandby)
 			{
diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c
index 2d74c648335..f54c8793090 100644
--- a/src/backend/backup/basebackup.c
+++ b/src/backend/backup/basebackup.c
@@ -23,6 +23,7 @@
 #include "backup/basebackup_incremental.h"
 #include "backup/basebackup_sink.h"
 #include "backup/basebackup_target.h"
+#include "catalog/pg_control.h"
 #include "catalog/pg_tablespace_d.h"
 #include "commands/defrem.h"
 #include "common/compression.h"
@@ -331,9 +332,9 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 
 			if (ti->path == NULL)
 			{
-				struct stat statbuf;
 				bool		sendtblspclinks = true;
 				char	   *backup_label;
+				uint8_t controlFile[PG_CONTROL_FILE_SIZE];
 
 				bbsink_begin_archive(sink, "base.tar");
 
@@ -356,14 +357,10 @@ perform_base_backup(basebackup_options *opt, bbsink *sink,
 						sendtblspclinks, &manifest, InvalidOid, ib);
 
 				/* ... and pg_control after everything else. */
-				if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
-					ereport(ERROR,
-							(errcode_for_file_access(),
-							 errmsg("could not stat file \"%s\": %m",
-									XLOG_CONTROL_FILE)));
-				sendFile(sink, XLOG_CONTROL_FILE, XLOG_CONTROL_FILE, &statbuf,
-						 false, InvalidOid, InvalidOid,
-						 InvalidRelFileNumber, 0, &manifest, 0, NULL, 0);
+				backup_control_file(controlFile);
+				sendFileWithContent(sink, XLOG_CONTROL_FILE,
+									(char *)controlFile, PG_CONTROL_FILE_SIZE,
+									&manifest);
 			}
 			else
 			{
diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c
index c6d9cbb1577..c2c19eb77df 100644
--- a/src/backend/utils/misc/pg_controldata.c
+++ b/src/backend/utils/misc/pg_controldata.c
@@ -162,8 +162,8 @@ pg_control_checkpoint(PG_FUNCTION_ARGS)
 Datum
 pg_control_recovery(PG_FUNCTION_ARGS)
 {
-	Datum		values[5];
-	bool		nulls[5];
+	Datum		values[6];
+	bool		nulls[6];
 	TupleDesc	tupdesc;
 	HeapTuple	htup;
 	ControlFileData *ControlFile;
@@ -195,6 +195,9 @@ pg_control_recovery(PG_FUNCTION_ARGS)
 	values[4] = BoolGetDatum(ControlFile->backupEndRequired);
 	nulls[4] = false;
 
+	values[5] = BoolGetDatum(ControlFile->backupLabelRequired);
+	nulls[5] = false;
+
 	htup = heap_form_tuple(tupdesc, values, nulls);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(htup));
diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c
index a4060309ae0..5919dd58fed 100644
--- a/src/bin/pg_controldata/pg_controldata.c
+++ b/src/bin/pg_controldata/pg_controldata.c
@@ -301,6 +301,8 @@ main(int argc, char *argv[])
 		   LSN_FORMAT_ARGS(ControlFile->backupEndPoint));
 	printf(_("End-of-backup record required:        %s\n"),
 		   ControlFile->backupEndRequired ? _("yes") : _("no"));
+	printf(_("Backup label required:                %s\n"),
+		   ControlFile->backupLabelRequired ? _("yes") : _("no"));
 	printf(_("wal_level setting:                    %s\n"),
 		   wal_level_str(ControlFile->wal_level));
 	printf(_("wal_log_hints setting:                %s\n"),
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index ab766c34d4b..768a4ed2f18 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -918,6 +918,7 @@ RewriteControlFile(void)
 	ControlFile.backupStartPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndPoint = InvalidXLogRecPtr;
 	ControlFile.backupEndRequired = false;
+	ControlFile.backupLabelRequired = false;
 
 	/*
 	 * Force the defaults for max_* settings. The values don't really matter
diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c
index 9d745d4b25b..509b9e80a21 100644
--- a/src/bin/pg_rewind/pg_rewind.c
+++ b/src/bin/pg_rewind/pg_rewind.c
@@ -736,6 +736,7 @@ perform_rewind(filemap_t *filemap, rewind_source *source,
 	ControlFile_new.minRecoveryPoint = endrec;
 	ControlFile_new.minRecoveryPointTLI = endtli;
 	ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
+	ControlFile_new.backupLabelRequired = true;
 	if (!dry_run)
 		update_controlfile(datadir_target, &ControlFile_new, do_sync);
 }
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index fdfb572467b..84ec5aabdfd 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -311,6 +311,7 @@ extern void do_pg_backup_start(const char *backupidstr, bool fast,
 							   StringInfo tblspcmapfile);
 extern void do_pg_backup_stop(BackupState *state, bool waitforarchive);
 extern void do_pg_abort_backup(int code, Datum arg);
+extern void backup_control_file(uint8_t *controlFile);
 extern void register_persistent_abort_backup_handler(void);
 extern SessionBackupState get_backup_status(void);
 
diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h
index 7503db1af51..26618162a7b 100644
--- a/src/include/catalog/pg_control.h
+++ b/src/include/catalog/pg_control.h
@@ -166,12 +166,16 @@ typedef struct ControlFileData
 	 * If backupEndRequired is true, we know for sure that we're restoring
 	 * from a backup, and must see a backup-end record before we can safely
 	 * start up.
+	 *
+	 * If backupLabelRequired is true, then a backup_label file must be
+	 * present in order for recovery to proceed.
 	 */
 	XLogRecPtr	minRecoveryPoint;
 	TimeLineID	minRecoveryPointTLI;
 	XLogRecPtr	backupStartPoint;
 	XLogRecPtr	backupEndPoint;
 	bool		backupEndRequired;
+	bool		backupLabelRequired;
 
 	/*
 	 * Parameter settings that determine if the WAL can be used for archival
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4950bff2804..ece9b4f08c9 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12494,9 +12494,9 @@
 { oid => '3443',
   descr => 'pg_controldata recovery state information as a function',
   proname => 'pg_control_recovery', provolatile => 'v', prorettype => 'record',
-  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool}',
-  proargmodes => '{o,o,o,o,o}',
-  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required}',
+  proargtypes => '', proallargtypes => '{pg_lsn,int4,pg_lsn,pg_lsn,bool,bool}',
+  proargmodes => '{o,o,o,o,o,o}',
+  proargnames => '{min_recovery_end_lsn,min_recovery_end_timeline,backup_start_lsn,backup_end_lsn,end_of_backup_record_required,backup_label_required}',
   prosrc => 'pg_control_recovery' },
 
 { oid => '3444',
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index aa40f58e6d6..9963d13473e 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -41,6 +41,26 @@ $node_standby->append_conf(
 archive_cleanup_command = 'echo archive_cleanup_done > $archive_cleanup_command_file'
 recovery_end_command = 'echo recovery_ended_done > $recovery_end_command_file'
 ));
+
+# Rename backup_label to verify that recovery will not start without it
+rename("$data_dir/backup_label", "$data_dir/backup_label.tmp")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_standby->data_dir, '-l',
+		$node_standby->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_standby->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+# Restore backup_label so recovery proceeds normally
+rename("$data_dir/backup_label.tmp", "$data_dir/backup_label")
+  or BAIL_OUT "could not move $data_dir/backup_label";
+
 $node_standby->start;
 
 # Create some content on primary
-- 
2.34.1



  [text/plain] pgcontrol-flag-v8-02-sql.patch (9.7K, 3-pgcontrol-flag-v8-02-sql.patch)
  download | inline diff:
From 1de5c13947884e33f744177d7f21147826a180e1 Mon Sep 17 00:00:00 2001
From: David Steele <[email protected]>
Date: Fri, 6 Mar 2026 01:10:14 +0000
Subject: Return pg_control from pg_backup_stop().

Harden recovery by returning a copy of pg_control from pg_backup_stop() that has
a flag set to prevent recovery if the backup_label file is missing. Instead of
backup software copying pg_control from PGDATA, it stores an updated version
that is returned from pg_backup_stop(). This is better for the following
reasons:

* The user can no longer remove backup_label and get what looks like a
successful recovery (while almost certainly causing corruption). If backup_label
is removed the cluster will not start. The user may try pg_resetwal, but that
tool makes it pretty clear that corruption will result from its use.

* We don't need to worry about backup software seeing a torn copy of pg_control,
since Postgres can safely read it out of memory and provide a valid copy via
pg_backup_stop(). This solves torn reads without needing to write pg_control via
a temp file, which may affect performance on a standby.

* For backup from standby, we no longer need to instruct the backup software to
copy pg_control last. In fact the backup software should not copy pg_control from
PGDATA at all.

These changes have no impact on current backup software and they are free to use
the pg_control available from pg_stop_backup() or continue to use pg_control from
PGDATA. Of course they will miss the benefits of getting a consistent copy of
pg_control and the backup_label checking, but will be no worse off than before.

Catalog version bump is required.
---
 doc/src/sgml/backup.sgml                    | 18 +++++-
 src/backend/access/transam/xlogfuncs.c      | 20 ++++--
 src/include/catalog/pg_proc.dat             |  4 +-
 src/test/recovery/t/042_low_level_backup.pl | 67 ++++++++++++++++++++-
 4 files changed, 97 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 168444eccc5..67974b7b1b6 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1027,9 +1027,12 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
      values. The second of these fields should be written to a file named
      <filename>backup_label</filename> in the root directory of the backup. The
      third field should be written to a file named
-     <filename>tablespace_map</filename> unless the field is empty. These files are
+     <filename>tablespace_map</filename> unless the field is empty. The fourth
+     field should be written into a file named
+     <filename>global/pg_control</filename> (overwriting the existing file when
+     present). These files are
      vital to the backup working and must be written byte for byte without
-     modification, which may require opening the file in binary mode.
+     modification, which will require opening the file in binary mode.
     </para>
    </listitem>
    <listitem>
@@ -1101,7 +1104,16 @@ SELECT * FROM pg_backup_stop(wait_for_archive => true);
    </para>
 
    <para>
-    You should, however, omit from the backup the files within the
+    You should exclude <filename>global/pg_control</filename> from your backup
+    and put the contents of the <parameter>controlfile</parameter> column
+    returned from <function>pg_backup_stop</function> in your backup at
+    <filename>global/pg_control</filename>. This version of pg_control contains
+    safeguards against recovery without backup_label present and is guaranteed
+    not to be torn.
+   </para>
+
+   <para>
+    You should also omit from the backup the files within the
     cluster's <filename>pg_wal/</filename> subdirectory.  This
     slight adjustment is worthwhile because it reduces the risk
     of mistakes when restoring.  This is easy to arrange if
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..e7e7c0b18e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -113,9 +113,11 @@ pg_backup_start(PG_FUNCTION_ARGS)
  *
  * The backup_label contains the user-supplied label string (typically this
  * would be used to tell where the backup dump will be stored), the starting
- * time, starting WAL location for the dump and so on.  It is the caller's
- * responsibility to write the backup_label and tablespace_map files in the
- * data folder that will be restored from this backup.
+ * time, starting WAL location for the dump and so on.  The pg_control file
+ * contains a consistent copy of pg_control that also has a safeguard against
+ * being used without backup_label.  It is the caller's responsibility to write
+ * the backup_label, pg_control, and tablespace_map files in the data folder
+ * that will be restored from this backup.
  *
  * Permission checking for this function is managed through the normal
  * GRANT system.
@@ -123,12 +125,14 @@ pg_backup_start(PG_FUNCTION_ARGS)
 Datum
 pg_backup_stop(PG_FUNCTION_ARGS)
 {
-#define PG_BACKUP_STOP_V2_COLS 3
+#define PG_BACKUP_STOP_V2_COLS 4
 	TupleDesc	tupdesc;
 	Datum		values[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		nulls[PG_BACKUP_STOP_V2_COLS] = {0};
 	bool		waitforarchive = PG_GETARG_BOOL(0);
 	char	   *backup_label;
+	bytea	   *pg_control_bytea;
+	uint8_t		pg_control[PG_CONTROL_FILE_SIZE];
 	SessionBackupState status = get_backup_status();
 
 	/* Initialize attributes information in the tuple descriptor */
@@ -150,9 +154,17 @@ pg_backup_stop(PG_FUNCTION_ARGS)
 	/* Build the contents of backup_label */
 	backup_label = build_backup_content(backup_state, false);
 
+	/* Build the contents of pg_control */
+	backup_control_file(pg_control);
+
+	pg_control_bytea = (bytea *) palloc(PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	SET_VARSIZE(pg_control_bytea, PG_CONTROL_FILE_SIZE + VARHDRSZ);
+	memcpy(VARDATA(pg_control_bytea), pg_control, PG_CONTROL_FILE_SIZE);
+
 	values[0] = LSNGetDatum(backup_state->stoppoint);
 	values[1] = CStringGetTextDatum(backup_label);
 	values[2] = CStringGetTextDatum(tablespace_map->data);
+	values[3] = PointerGetDatum(pg_control_bytea);
 
 	/* Deallocate backup-related variables */
 	pfree(backup_label);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ece9b4f08c9..9a85bc987df 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6757,8 +6757,8 @@
 { oid => '2739', descr => 'finish taking an online backup',
   proname => 'pg_backup_stop', provolatile => 'v', proparallel => 'r',
   prorettype => 'record', proargtypes => 'bool',
-  proallargtypes => '{bool,pg_lsn,text,text}', proargmodes => '{i,o,o,o}',
-  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile}',
+  proallargtypes => '{bool,pg_lsn,text,text,bytea}', proargmodes => '{i,o,o,o,o}',
+  proargnames => '{wait_for_archive,lsn,labelfile,spcmapfile,controlfile}',
   proargdefaults => '{true}',
   prosrc => 'pg_backup_stop',
   proacl => '{POSTGRES=X}' },
diff --git a/src/test/recovery/t/042_low_level_backup.pl b/src/test/recovery/t/042_low_level_backup.pl
index df4ae029fe6..34c6f4461af 100644
--- a/src/test/recovery/t/042_low_level_backup.pl
+++ b/src/test/recovery/t/042_low_level_backup.pl
@@ -13,6 +13,42 @@ use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+# Decode hex to binary
+sub decode_hex
+{
+	my ($encoded) = @_;
+	my $decoded;
+
+	$encoded =~ s/^\s+|\s+$//g;
+
+	for (my $idx = 0; $idx < length($encoded); $idx += 2)
+	{
+		$decoded .= pack('C', hex(substr($encoded, $idx, 2)));
+	}
+
+	return $decoded;
+}
+
+# Get backup_label/pg_control from pg_stop_backup()
+sub stop_backup_result
+{
+	my ($psql) = @_;
+
+	my $encoded = $psql->query_safe(
+		"select encode(labelfile::bytea, 'hex') || ',' || " .
+		"       encode(controlfile, 'hex')" .
+		"  from pg_backup_stop()");
+
+	my @result;
+
+    foreach my $column (split(',', $encoded))
+	{
+		push(@result, decode_hex($column));
+	}
+
+	return @result;
+}
+
 # Start primary node with archiving.
 my $node_primary = PostgreSQL::Test::Cluster->new('primary');
 $node_primary->init(has_archiving => 1, allows_streaming => 1);
@@ -80,8 +116,7 @@ my $stop_segment_name = $node_primary->safe_psql('postgres',
 	'SELECT pg_walfile_name(pg_current_wal_lsn())');
 
 # Stop backup and get backup_label, the last segment is archived.
-my $backup_label =
-  $psql->query_safe("select labelfile from pg_backup_stop()");
+(my $backup_label, my $pg_control) = stop_backup_result($psql);
 
 $psql->quit;
 
@@ -118,10 +153,36 @@ ok( $node_replica->log_contains(
 $node_replica->teardown_node;
 $node_replica->clean_node;
 
+# Save only pg_control into the backup to demonstrate that pg_control returned
+# from pg_stop_backup() will only perform recovery when backup_label is present.
+open(my $fh, ">", "$backup_dir/global/pg_control")
+  or die "could not open pg_control";
+binmode($fh);
+syswrite($fh, $pg_control);
+close($fh);
+
+$node_replica = PostgreSQL::Test::Cluster->new('replica_fail2');
+$node_replica->init_from_backup($node_primary, $backup_name,
+	has_restoring => 1);
+
+my $res = run_log(
+	[
+		'pg_ctl', '-D', $node_replica->data_dir, '-l',
+		$node_replica->logfile, 'start'
+	]);
+ok(!$res, 'invalid recovery startup fails');
+
+my $logfile = slurp_file($node_replica->logfile());
+ok($logfile =~ qr/could not find backup_label required for recovery/,
+	'could not find backup_label required for recovery');
+
+$node_replica->teardown_node;
+$node_replica->clean_node;
+
 # Save backup_label into the backup directory and recover using the primary's
 # archive.  This time recovery will succeed and the canary table will be
 # present.
-open my $fh, ">>", "$backup_dir/backup_label"
+open $fh, ">>", "$backup_dir/backup_label"
   or die "could not open backup_label";
 # Binary mode is required for Windows, as the backup_label parsing is not
 # able to cope with CRLFs.
-- 
2.34.1



^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-03-17 05:51     ` Michael Paquier <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: Michael Paquier @ 2026-03-17 05:51 UTC (permalink / raw)
  To: Haibo Yan <[email protected]>; +Cc: David Steele <[email protected]>; pgsql-hackers

On Mon, Mar 16, 2026 at 10:16:58PM -0700, Haibo Yan wrote:
> I have not read the code yet, so this may already be answered there,
> but I had a question about the proposal itself. This patch protects
> against a missing backup_label, but what about a wrong one? If a
> user restores a backup_label file from a different backup, the
> existence check alone would not detect that. Do we need some
> consistency check between the returned pg_control copy and the
> backup_label contents, or is the intended scope here limited to the
> “missing file” case only? 

Please note that we use bottom-posting on the lists, as of:
https://en.wikipedia.org/wiki/Posting_style#Bottom-posting
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-03-17 07:05     ` David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: David Steele @ 2026-03-17 07:05 UTC (permalink / raw)
  To: Haibo Yan <[email protected]>; +Cc: Michael Paquier <[email protected]>; pgsql-hackers

On 3/17/26 12:16, Haibo Yan wrote:

> I have not read the code yet, so this may already be answered there, but 
> I had a question about the proposal itself. This patch protects against 
> a missing backup_label, but what about a wrong one? If a user restores a 
> backup_label file from a different backup, the existence check alone 
> would not detect that. Do we need some consistency check between the 
> returned pg_control copy and the backup_label contents, or is the 
> intended scope here limited to the “missing file” case only?

Thank you for having a look!

The goal here is only to check for a missing backup_label. The general 
problem is that PostgreSQL suggests that removing backup_label might be 
a good idea so the user does it:

If you are not restoring from a backup, try removing the file 
\"%s/backup_label\"

The user *could* copy a backup_label from another backup and there are 
ways we could detect that but I feel that should be material for a 
separate patch.

Regards,
-David







^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-03-17 18:50       ` Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Haibo Yan @ 2026-03-17 18:50 UTC (permalink / raw)
  To: David Steele <[email protected]>; +Cc: Michael Paquier <[email protected]>; pgsql-hackers

Hi David,

Thank you for the clarification. I have now read the code, and overall it looks good to me. I only had one very small comment.

You currently have:
```
memset(controlFile + sizeof(ControlFileData), 0,
       PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
memcpy(controlFile, ControlFile, sizeof(ControlFileData));
``` 
This is correct, since only the trailing bytes need to be zeroed before the copy.

I was just wondering whether the following might be slightly clearer:
```
memset(controlFile, 0, PG_CONTROL_FILE_SIZE);
memcpy(controlFile, ControlFile, sizeof(ControlFileData));
```

I do not think this is a real issue, though.

Thanks
Haibo

> On Mar 17, 2026, at 12:05 AM, David Steele <[email protected]> wrote:
> 
> On 3/17/26 12:16, Haibo Yan wrote:
> 
>> I have not read the code yet, so this may already be answered there, but I had a question about the proposal itself. This patch protects against a missing backup_label, but what about a wrong one? If a user restores a backup_label file from a different backup, the existence check alone would not detect that. Do we need some consistency check between the returned pg_control copy and the backup_label contents, or is the intended scope here limited to the “missing file” case only?
> 
> Thank you for having a look!
> 
> The goal here is only to check for a missing backup_label. The general problem is that PostgreSQL suggests that removing backup_label might be a good idea so the user does it:
> 
> If you are not restoring from a backup, try removing the file \"%s/backup_label\"
> 
> The user *could* copy a backup_label from another backup and there are ways we could detect that but I feel that should be material for a separate patch.
> 
> Regards,
> -David
> 
> 



^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
@ 2026-03-18 01:43         ` Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Michael Paquier @ 2026-03-18 01:43 UTC (permalink / raw)
  To: Haibo Yan <[email protected]>; +Cc: David Steele <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On Tue, Mar 17, 2026 at 11:50:29AM -0700, Haibo Yan wrote:
> Thank you for the clarification. I have now read the code, and
> overall it looks good to me. I only had one very small comment.

(Bottom-posting note from above, please be careful.)

> I was just wondering whether the following might be slightly clearer:
> ```
> memset(controlFile, 0, PG_CONTROL_FILE_SIZE);
> memcpy(controlFile, ControlFile, sizeof(ControlFileData));
> ```
> 
> I do not think this is a real issue, though.

         {
             ControlFile->backupStartPoint = checkPoint.redo;
             ControlFile->backupEndRequired = backupEndRequired;
+            ControlFile->backupLabelRequired = false;

It sounds like it is going to be important to document the reason why
the flag is reset here (aka we don't need the backup_label file
anymore).

+backup_control_file(uint8_t *controlFile)
+{
+    ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+    memset(controlFile + sizeof(ControlFileData), 0,
+           PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+    LWLockAcquire(ControlFileLock, LW_SHARED);
+    memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+    LWLockRelease(ControlFileLock);
+
+    controlData->backupLabelRequired = true;
+
+    INIT_CRC32C(controlData->crc);
+    COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+    FIN_CRC32C(controlData->crc);

I was wondering if we should have an assertion at least to cross-check
that the contents we store in shared memory never go out-of-sync with
the on-disk contents, in the shape of a USE_ASSERT_CHECKING block that
calls get_controlfile() and memcmp()'s the contents between shmem and
the on-disk file, while the LWLock is taken.  We ship the control file
last on purpose, one reason being backups taken from standbys, so that
may be sensible to do.

Another property of the new control file flag that is implied in the
implementation but not documented is that we should never check for
backupLabelRequired when a backup_label is gone.  Actually, the flag
is reset in InitWalRecovery() in the initial steps of recovery, and
the backup_label file is removed much later in StartupXLOG() just
*after* UpdateControlFile() to minimize the window where the contents
of the control file and the backup_label file is removed are
out-of-sync.  This window means that if we crash between the
completion of UpdateControlFile() and the durable_rename() we could
have a flag reset with a backup_label still around.  On restart,
recovery would fail, requiring a manual modification of the control
file, at least.  It sounds to me that this implementation detail
should be documented clearly.

Finally, here is a general opinion.  I like this patch, and it is
basically risk-free for base backups taken with the replication
protocol as we update the control file with the new flag set
on-the-fly. 

Now, I am worried about backups that use pg_stop_backup().  Changing
backup APIs has always been a very sensitive area, and this is going
to require operators to update backup tools so as the control file
received as a result of pg_stop_backup() is copied, at the cost of
getting a failure if they don't do so.  I will *not* proceed with this
change without a clear approval from some more committers or senior
hackers that they like this change (approach previously suggested by
Andres, actually, for what I can see).  I am adding in CC a few
committers who have commented on this set of proposals and who have
touched the recovery code in the last few years, for awareness.
The timing is what it is, and we are at the end of a release cycle.
Let's see if we can reach a consensus of some kind.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
@ 2026-03-18 04:53           ` Michael Paquier <[email protected]>
  2026-03-18 08:26             ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Michael Paquier @ 2026-03-18 04:53 UTC (permalink / raw)
  To: David Steele <[email protected]>; +Cc: Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On Wed, Mar 18, 2026 at 04:05:24AM +0000, David Steele wrote:
> On 3/18/26 08:43, Michael Paquier wrote:
>> I was wondering if we should have an assertion at least to cross-check
>> that the contents we store in shared memory never go out-of-sync with
>> the on-disk contents, in the shape of a USE_ASSERT_CHECKING block that
>> calls get_controlfile() and memcmp()'s the contents between shmem and
>> the on-disk file, while the LWLock is taken.  We ship the control file
>> last on purpose, one reason being backups taken from standbys, so that
>> may be sensible to do.
> 
> As far as I can see this should always be true -- I audited all the
> 
> LWLockAcquire(ControlFileLock, LW_EXCLUSIVE)
> 
> sections and the file is always saved once if is updated. Let me see if I
> can add this check without too much pain, e.g. an additional parameter.

This matches with my reads of the code.  The attached check, that can
be applied on top of your patches, passes under check-world.

>> Another property of the new control file flag that is implied in the
>> implementation but not documented is that we should never check for
>> backupLabelRequired when a backup_label is gone.
> 
> I'm not sure what you mean here? That's exactly when we do want to check as
> below:

Sorry for the confusion, I meant that "we should never check for
backupLabelRequired when we have a backup_label".

> Using the pg_control copy from pg_backup_stop() is entirely optional and
> nothing is broken if vendors decide not to use it. This can be demonstrated
> by applying the 01 patch without 02. In that case the tests in
> 042_low_level_backup continue to run. You can also apply 01 and 02 and
> revert the test changes in 042_low_level_backup and that works, too.

FWIW, after a second look I am actually wondering if 0002 is safe at
all.  The contents of the control file are fetched after we are done
with do_pg_backup_stop(), and there could be a bunch of activity that
happens between the end of do_pg_backup_stop() and the
backup_control_file() call, where the control file could be updated
more and interfere with the recovery startup for some of its fields?
GUC parameter updates that may touch the control file are one thing
popping into mind.
--
Michael

From de5b9b00867e39e73dc99a1ee61a814cead1ab7d Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Wed, 18 Mar 2026 13:44:01 +0900
Subject: [PATCH] Add consistency check for shmem and on-disk control file

---
 src/backend/access/transam/xlog.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c8ea1ea49d24..0f8516fcb563 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9606,6 +9606,17 @@ backup_control_file(uint8_t *controlFile)
 
 	LWLockAcquire(ControlFileLock, LW_SHARED);
 	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+
+#ifdef USE_ASSERT_CHECKING
+	{
+		bool crc_ok;
+		ControlFileData *dataDisk = get_controlfile(DataDir, &crc_ok);
+
+		Assert(crc_ok &&
+			   memcmp(dataDisk, ControlFile, sizeof(ControlFileData)) == 0);
+	}
+#endif
+
 	LWLockRelease(ControlFileLock);
 
 	controlData->backupLabelRequired = true;
-- 
2.53.0



Attachments:

  [text/plain] 0001-Add-consistency-check-for-shmem-and-on-disk-control-.patch (1007B, 2-0001-Add-consistency-check-for-shmem-and-on-disk-control-.patch)
  download | inline diff:
From de5b9b00867e39e73dc99a1ee61a814cead1ab7d Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Wed, 18 Mar 2026 13:44:01 +0900
Subject: [PATCH] Add consistency check for shmem and on-disk control file

---
 src/backend/access/transam/xlog.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c8ea1ea49d24..0f8516fcb563 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9606,6 +9606,17 @@ backup_control_file(uint8_t *controlFile)
 
 	LWLockAcquire(ControlFileLock, LW_SHARED);
 	memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+
+#ifdef USE_ASSERT_CHECKING
+	{
+		bool crc_ok;
+		ControlFileData *dataDisk = get_controlfile(DataDir, &crc_ok);
+
+		Assert(crc_ok &&
+			   memcmp(dataDisk, ControlFile, sizeof(ControlFileData)) == 0);
+	}
+#endif
+
 	LWLockRelease(ControlFileLock);
 
 	controlData->backupLabelRequired = true;
-- 
2.53.0



  [application/pgp-signature] signature.asc (833B, 3-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
@ 2026-03-18 08:26             ` Michael Paquier <[email protected]>
  2026-03-18 12:26               ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  0 siblings, 1 reply; 13+ messages in thread

From: Michael Paquier @ 2026-03-18 08:26 UTC (permalink / raw)
  To: David Steele <[email protected]>; +Cc: Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On Wed, Mar 18, 2026 at 07:35:47AM +0000, David Steele wrote:
> You are correct -- the copy of pg_control needs to happen before
> do_pg_backup_stop(). An older version of this patch saved pg_control in
> backup_state which made the prior location safe. However, I missed moving
> this code when I moved pg_control out of backup_state. Code review to the
> rescue.

Right.  I am wondering also if the final result would not be better
without 0002, actually, focusing only on the "simpler" base backup
case through the replication protocol, and you are making a good case
in mentioning it as not absolutely mandatory for base backups that are
taken through the SQL functions.  One could always tweak the flag
manually in the control file based on the contents taken from the data
folder.  That's more hairy than writing the entire file, for sure,
still possible.
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 08:26             ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
@ 2026-03-18 12:26               ` David Steele <[email protected]>
  2026-03-18 21:00                 ` Re: Return pg_control from pg_backup_stop(). Corey Huinker <[email protected]>
  2026-04-13 14:55                 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  0 siblings, 2 replies; 13+ messages in thread

From: David Steele @ 2026-03-18 12:26 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On 3/18/26 15:26, Michael Paquier wrote:
> On Wed, Mar 18, 2026 at 07:35:47AM +0000, David Steele wrote:
>> You are correct -- the copy of pg_control needs to happen before
>> do_pg_backup_stop(). An older version of this patch saved pg_control in
>> backup_state which made the prior location safe. However, I missed moving
>> this code when I moved pg_control out of backup_state. Code review to the
>> rescue.
> 
> Right.  I am wondering also if the final result would not be better
> without 0002, actually, focusing only on the "simpler" base backup
> case through the replication protocol, and you are making a good case
> in mentioning it as not absolutely mandatory for base backups that are
> taken through the SQL functions.  One could always tweak the flag
> manually in the control file based on the contents taken from the data
> folder.  That's more hairy than writing the entire file, for sure,
> still possible.

Getting even 01 into PG19 would be a great outcome. This would solve the 
problem of torn pg_control and deleted backup labels for any backups 
made with pg_basebackup and that's going to cover a *lot* of cases.

Established third-party backup solutions that are not based on 
pg_basebackup are generally able to manipulate pg_control so that's not 
as much of a concern, perhaps. It does raise the barrier of entry for 
new backup software if they need to learn to read and validate 
pg_control to avoid a torn copy and set the flag. Patch 02 solves that 
problem in a general way so I still think it adds value for the 
ecosystem -- but we could always discuss that in the PG20 cycle.

Whatever gets committed for PG19 I'll write a followup patch to describe 
the hazards of reading pg_control and generally how to get a good copy. 
However, this will be complicated enough that the best answer will 
likely be to use pg_basebackup or some other reputable backup software. 
I don't love this -- I feel like the low-level interface should be 
usable with such hazards.

Regards,
-David





^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 08:26             ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 12:26               ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-03-18 21:00                 ` Corey Huinker <[email protected]>
  2026-03-19 04:20                   ` Re: Return pg_control from pg_backup_stop(). Corey Huinker <[email protected]>
  1 sibling, 1 reply; 13+ messages in thread

From: Corey Huinker @ 2026-03-18 21:00 UTC (permalink / raw)
  To: David Steele <[email protected]>; +Cc: Michael Paquier <[email protected]>; Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

>
> Whatever gets committed for PG19 I'll write a followup patch to describe
> the hazards of reading pg_control and generally how to get a good copy.
> However, this will be complicated enough that the best answer will
> likely be to use pg_basebackup or some other reputable backup software.
> I don't love this -- I feel like the low-level interface should be
> usable with such hazards.


Surya Poondla and I had decided on this patchset as a pair-reviewing
exercise. However, events have overtaken us, and several other people have
chimed in expressing the same concerns that we had observed but hadn't yet
completed our review. All of the main concerns that we had found up to this
point have been addressed in the lastest patchset, except for the trivial
observation that the ereport() uses the old style and doesn't need the set
of parens around (errmsg(), errhint()). Patches apply clean, tests pass,
test coverage seems sufficient, we're happy with the wording of the
documentation, in short there really isn't a whole lot for us to add to the
review, and for that reason we're removing our names from the list of
reviewers in the commitfest app.


^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 08:26             ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 12:26               ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-18 21:00                 ` Re: Return pg_control from pg_backup_stop(). Corey Huinker <[email protected]>
@ 2026-03-19 04:20                   ` Corey Huinker <[email protected]>
  0 siblings, 0 replies; 13+ messages in thread

From: Corey Huinker @ 2026-03-19 04:20 UTC (permalink / raw)
  To: David Steele <[email protected]>; +Cc: Michael Paquier <[email protected]>; Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On Wed, Mar 18, 2026 at 10:56 PM David Steele <[email protected]> wrote:

> Hi Corey,
>
> On 3/19/26 04:00, Corey Huinker wrote:
> >     Whatever gets committed for PG19 I'll write a followup patch to
> >     describe
> >     the hazards of reading pg_control and generally how to get a good
> copy.
> >     However, this will be complicated enough that the best answer will
> >     likely be to use pg_basebackup or some other reputable backup
> software.
> >     I don't love this -- I feel like the low-level interface should be
> >     usable with such hazards.
> >
> > Surya Poondla and I had decided on this patchset as a pair-reviewing
> > exercise. However, events have overtaken us, and several other people
> > have chimed in expressing the same concerns that we had observed but
> > hadn't yet completed our review.
>
> Thank you both for having a look!
>
>  > All of the main concerns that we had > found up to this point have
> been addressed in the lastest patchset,
> > except for the trivial observation that the ereport() uses the old style
> > and doesn't need the set of parens around (errmsg(), errhint()).
>
> Grep shows there are lots of messages with the new style but many more
> in the old style. Presumably they are only being updated as they are
> modified.


That's always been my assumption. Not worth the churn.


> Do you happen to know the commit or message thread where this
> policy was started? I've been searching but it is such a generic search
> term.
>

I limited my git log -p to elog.h, and it seems it started with
e3a87b4991cc back in 2020. The only reason I knew about it was that I used
to do backports from v13 to unsupported versions, and the new style would
cause the build to fail on an otherwise clean cherry pick.


> It seems to me you've still done a review. Confirming what the other
> reviewers found is good info to have.
>

Of a sort, yes, but our review doesn't touch the "is this a good idea"
question, which has been by far the thing most in need of reviewing across
the long discussion threads.


^ permalink  raw  reply  [nested|flat] 13+ messages in thread

* Re: Return pg_control from pg_backup_stop().
  2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-02-20 05:47 ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-06 01:27   ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 07:05     ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
  2026-03-17 18:50       ` Re: Return pg_control from pg_backup_stop(). Haibo Yan <[email protected]>
  2026-03-18 01:43         ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 04:53           ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 08:26             ` Re: Return pg_control from pg_backup_stop(). Michael Paquier <[email protected]>
  2026-03-18 12:26               ` Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
@ 2026-04-13 14:55                 ` David Steele <[email protected]>
  1 sibling, 0 replies; 13+ messages in thread

From: David Steele @ 2026-04-13 14:55 UTC (permalink / raw)
  To: Michael Paquier <[email protected]>; +Cc: Haibo Yan <[email protected]>; pgsql-hackers; Heikki Linnakangas <[email protected]>; Robert Haas <[email protected]>; Andres Freund <[email protected]>; Fujii Masao <[email protected]>

On 3/18/26 19:26, David Steele wrote:
> On 3/18/26 15:26, Michael Paquier wrote:
>> On Wed, Mar 18, 2026 at 07:35:47AM +0000, David Steele wrote:
>>> You are correct -- the copy of pg_control needs to happen before
>>> do_pg_backup_stop(). An older version of this patch saved pg_control in
>>> backup_state which made the prior location safe. However, I missed 
>>> moving
>>> this code when I moved pg_control out of backup_state. Code review to 
>>> the
>>> rescue.
>>
>> Right.  I am wondering also if the final result would not be better
>> without 0002, actually, focusing only on the "simpler" base backup
>> case through the replication protocol, and you are making a good case
>> in mentioning it as not absolutely mandatory for base backups that are
>> taken through the SQL functions.  One could always tweak the flag
>> manually in the control file based on the contents taken from the data
>> folder.  That's more hairy than writing the entire file, for sure,
>> still possible.
> 
> Getting even 01 into PG19 would be a great outcome. This would solve the 
> problem of torn pg_control and deleted backup labels for any backups 
> made with pg_basebackup and that's going to cover a *lot* of cases.
> 
> Established third-party backup solutions that are not based on 
> pg_basebackup are generally able to manipulate pg_control so that's not 
> as much of a concern, perhaps. It does raise the barrier of entry for 
> new backup software if they need to learn to read and validate 
> pg_control to avoid a torn copy and set the flag. Patch 02 solves that 
> problem in a general way so I still think it adds value for the 
> ecosystem -- but we could always discuss that in the PG20 cycle.
> 
> Whatever gets committed for PG19 I'll write a followup patch to describe 
> the hazards of reading pg_control and generally how to get a good copy. 
> However, this will be complicated enough that the best answer will 
> likely be to use pg_basebackup or some other reputable backup software. 
> I don't love this -- I feel like the low-level interface should be 
> usable with such hazards.

I have withdrawn this patch. If anybody wants to pick it up in the 
future I'll be happy to rebase it but I think two years is long enough 
to maintain a patch that is not getting traction.

We are left with the issue that pg_basebackup backups may contain a torn 
copy of pg_control. At the least this should be documented.

It would also be a good idea to document that utilizing the low-level 
backup interface requires validating the checksum in pg_control to avoid 
a torn copy. This is non-trivial but certainly doable.

Regards,
-David





^ permalink  raw  reply  [nested|flat] 13+ messages in thread


end of thread, other threads:[~2026-04-13 14:55 UTC | newest]

Thread overview: 13+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-02-20 03:10 Re: Return pg_control from pg_backup_stop(). David Steele <[email protected]>
2026-02-20 05:47 ` David Steele <[email protected]>
2026-03-06 01:27   ` David Steele <[email protected]>
2026-03-17 05:51     ` Michael Paquier <[email protected]>
2026-03-17 07:05     ` David Steele <[email protected]>
2026-03-17 18:50       ` Haibo Yan <[email protected]>
2026-03-18 01:43         ` Michael Paquier <[email protected]>
2026-03-18 04:53           ` Michael Paquier <[email protected]>
2026-03-18 08:26             ` Michael Paquier <[email protected]>
2026-03-18 12:26               ` David Steele <[email protected]>
2026-03-18 21:00                 ` Corey Huinker <[email protected]>
2026-03-19 04:20                   ` Corey Huinker <[email protected]>
2026-04-13 14:55                 ` David Steele <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox