public inbox for [email protected]  
help / color / mirror / Atom feed
From: Andrew Dunstan <[email protected]>
To: Amul Sul <[email protected]>
To: Robert Haas <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: Tue, 3 Mar 2026 19:37:48 -0500
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAAJ_b96csGv+TvdxK-zp1Zo6zrAxOJ4n-qnxiWO1f0Lk0X0N_g@mail.gmail.com>
References: <CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com>
	<CAAJ_b94SEcBVJcsp0Y1-YvLqZWBHPQH4FhRzSJfaH_ah_eL_FQ@mail.gmail.com>
	<CAAJ_b97nn9denP2SAjuXyjwbd3is-VnZVSkiRMJ-5YNiKfO9MA@mail.gmail.com>
	<CA+Tgmob_DB9QHDOxnP7a5Y0yJdeGqY8YNi+uK_811y7cN4mxYA@mail.gmail.com>
	<CAAJ_b97VUiP-DbLNe-ddq64J_RiB4ZcPgAjHkJH-0dbzgjR++A@mail.gmail.com>
	<CA+TgmoYMtcZBaqy9r59eDapaDy3WOdepkFFURu9MV-x-kxEbKg@mail.gmail.com>
	<CAAJ_b95FOeW38gw-3BLmpdnTWHFimopTvf=eTObYUbTOC0x8qg@mail.gmail.com>
	<CA+Tgmob=3POOO8st-v-fCjKCKREQ=+gs5_PBQnoFeNBdERfuEg@mail.gmail.com>
	<CAAJ_b94=gtCeUKkGPUmPj_2SwHV+PiXQ8Mx-1RqfYj3pP3OwpA@mail.gmail.com>
	<CA+TgmoYuC-92tMxZHn-KY=ExF-YkHkd4yOfiMhi9upqk3M1DuA@mail.gmail.com>
	<CAAJ_b974evwS9+tmKVEV+ctEp1scRnLAMrmx9_pzLFHsPtr0hA@mail.gmail.com>
	<CAAJ_b95H+=D2P_fLAbePCF++FAXO0fsJnxFr5NKVe4at=ga6WA@mail.gmail.com>
	<CAAJ_b979p7Z-dBZhxBC3m2VPkYTvYzypMTWkYg2q-e1S5F_f-Q@mail.gmail.com>
	<CAAJ_b96csGv+TvdxK-zp1Zo6zrAxOJ4n-qnxiWO1f0Lk0X0N_g@mail.gmail.com>


On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
> On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
>> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
>>> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
>>>> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
>>>>> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
>>>>>> In the attached version, I am using the WAL segment name as the hash
>>>>>> key, which is much more straightforward. I have rewritten
>>>>>> read_archive_wal_page(), and it looks much cleaner than before. The
>>>>>> logic to discard irrelevant WAL files is still within
>>>>>> get_archive_wal_entry. I added an explanation for setting cur_wal to
>>>>>> NULL, which is now handled in the separate function I mentioned
>>>>>> previously.
>>>>>>
>>>>>> Kindly have a look at the attached version; let me know if you are
>>>>>> still not happy with the current approach for filtering/discarding
>>>>>> irrelevant WAL segments. It isn't much different from the previous
>>>>>> version, but I have tried to keep it in a separate routine for better
>>>>>> code readability, with comments to make it easier to understand. I
>>>>>> also added a comment for ArchivedWALFile.
>>>>> I feel like the division of labor between get_archive_wal_entry() and
>>>>> read_archive_wal_page() is odd. I noticed this in the last version,
>>>>> too, and it still seems to be the case. get_archive_wal_entry() first
>>>>> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
>>>>> If it doesn't, it loops until an entry for the requested file shows up
>>>>> and then returns it. Then control returns to read_archive_wal_page()
>>>>> which loops some more until we have all the data we need for the
>>>>> requested file. But it seems odd to me to have two separate loops
>>>>> here. I think that the first loop is going to call read_archive_file()
>>>>> until we find the beginning of the file that we care about and then
>>>>> the second one is going to call read_archive_file() some more until we
>>>>> have read enough of it to satisfy the request. It feels odd to me to
>>>>> do it that way, as if we told somebody to first wait until 9 o'clock
>>>>> and then wait another 30 minutes, instead of just telling them to wait
>>>>> until 9:30. I realize it's not quite the same thing, because apart
>>>>> from calling read_archive_file(), the two loops do different things,
>>>>> but I still think it looks odd.
>>>>>
>>>>> + /*
>>>>> + * Ignore if the timeline is different or the current segment is not
>>>>> + * the desired one.
>>>>> + */
>>>>> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
>>>>> + if (privateInfo->timeline != curSegTimeline ||
>>>>> + privateInfo->startSegNo > curSegNo ||
>>>>> + privateInfo->endSegNo < curSegNo ||
>>>>> + segno > curSegNo)
>>>>> + {
>>>>> + free_archive_wal_entry(entry->fname, privateInfo);
>>>>> + continue;
>>>>> + }
>>>>>
>>>>> The comment doesn't match the code. If it did, the test would be
>>>>> (privateInfo->timeline != curSegTimeline || segno != curSegno). But
>>>>> instead the segno test is > rather than !=, and the checks against
>>>>> startSegNo and endSegNo aren't explained at all. I think I understand
>>>>> why the segno test uses > rather than !=, but it's the point of the
>>>>> comment to explain things like that, rather than leaving the reader to
>>>>> guess. And I don't know why we also need to test startSegNo and
>>>>> endSegNo.
>>>>>
>>>>> I also wonder what the point is of doing XLogFromFileName() on the
>>>>> fname provided by the caller and then again on entry->fname. Couldn't
>>>>> you just compare the strings?
>>>>>
>>>>> Again, the division of labor is really odd here. It's the job of
>>>>> astreamer_waldump_content() to skip things that aren't WAL files at
>>>>> all, but it's the job of get_archive_wal_entry() to skip things that
>>>>> are WAL files but not the one we want. I disagree with putting those
>>>>> checks in completely separate parts of the code.
>>>>>
>>>> Keeping the timeline and segment start-end range checks inside the
>>>> archive streamer creates a circular dependency that cannot be resolved
>>>> without a 'dirty hack'. We must read the first available WAL file page
>>>> to determine the wal_segment_size before it can calculate the target
>>>> segment range. Moving the checks inside the streamer would make it
>>>> impossible to process that initial file, as the necessary filtering
>>>> parameters --  would still be unknown which would need to be skipped
>>>> for the first read somehow. What if later we realized that the first
>>>> WAL file which was allowed to be streamed by skipping that check is
>>>> irrelevant and doesn't fall under the start-end segment range?
>>>>
>>> Please have a look at the attached version, specifically patch 0005.
>>> In astreamer_waldump_content(), I have moved the WAL file filtration
>>> check from get_archive_wal_entry(). This check will be skipped during
>>> the initial read in init_archive_reader(), which instead performs it
>>> explicitly once it determines the WAL segment size and the start/end
>>> segments.
>>>
>>> To access the WAL segment size inside astreamer_waldump_content(), I
>>> have moved the WAL segment size variable into the XLogDumpPrivate
>>> structure in the separate 0004 patch.
>> Attached is an updated version including the aforesaid changes. It
>> includes a new refactoring patch (0001) that moves the logic for
>> identifying tar archives and their compression types from
>> pg_basebackup and pg_verifybackup into a separate-reusable function,
>> per a suggestion from Euler [1].  Additionally, I have added a test
>> for the contrecord decoding to the main patch (now 0006).
>>
>> 1] http://postgr.es/m/[email protected]
>>
> Rebased against the latest master, fixed typos in code comments, and
> replaced palloc0 with palloc0_object.
>

Hi Amul.


I think this looks in pretty good shape.

Attached are patches for a few things I think could be fixed. They are 
mostly self-explanatory. The TAP test fix is the only sane way I could 
come up with stopping the skip code you had from reporting a wildly 
inaccurate number of tests skipped. The sane way to do this from a 
Test::More perspective is a subtest, but unfortunately meson does not 
like subtest output, which is why we don't use it elsewhere, so the only 
way I could come up with was to split this out into a separate test. Of 
course, we might just say we don't care about the misreport, in which 
case we could just live with things as they are.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification

The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream).  Add a test that takes
a tar-format backup and verifies it.
---
 src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
 	[ 'pg_verifybackup', $backup_path2 ],
 	'valid base backup with timeline > 1');
 
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup_path3,
+		'--no-sync',
+		'--format' => 'tar',
+		'--checkpoint' => 'fast'
+	],
+	"tar backup with separate pg_wal.tar");
+command_ok(
+	[ 'pg_verifybackup', $backup_path3 ],
+	'WAL verification succeeds with separate pg_wal.tar');
+
 done_testing();

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files

The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions.  When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.

Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file.  001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
 src/bin/pg_waldump/meson.build       |   1 +
 src/bin/pg_waldump/t/001_basic.pl    | 221 ++++++++++-----------------
 src/bin/pg_waldump/t/003_archive.pl  | 320 +++++++++++++++++++++++++++++++++++
 3 files changed, 396 insertions(+), 146 deletions(-)
 create mode 100644 src/bin/pg_waldump/t/003_archive.pl

diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_save_fullpage.pl',
+      't/003_archive.pl',
     ],
   },
 }
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
 
 use strict;
 use warnings FATAL => 'all';
-use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
 
 program_help_ok('pg_waldump');
 program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
 $$;
 });
 
-my $contrecord_lsn = $node->safe_psql('postgres',
-	'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+  $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
 # Generate contrecord record
 $node->safe_psql('postgres',
 	qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
 	return @lines;
 }
 
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
-	my ($archive, $directory, $compression_flags) = @_;
-
-	my @files;
-	opendir my $dh, $directory or die "opendir: $!";
-	while (my $entry = readdir $dh) {
-		# Skip '.' and '..'
-		next if $entry eq '.' || $entry eq '..';
-		push @files, $entry;
-	}
-	closedir $dh;
-
-	@files = shuffle @files;
-
-	# move into the WAL directory before archiving files
-	my $cwd = getcwd;
-	chdir($directory) || die "chdir: $!";
-	command_ok([$tar, $compression_flags, $archive, @files]);
-	chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
-	{
-		'path' => $node->data_dir,
-		'is_archive' => 0,
-		'enabled' => 1
-	},
-	{
-		'path' => "$tmp_dir/pg_wal.tar",
-		'compression_method' => 'none',
-		'compression_flags' => '-cf',
-		'is_archive' => 1,
-		'enabled' => 1
-	},
-	{
-		'path' => "$tmp_dir/pg_wal.tar.gz",
-		'compression_method' => 'gzip',
-		'compression_flags' => '-czf',
-		'is_archive' => 1,
-		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
-	});
-
-for my $scenario (@scenarios)
-{
-	my $path = $scenario->{'path'};
-
-	SKIP:
-	{
-		skip "tar command is not available", 3
-		  if !defined $tar;
-		skip "$scenario->{'compression_method'} compression not supported by this build", 3
-		  if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
-		  # create pg_wal archive
-		  if ($scenario->{'is_archive'})
-		  {
-			  generate_archive($path,
-				  $node->data_dir . '/pg_wal',
-				  $scenario->{'compression_flags'});
-		  }
-
-		command_fails_like(
-			[ 'pg_waldump', '--path' => $path ],
-			qr/error: no start WAL location given/,
-			'path option requires start location');
-		command_like(
-			[
-				'pg_waldump',
-				'--path' => $path,
-				'--start' => $start_lsn,
-				'--end' => $end_lsn,
-			],
-			qr/./,
-			'runs with path option and start and end locations');
-		command_fails_like(
-			[
-				'pg_waldump',
-				'--path' => $path,
-				'--start' => $start_lsn,
-			],
-			qr/error: error in WAL record at/,
-			'falling off the end of the WAL results in an error');
-
-		command_fails_like(
-			[
-				'pg_waldump', '--quiet',
-				'--path' => $path,
-				'--start' => $start_lsn
-			],
-			qr/error: error in WAL record at/,
-			'errors are shown with --quiet');
-
-		test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
-		my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
-		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
-		@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
-		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
-		test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
-		is(@lines, 6, 'limit option observed');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
-		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
-		like($lines[0], qr/WAL statistics/, "statistics on stdout");
-		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
-		like($lines[0], qr/WAL statistics/, "statistics on stdout");
-		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
-		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
-		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
-			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
-			0, 'only lines for selected relation');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
-			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
-			'--block' => 1);
-		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
-		# Cleanup.
-		unlink $path if $scenario->{'is_archive'};
-	}
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+	[ 'pg_waldump', '--path' => $path ],
+	qr/error: no start WAL location given/,
+	'path option requires start location');
+command_like(
+	[
+		'pg_waldump',
+		'--path' => $path,
+		'--start' => $start_lsn,
+		'--end' => $end_lsn,
+	],
+	qr/./,
+	'runs with path option and start and end locations');
+command_fails_like(
+	[
+		'pg_waldump',
+		'--path' => $path,
+		'--start' => $start_lsn,
+	],
+	qr/error: error in WAL record at/,
+	'falling off the end of the WAL results in an error');
+
+command_fails_like(
+	[
+		'pg_waldump', '--quiet',
+		'--path' => $path,
+		'--start' => $start_lsn
+	],
+	qr/error: error in WAL record at/,
+	'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+	0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+	$path, $start_lsn, $end_lsn,
+	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+	'--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
 
 done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+	plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+	'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+  $node->safe_psql('postgres',
+	q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+  );
+
+$node->safe_psql(
+	'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+    WITH random_json AS (
+        SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+            FROM unnest(array['a', 'b', 'c']) as u(key))
+          SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+	'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+	'postgres', q{
+DO $$
+DECLARE
+    wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+    remain int;
+    iters  int := 0;
+BEGIN
+    LOOP
+        INSERT into t1(b)
+        select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+        from generate_series(1, 10) g;
+
+        remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+        IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+            RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+            EXIT;
+        END IF;
+        iters := iters + 1;
+    END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+  $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+	qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+  $node->safe_psql('postgres',
+	q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+  );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+	my ($path, $startlsn, $endlsn) = @_;
+
+	my ($part1, $part2) = split qr{/}, $startlsn;
+	my $lsn2 = hex $part2;
+	$lsn2++;
+	my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+	my ($stdout, $stderr);
+
+	my $result = IPC::Run::run [
+		'pg_waldump',
+		'--start' => $new_start,
+		'--end' => $endlsn,
+		'--path' => $path,
+	  ],
+	  '>' => \$stdout,
+	  '2>' => \$stderr;
+	ok($result, "runs with start segment and start LSN specified");
+	like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+	my ($path, $startlsn, $endlsn, @opts) = @_;
+
+	my ($stdout, $stderr);
+
+	my $result = IPC::Run::run [
+		'pg_waldump',
+		'--start' => $startlsn,
+		'--end' => $endlsn,
+		'--path' => $path,
+		@opts
+	  ],
+	  '>' => \$stdout,
+	  '2>' => \$stderr;
+	ok($result, "pg_waldump @opts: runs ok");
+	is($stderr, '', "pg_waldump @opts: no stderr");
+	my @lines = split /\n/, $stdout;
+	ok(@lines > 0, "pg_waldump @opts: some lines are output");
+	return @lines;
+}
+
+sub generate_archive
+{
+	my ($archive, $directory, $compression_flags) = @_;
+
+	my @files;
+	opendir my $dh, $directory or die "opendir: $!";
+	while (my $entry = readdir $dh)
+	{
+		next if $entry eq '.' || $entry eq '..';
+		push @files, $entry;
+	}
+	closedir $dh;
+
+	@files = shuffle @files;
+
+	my $cwd = getcwd;
+	chdir($directory) || die "chdir: $!";
+	command_ok([ $tar, $compression_flags, $archive, @files ],
+		"create archive $archive");
+	chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+	{
+		'path' => "$tmp_dir/pg_wal.tar",
+		'compression_method' => 'none',
+		'compression_flags' => '-cf',
+		'enabled' => 1,
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar.gz",
+		'compression_method' => 'gzip',
+		'compression_flags' => '-czf',
+		'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+	});
+
+for my $scenario (@scenarios)
+{
+	my $path = $scenario->{'path'};
+	my $method = $scenario->{'compression_method'};
+
+  SKIP:
+	{
+		skip "$method compression not supported by this build", 1
+		  if !$scenario->{'enabled'};
+
+		generate_archive(
+			$path,
+			$node->data_dir . '/pg_wal',
+			$scenario->{'compression_flags'});
+
+		command_fails_like(
+			[ 'pg_waldump', '--path' => $path ],
+			qr/error: no start WAL location given/,
+			"$method: path option requires start location");
+		command_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+				'--end' => $end_lsn,
+			],
+			qr/./,
+			"$method: runs with path option and start and end locations");
+		command_fails_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+			],
+			qr/error: error in WAL record at/,
+			"$method: falling off the end of the WAL results in an error");
+
+		command_fails_like(
+			[
+				'pg_waldump', '--quiet',
+				'--path' => $path,
+				'--start' => $start_lsn
+			],
+			qr/error: error in WAL record at/,
+			"$method: errors are shown with --quiet");
+
+		test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+		my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+		is(grep(!/^rmgr: \w/, @lines),
+			0, "$method: all output lines are rmgr lines");
+
+		@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+		is(grep(!/^rmgr: \w/, @lines),
+			0, "$method: contrecord - all output lines are rmgr lines");
+
+		test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+		is(@lines, 6, "$method: limit option observed");
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+		is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+			0, "$method: all output lines are FPW");
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+		like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+		like($lines[0], qr/WAL statistics/,
+			"$method: stats=record on stdout");
+		is(grep(/^rmgr:/, @lines),
+			0, "$method: no rmgr lines with stats=record");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+		is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+		is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+		# Cleanup.
+		unlink $path;
+	}
+}
+
+done_testing();

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix documentation for pg_waldump tar archive support

Two documentation issues with the tar archive reading feature:

- pg_waldump.sgml: When reading WAL from a tar archive with
  out-of-order segments, pg_waldump spills to temporary files.  TMPDIR
  controls where those files are created, but this was not documented
  in the Environment section.

- pg_verifybackup.sgml: The --wal-path option description still only
  said "directory" even though it now also accepts tar archives.
---
 doc/src/sgml/ref/pg_verifybackup.sgml |  7 ++++---
 doc/src/sgml/ref/pg_waldump.sgml      | 11 +++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 16b50b5a4df..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,9 +261,10 @@ PostgreSQL documentation
       <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
       <listitem>
        <para>
-        Try to parse WAL files stored in the specified directory, rather than
-        in <literal>pg_wal</literal>. This may be useful if the backup is
-        stored in a separate location from the WAL archive.
+        Try to parse WAL files stored in the specified directory or tar
+        archive, rather than in <literal>pg_wal</literal>. This may be
+        useful if the backup is stored in a separate location from the WAL
+        archive.
        </para>
       </listitem>
      </varlistentry>
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index b36323dde92..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -391,6 +391,17 @@ PostgreSQL documentation
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><envar>TMPDIR</envar></term>
+    <listitem>
+     <para>
+      Directory in which to create temporary files when reading WAL from a
+      tar archive with out-of-order segment files. If not set, the temporary
+      directory is created within the same directory as the tar archive.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support

Fix several bugs introduced by the pg_waldump archive WAL reading
feature:

- pg_waldump.c: The error path for verify_directory() printed waldir
  (which is NULL when --path is used) instead of walpath.

- archive_waldump.c: The error message for short reads had an operator
  precedence bug: (long long int) count - nbytes cast only count, not
  the subtraction result.  Also reported nbytes (the requested amount)
  instead of count (the total file size) for the "of" portion.

- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
  fname (allocated via pnstrdup/palloc).  Also changed the existing
  free(fname) to pfree(fname) for consistency.

- pg_verifybackup.c: The rename from --wal-directory to --wal-path
  didn't preserve the old spelling as a backward-compatible alias.

- pg_verifybackup.c: Fix double space before "Or" in --wal-path
  error hint message.
---
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
 		{"quiet", no_argument, NULL, 'q'},
 		{"skip-checksums", no_argument, NULL, 's'},
 		{"wal-path", required_argument, NULL, 'w'},
+		{"wal-directory", required_argument, NULL, 'w'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -376,7 +377,7 @@ main(int argc, char **argv)
 		else
 		{
 			pg_log_error("WAL archive not found");
-			pg_log_error_hint("Specify the correct path using the option -w/--wal-path.  "
+			pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
 							  "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
 			exit(1);
 		}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
 				read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
 				pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
 						 fname, privateInfo->archive_name,
-						 (long long int) count - nbytes,
-						 (long long int) nbytes);
+						 (long long int) (count - nbytes),
+						 (long long int) count);
 		}
 	}
 
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 						privateInfo->start_segno > segno ||
 						privateInfo->end_segno < segno)
 					{
-						free(fname);
+						pfree(fname);
 						break;
 					}
 				}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 				{
 					pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
 								   member->pathname, privateInfo->archive_name);
+					pfree(fname);
 					break;
 				}
 
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
 		/* validate path points to directory */
 		else if (!verify_directory(walpath))
 		{
-			pg_log_error("could not open directory \"%s\": %m", waldir);
+			pg_log_error("could not open directory \"%s\": %m", walpath);
 			goto bad_argument;
 		}
 	}


Attachments:

  [text/plain] cf5955-tar-wal-test.patch.no-cfbot (1.4K, 2-cf5955-tar-wal-test.patch.no-cfbot)
  download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification

The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream).  Add a test that takes
a tar-format backup and verifies it.
---
 src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
 	[ 'pg_verifybackup', $backup_path2 ],
 	'valid base backup with timeline > 1');
 
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+	[
+		'pg_basebackup',
+		'--pgdata' => $backup_path3,
+		'--no-sync',
+		'--format' => 'tar',
+		'--checkpoint' => 'fast'
+	],
+	"tar backup with separate pg_wal.tar");
+command_ok(
+	[ 'pg_verifybackup', $backup_path3 ],
+	'WAL verification succeeds with separate pg_wal.tar');
+
 done_testing();


  [text/plain] cf5955-tap-test-fix.patch.no-cfbot (17.5K, 3-cf5955-tap-test-fix.patch.no-cfbot)
  download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files

The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions.  When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.

Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file.  001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
 src/bin/pg_waldump/meson.build       |   1 +
 src/bin/pg_waldump/t/001_basic.pl    | 221 ++++++++++-----------------
 src/bin/pg_waldump/t/003_archive.pl  | 320 +++++++++++++++++++++++++++++++++++
 3 files changed, 396 insertions(+), 146 deletions(-)
 create mode 100644 src/bin/pg_waldump/t/003_archive.pl

diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
     'tests': [
       't/001_basic.pl',
       't/002_save_fullpage.pl',
+      't/003_archive.pl',
     ],
   },
 }
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
 
 use strict;
 use warnings FATAL => 'all';
-use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
 
 program_help_ok('pg_waldump');
 program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
 $$;
 });
 
-my $contrecord_lsn = $node->safe_psql('postgres',
-	'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+  $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
 # Generate contrecord record
 $node->safe_psql('postgres',
 	qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
 	return @lines;
 }
 
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
-	my ($archive, $directory, $compression_flags) = @_;
-
-	my @files;
-	opendir my $dh, $directory or die "opendir: $!";
-	while (my $entry = readdir $dh) {
-		# Skip '.' and '..'
-		next if $entry eq '.' || $entry eq '..';
-		push @files, $entry;
-	}
-	closedir $dh;
-
-	@files = shuffle @files;
-
-	# move into the WAL directory before archiving files
-	my $cwd = getcwd;
-	chdir($directory) || die "chdir: $!";
-	command_ok([$tar, $compression_flags, $archive, @files]);
-	chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
-	{
-		'path' => $node->data_dir,
-		'is_archive' => 0,
-		'enabled' => 1
-	},
-	{
-		'path' => "$tmp_dir/pg_wal.tar",
-		'compression_method' => 'none',
-		'compression_flags' => '-cf',
-		'is_archive' => 1,
-		'enabled' => 1
-	},
-	{
-		'path' => "$tmp_dir/pg_wal.tar.gz",
-		'compression_method' => 'gzip',
-		'compression_flags' => '-czf',
-		'is_archive' => 1,
-		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
-	});
-
-for my $scenario (@scenarios)
-{
-	my $path = $scenario->{'path'};
-
-	SKIP:
-	{
-		skip "tar command is not available", 3
-		  if !defined $tar;
-		skip "$scenario->{'compression_method'} compression not supported by this build", 3
-		  if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
-		  # create pg_wal archive
-		  if ($scenario->{'is_archive'})
-		  {
-			  generate_archive($path,
-				  $node->data_dir . '/pg_wal',
-				  $scenario->{'compression_flags'});
-		  }
-
-		command_fails_like(
-			[ 'pg_waldump', '--path' => $path ],
-			qr/error: no start WAL location given/,
-			'path option requires start location');
-		command_like(
-			[
-				'pg_waldump',
-				'--path' => $path,
-				'--start' => $start_lsn,
-				'--end' => $end_lsn,
-			],
-			qr/./,
-			'runs with path option and start and end locations');
-		command_fails_like(
-			[
-				'pg_waldump',
-				'--path' => $path,
-				'--start' => $start_lsn,
-			],
-			qr/error: error in WAL record at/,
-			'falling off the end of the WAL results in an error');
-
-		command_fails_like(
-			[
-				'pg_waldump', '--quiet',
-				'--path' => $path,
-				'--start' => $start_lsn
-			],
-			qr/error: error in WAL record at/,
-			'errors are shown with --quiet');
-
-		test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
-		my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
-		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
-		@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
-		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
-		test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
-		is(@lines, 6, 'limit option observed');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
-		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
-		like($lines[0], qr/WAL statistics/, "statistics on stdout");
-		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
-		like($lines[0], qr/WAL statistics/, "statistics on stdout");
-		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
-		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
-		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
-			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
-			0, 'only lines for selected relation');
-
-		@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
-			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
-			'--block' => 1);
-		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
-		# Cleanup.
-		unlink $path if $scenario->{'is_archive'};
-	}
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+	[ 'pg_waldump', '--path' => $path ],
+	qr/error: no start WAL location given/,
+	'path option requires start location');
+command_like(
+	[
+		'pg_waldump',
+		'--path' => $path,
+		'--start' => $start_lsn,
+		'--end' => $end_lsn,
+	],
+	qr/./,
+	'runs with path option and start and end locations');
+command_fails_like(
+	[
+		'pg_waldump',
+		'--path' => $path,
+		'--start' => $start_lsn,
+	],
+	qr/error: error in WAL record at/,
+	'falling off the end of the WAL results in an error');
+
+command_fails_like(
+	[
+		'pg_waldump', '--quiet',
+		'--path' => $path,
+		'--start' => $start_lsn
+	],
+	qr/error: error in WAL record at/,
+	'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+	0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+	$path, $start_lsn, $end_lsn,
+	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+	'--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
 
 done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+	plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+	'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+  $node->safe_psql('postgres',
+	q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+  );
+
+$node->safe_psql(
+	'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+    WITH random_json AS (
+        SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+            FROM unnest(array['a', 'b', 'c']) as u(key))
+          SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+	'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+	'postgres', q{
+DO $$
+DECLARE
+    wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+    remain int;
+    iters  int := 0;
+BEGIN
+    LOOP
+        INSERT into t1(b)
+        select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+        from generate_series(1, 10) g;
+
+        remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+        IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+            RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+            EXIT;
+        END IF;
+        iters := iters + 1;
+    END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+  $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+	qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+  $node->safe_psql('postgres',
+	q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+  );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+	my ($path, $startlsn, $endlsn) = @_;
+
+	my ($part1, $part2) = split qr{/}, $startlsn;
+	my $lsn2 = hex $part2;
+	$lsn2++;
+	my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+	my ($stdout, $stderr);
+
+	my $result = IPC::Run::run [
+		'pg_waldump',
+		'--start' => $new_start,
+		'--end' => $endlsn,
+		'--path' => $path,
+	  ],
+	  '>' => \$stdout,
+	  '2>' => \$stderr;
+	ok($result, "runs with start segment and start LSN specified");
+	like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+	my ($path, $startlsn, $endlsn, @opts) = @_;
+
+	my ($stdout, $stderr);
+
+	my $result = IPC::Run::run [
+		'pg_waldump',
+		'--start' => $startlsn,
+		'--end' => $endlsn,
+		'--path' => $path,
+		@opts
+	  ],
+	  '>' => \$stdout,
+	  '2>' => \$stderr;
+	ok($result, "pg_waldump @opts: runs ok");
+	is($stderr, '', "pg_waldump @opts: no stderr");
+	my @lines = split /\n/, $stdout;
+	ok(@lines > 0, "pg_waldump @opts: some lines are output");
+	return @lines;
+}
+
+sub generate_archive
+{
+	my ($archive, $directory, $compression_flags) = @_;
+
+	my @files;
+	opendir my $dh, $directory or die "opendir: $!";
+	while (my $entry = readdir $dh)
+	{
+		next if $entry eq '.' || $entry eq '..';
+		push @files, $entry;
+	}
+	closedir $dh;
+
+	@files = shuffle @files;
+
+	my $cwd = getcwd;
+	chdir($directory) || die "chdir: $!";
+	command_ok([ $tar, $compression_flags, $archive, @files ],
+		"create archive $archive");
+	chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+	{
+		'path' => "$tmp_dir/pg_wal.tar",
+		'compression_method' => 'none',
+		'compression_flags' => '-cf',
+		'enabled' => 1,
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar.gz",
+		'compression_method' => 'gzip',
+		'compression_flags' => '-czf',
+		'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+	});
+
+for my $scenario (@scenarios)
+{
+	my $path = $scenario->{'path'};
+	my $method = $scenario->{'compression_method'};
+
+  SKIP:
+	{
+		skip "$method compression not supported by this build", 1
+		  if !$scenario->{'enabled'};
+
+		generate_archive(
+			$path,
+			$node->data_dir . '/pg_wal',
+			$scenario->{'compression_flags'});
+
+		command_fails_like(
+			[ 'pg_waldump', '--path' => $path ],
+			qr/error: no start WAL location given/,
+			"$method: path option requires start location");
+		command_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+				'--end' => $end_lsn,
+			],
+			qr/./,
+			"$method: runs with path option and start and end locations");
+		command_fails_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+			],
+			qr/error: error in WAL record at/,
+			"$method: falling off the end of the WAL results in an error");
+
+		command_fails_like(
+			[
+				'pg_waldump', '--quiet',
+				'--path' => $path,
+				'--start' => $start_lsn
+			],
+			qr/error: error in WAL record at/,
+			"$method: errors are shown with --quiet");
+
+		test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+		my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+		is(grep(!/^rmgr: \w/, @lines),
+			0, "$method: all output lines are rmgr lines");
+
+		@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+		is(grep(!/^rmgr: \w/, @lines),
+			0, "$method: contrecord - all output lines are rmgr lines");
+
+		test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+		is(@lines, 6, "$method: limit option observed");
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+		is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+			0, "$method: all output lines are FPW");
+
+		@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+		like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+		like($lines[0], qr/WAL statistics/,
+			"$method: stats=record on stdout");
+		is(grep(/^rmgr:/, @lines),
+			0, "$method: no rmgr lines with stats=record");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+		is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+		@lines =
+		  test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+		is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+		# Cleanup.
+		unlink $path;
+	}
+}
+
+done_testing();


  [text/plain] cf5955-docs.patch.no-cfbot (2.4K, 4-cf5955-docs.patch.no-cfbot)
  download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix documentation for pg_waldump tar archive support

Two documentation issues with the tar archive reading feature:

- pg_waldump.sgml: When reading WAL from a tar archive with
  out-of-order segments, pg_waldump spills to temporary files.  TMPDIR
  controls where those files are created, but this was not documented
  in the Environment section.

- pg_verifybackup.sgml: The --wal-path option description still only
  said "directory" even though it now also accepts tar archives.
---
 doc/src/sgml/ref/pg_verifybackup.sgml |  7 ++++---
 doc/src/sgml/ref/pg_waldump.sgml      | 11 +++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 16b50b5a4df..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,9 +261,10 @@ PostgreSQL documentation
       <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
       <listitem>
        <para>
-        Try to parse WAL files stored in the specified directory, rather than
-        in <literal>pg_wal</literal>. This may be useful if the backup is
-        stored in a separate location from the WAL archive.
+        Try to parse WAL files stored in the specified directory or tar
+        archive, rather than in <literal>pg_wal</literal>. This may be
+        useful if the backup is stored in a separate location from the WAL
+        archive.
        </para>
       </listitem>
      </varlistentry>
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index b36323dde92..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -391,6 +391,17 @@ PostgreSQL documentation
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term><envar>TMPDIR</envar></term>
+    <listitem>
+     <para>
+      Directory in which to create temporary files when reading WAL from a
+      tar archive with out-of-order segment files. If not set, the temporary
+      directory is created within the same directory as the tar archive.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </refsect1>
 


  [text/plain] cf5955-fixes.patch.no-cfbot (3.6K, 5-cf5955-fixes.patch.no-cfbot)
  download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support

Fix several bugs introduced by the pg_waldump archive WAL reading
feature:

- pg_waldump.c: The error path for verify_directory() printed waldir
  (which is NULL when --path is used) instead of walpath.

- archive_waldump.c: The error message for short reads had an operator
  precedence bug: (long long int) count - nbytes cast only count, not
  the subtraction result.  Also reported nbytes (the requested amount)
  instead of count (the total file size) for the "of" portion.

- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
  fname (allocated via pnstrdup/palloc).  Also changed the existing
  free(fname) to pfree(fname) for consistency.

- pg_verifybackup.c: The rename from --wal-directory to --wal-path
  didn't preserve the old spelling as a backward-compatible alias.

- pg_verifybackup.c: Fix double space before "Or" in --wal-path
  error hint message.
---
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
 		{"quiet", no_argument, NULL, 'q'},
 		{"skip-checksums", no_argument, NULL, 's'},
 		{"wal-path", required_argument, NULL, 'w'},
+		{"wal-directory", required_argument, NULL, 'w'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -376,7 +377,7 @@ main(int argc, char **argv)
 		else
 		{
 			pg_log_error("WAL archive not found");
-			pg_log_error_hint("Specify the correct path using the option -w/--wal-path.  "
+			pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
 							  "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
 			exit(1);
 		}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
 				read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
 				pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
 						 fname, privateInfo->archive_name,
-						 (long long int) count - nbytes,
-						 (long long int) nbytes);
+						 (long long int) (count - nbytes),
+						 (long long int) count);
 		}
 	}
 
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 						privateInfo->start_segno > segno ||
 						privateInfo->end_segno < segno)
 					{
-						free(fname);
+						pfree(fname);
 						break;
 					}
 				}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 				{
 					pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
 								   member->pathname, privateInfo->archive_name);
+					pfree(fname);
 					break;
 				}
 
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
 		/* validate path points to directory */
 		else if (!verify_directory(walpath))
 		{
-			pg_log_error("could not open directory \"%s\": %m", waldir);
+			pg_log_error("could not open directory \"%s\": %m", walpath);
 			goto bad_argument;
 		}
 	}


view thread (85+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: pg_waldump: support decoding of WAL inside tarfile
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox