public inbox for [email protected]
help / color / mirror / Atom feedFrom: Andrew Dunstan <[email protected]>
To: Amul Sul <[email protected]>
To: Robert Haas <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: Tue, 3 Mar 2026 19:37:48 -0500
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAAJ_b96csGv+TvdxK-zp1Zo6zrAxOJ4n-qnxiWO1f0Lk0X0N_g@mail.gmail.com>
References: <CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com>
<CAAJ_b94SEcBVJcsp0Y1-YvLqZWBHPQH4FhRzSJfaH_ah_eL_FQ@mail.gmail.com>
<CAAJ_b97nn9denP2SAjuXyjwbd3is-VnZVSkiRMJ-5YNiKfO9MA@mail.gmail.com>
<CA+Tgmob_DB9QHDOxnP7a5Y0yJdeGqY8YNi+uK_811y7cN4mxYA@mail.gmail.com>
<CAAJ_b97VUiP-DbLNe-ddq64J_RiB4ZcPgAjHkJH-0dbzgjR++A@mail.gmail.com>
<CA+TgmoYMtcZBaqy9r59eDapaDy3WOdepkFFURu9MV-x-kxEbKg@mail.gmail.com>
<CAAJ_b95FOeW38gw-3BLmpdnTWHFimopTvf=eTObYUbTOC0x8qg@mail.gmail.com>
<CA+Tgmob=3POOO8st-v-fCjKCKREQ=+gs5_PBQnoFeNBdERfuEg@mail.gmail.com>
<CAAJ_b94=gtCeUKkGPUmPj_2SwHV+PiXQ8Mx-1RqfYj3pP3OwpA@mail.gmail.com>
<CA+TgmoYuC-92tMxZHn-KY=ExF-YkHkd4yOfiMhi9upqk3M1DuA@mail.gmail.com>
<CAAJ_b974evwS9+tmKVEV+ctEp1scRnLAMrmx9_pzLFHsPtr0hA@mail.gmail.com>
<CAAJ_b95H+=D2P_fLAbePCF++FAXO0fsJnxFr5NKVe4at=ga6WA@mail.gmail.com>
<CAAJ_b979p7Z-dBZhxBC3m2VPkYTvYzypMTWkYg2q-e1S5F_f-Q@mail.gmail.com>
<CAAJ_b96csGv+TvdxK-zp1Zo6zrAxOJ4n-qnxiWO1f0Lk0X0N_g@mail.gmail.com>
On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
> On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
>> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
>>> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
>>>> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
>>>>> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
>>>>>> In the attached version, I am using the WAL segment name as the hash
>>>>>> key, which is much more straightforward. I have rewritten
>>>>>> read_archive_wal_page(), and it looks much cleaner than before. The
>>>>>> logic to discard irrelevant WAL files is still within
>>>>>> get_archive_wal_entry. I added an explanation for setting cur_wal to
>>>>>> NULL, which is now handled in the separate function I mentioned
>>>>>> previously.
>>>>>>
>>>>>> Kindly have a look at the attached version; let me know if you are
>>>>>> still not happy with the current approach for filtering/discarding
>>>>>> irrelevant WAL segments. It isn't much different from the previous
>>>>>> version, but I have tried to keep it in a separate routine for better
>>>>>> code readability, with comments to make it easier to understand. I
>>>>>> also added a comment for ArchivedWALFile.
>>>>> I feel like the division of labor between get_archive_wal_entry() and
>>>>> read_archive_wal_page() is odd. I noticed this in the last version,
>>>>> too, and it still seems to be the case. get_archive_wal_entry() first
>>>>> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
>>>>> If it doesn't, it loops until an entry for the requested file shows up
>>>>> and then returns it. Then control returns to read_archive_wal_page()
>>>>> which loops some more until we have all the data we need for the
>>>>> requested file. But it seems odd to me to have two separate loops
>>>>> here. I think that the first loop is going to call read_archive_file()
>>>>> until we find the beginning of the file that we care about and then
>>>>> the second one is going to call read_archive_file() some more until we
>>>>> have read enough of it to satisfy the request. It feels odd to me to
>>>>> do it that way, as if we told somebody to first wait until 9 o'clock
>>>>> and then wait another 30 minutes, instead of just telling them to wait
>>>>> until 9:30. I realize it's not quite the same thing, because apart
>>>>> from calling read_archive_file(), the two loops do different things,
>>>>> but I still think it looks odd.
>>>>>
>>>>> + /*
>>>>> + * Ignore if the timeline is different or the current segment is not
>>>>> + * the desired one.
>>>>> + */
>>>>> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
>>>>> + if (privateInfo->timeline != curSegTimeline ||
>>>>> + privateInfo->startSegNo > curSegNo ||
>>>>> + privateInfo->endSegNo < curSegNo ||
>>>>> + segno > curSegNo)
>>>>> + {
>>>>> + free_archive_wal_entry(entry->fname, privateInfo);
>>>>> + continue;
>>>>> + }
>>>>>
>>>>> The comment doesn't match the code. If it did, the test would be
>>>>> (privateInfo->timeline != curSegTimeline || segno != curSegno). But
>>>>> instead the segno test is > rather than !=, and the checks against
>>>>> startSegNo and endSegNo aren't explained at all. I think I understand
>>>>> why the segno test uses > rather than !=, but it's the point of the
>>>>> comment to explain things like that, rather than leaving the reader to
>>>>> guess. And I don't know why we also need to test startSegNo and
>>>>> endSegNo.
>>>>>
>>>>> I also wonder what the point is of doing XLogFromFileName() on the
>>>>> fname provided by the caller and then again on entry->fname. Couldn't
>>>>> you just compare the strings?
>>>>>
>>>>> Again, the division of labor is really odd here. It's the job of
>>>>> astreamer_waldump_content() to skip things that aren't WAL files at
>>>>> all, but it's the job of get_archive_wal_entry() to skip things that
>>>>> are WAL files but not the one we want. I disagree with putting those
>>>>> checks in completely separate parts of the code.
>>>>>
>>>> Keeping the timeline and segment start-end range checks inside the
>>>> archive streamer creates a circular dependency that cannot be resolved
>>>> without a 'dirty hack'. We must read the first available WAL file page
>>>> to determine the wal_segment_size before it can calculate the target
>>>> segment range. Moving the checks inside the streamer would make it
>>>> impossible to process that initial file, as the necessary filtering
>>>> parameters -- would still be unknown which would need to be skipped
>>>> for the first read somehow. What if later we realized that the first
>>>> WAL file which was allowed to be streamed by skipping that check is
>>>> irrelevant and doesn't fall under the start-end segment range?
>>>>
>>> Please have a look at the attached version, specifically patch 0005.
>>> In astreamer_waldump_content(), I have moved the WAL file filtration
>>> check from get_archive_wal_entry(). This check will be skipped during
>>> the initial read in init_archive_reader(), which instead performs it
>>> explicitly once it determines the WAL segment size and the start/end
>>> segments.
>>>
>>> To access the WAL segment size inside astreamer_waldump_content(), I
>>> have moved the WAL segment size variable into the XLogDumpPrivate
>>> structure in the separate 0004 patch.
>> Attached is an updated version including the aforesaid changes. It
>> includes a new refactoring patch (0001) that moves the logic for
>> identifying tar archives and their compression types from
>> pg_basebackup and pg_verifybackup into a separate-reusable function,
>> per a suggestion from Euler [1]. Additionally, I have added a test
>> for the contrecord decoding to the main patch (now 0006).
>>
>> 1] http://postgr.es/m/[email protected]
>>
> Rebased against the latest master, fixed typos in code comments, and
> replaced palloc0 with palloc0_object.
>
Hi Amul.
I think this looks in pretty good shape.
Attached are patches for a few things I think could be fixed. They are
mostly self-explanatory. The TAP test fix is the only sane way I could
come up with stopping the skip code you had from reporting a wildly
inaccurate number of tests skipped. The sane way to do this from a
Test::More perspective is a subtest, but unfortunately meson does not
like subtest output, which is why we don't use it elsewhere, so the only
way I could come up with was to split this out into a separate test. Of
course, we might just say we don't care about the misreport, in which
case we could just live with things as they are.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification
The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream). Add a test that takes
a tar-format backup and verifies it.
---
src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files
The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions. When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.
Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file. 001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
src/bin/pg_waldump/meson.build | 1 +
src/bin/pg_waldump/t/001_basic.pl | 221 ++++++++++-----------------
src/bin/pg_waldump/t/003_archive.pl | 320 +++++++++++++++++++++++++++++++++++
3 files changed, 396 insertions(+), 146 deletions(-)
create mode 100644 src/bin/pg_waldump/t/003_archive.pl
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
+ 't/003_archive.pl',
],
},
}
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
use strict;
use warnings FATAL => 'all';
-use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
$$;
});
-my $contrecord_lsn = $node->safe_psql('postgres',
- 'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
# Generate contrecord record
$node->safe_psql('postgres',
qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
return @lines;
}
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
- my ($archive, $directory, $compression_flags) = @_;
-
- my @files;
- opendir my $dh, $directory or die "opendir: $!";
- while (my $entry = readdir $dh) {
- # Skip '.' and '..'
- next if $entry eq '.' || $entry eq '..';
- push @files, $entry;
- }
- closedir $dh;
-
- @files = shuffle @files;
-
- # move into the WAL directory before archiving files
- my $cwd = getcwd;
- chdir($directory) || die "chdir: $!";
- command_ok([$tar, $compression_flags, $archive, @files]);
- chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
- {
- 'path' => $node->data_dir,
- 'is_archive' => 0,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar",
- 'compression_method' => 'none',
- 'compression_flags' => '-cf',
- 'is_archive' => 1,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar.gz",
- 'compression_method' => 'gzip',
- 'compression_flags' => '-czf',
- 'is_archive' => 1,
- 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
- });
-
-for my $scenario (@scenarios)
-{
- my $path = $scenario->{'path'};
-
- SKIP:
- {
- skip "tar command is not available", 3
- if !defined $tar;
- skip "$scenario->{'compression_method'} compression not supported by this build", 3
- if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
- # create pg_wal archive
- if ($scenario->{'is_archive'})
- {
- generate_archive($path,
- $node->data_dir . '/pg_wal',
- $scenario->{'compression_flags'});
- }
-
- command_fails_like(
- [ 'pg_waldump', '--path' => $path ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
- command_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
- command_fails_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
- command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $path,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
- test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
- my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
- is(@lines, 6, 'limit option observed');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
- is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
- is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
- is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
- is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
- is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
- # Cleanup.
- unlink $path if $scenario->{'is_archive'};
- }
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+
+command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+ $path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+ plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+ 'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->safe_psql(
+ 'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+ WITH random_json AS (
+ SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+ FROM unnest(array['a', 'b', 'c']) as u(key))
+ SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+ 'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+ my ($path, $startlsn, $endlsn) = @_;
+
+ my ($part1, $part2) = split qr{/}, $startlsn;
+ my $lsn2 = hex $part2;
+ $lsn2++;
+ my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $new_start,
+ '--end' => $endlsn,
+ '--path' => $path,
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "runs with start segment and start LSN specified");
+ like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
+ @opts
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "pg_waldump @opts: runs ok");
+ is($stderr, '', "pg_waldump @opts: no stderr");
+ my @lines = split /\n/, $stdout;
+ ok(@lines > 0, "pg_waldump @opts: some lines are output");
+ return @lines;
+}
+
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh)
+ {
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = shuffle @files;
+
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([ $tar, $compression_flags, $archive, @files ],
+ "create archive $archive");
+ chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'enabled' => 1,
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+ });
+
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
+ my $method = $scenario->{'compression_method'};
+
+ SKIP:
+ {
+ skip "$method compression not supported by this build", 1
+ if !$scenario->{'enabled'};
+
+ generate_archive(
+ $path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ "$method: path option requires start location");
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ "$method: runs with path option and start and end locations");
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ "$method: falling off the end of the WAL results in an error");
+
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ "$method: errors are shown with --quiet");
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: all output lines are rmgr lines");
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: contrecord - all output lines are rmgr lines");
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, "$method: limit option observed");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+ 0, "$method: all output lines are FPW");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/,
+ "$method: stats=record on stdout");
+ is(grep(/^rmgr:/, @lines),
+ 0, "$method: no rmgr lines with stats=record");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+ # Cleanup.
+ unlink $path;
+ }
+}
+
+done_testing();
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix documentation for pg_waldump tar archive support
Two documentation issues with the tar archive reading feature:
- pg_waldump.sgml: When reading WAL from a tar archive with
out-of-order segments, pg_waldump spills to temporary files. TMPDIR
controls where those files are created, but this was not documented
in the Environment section.
- pg_verifybackup.sgml: The --wal-path option description still only
said "directory" even though it now also accepts tar archives.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 7 ++++---
doc/src/sgml/ref/pg_waldump.sgml | 11 +++++++++++
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 16b50b5a4df..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,9 +261,10 @@ PostgreSQL documentation
<term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index b36323dde92..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -391,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support
Fix several bugs introduced by the pg_waldump archive WAL reading
feature:
- pg_waldump.c: The error path for verify_directory() printed waldir
(which is NULL when --path is used) instead of walpath.
- archive_waldump.c: The error message for short reads had an operator
precedence bug: (long long int) count - nbytes cast only count, not
the subtraction result. Also reported nbytes (the requested amount)
instead of count (the total file size) for the "of" portion.
- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
fname (allocated via pnstrdup/palloc). Also changed the existing
free(fname) to pfree(fname) for consistency.
- pg_verifybackup.c: The rename from --wal-directory to --wal-path
didn't preserve the old spelling as a backward-compatible alias.
- pg_verifybackup.c: Fix double space before "Or" in --wal-path
error hint message.
---
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -376,7 +377,7 @@ main(int argc, char **argv)
else
{
pg_log_error("WAL archive not found");
- pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
"Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
exit(1);
}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
fname, privateInfo->archive_name,
- (long long int) count - nbytes,
- (long long int) nbytes);
+ (long long int) (count - nbytes),
+ (long long int) count);
}
}
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
{
- free(fname);
+ pfree(fname);
break;
}
}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
member->pathname, privateInfo->archive_name);
+ pfree(fname);
break;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
/* validate path points to directory */
else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
Attachments:
[text/plain] cf5955-tar-wal-test.patch.no-cfbot (1.4K, 2-cf5955-tar-wal-test.patch.no-cfbot)
download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification
The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream). Add a test that takes
a tar-format backup and verifies it.
---
src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
[text/plain] cf5955-tap-test-fix.patch.no-cfbot (17.5K, 3-cf5955-tap-test-fix.patch.no-cfbot)
download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files
The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions. When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.
Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file. 001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
src/bin/pg_waldump/meson.build | 1 +
src/bin/pg_waldump/t/001_basic.pl | 221 ++++++++++-----------------
src/bin/pg_waldump/t/003_archive.pl | 320 +++++++++++++++++++++++++++++++++++
3 files changed, 396 insertions(+), 146 deletions(-)
create mode 100644 src/bin/pg_waldump/t/003_archive.pl
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
+ 't/003_archive.pl',
],
},
}
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
use strict;
use warnings FATAL => 'all';
-use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
$$;
});
-my $contrecord_lsn = $node->safe_psql('postgres',
- 'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
# Generate contrecord record
$node->safe_psql('postgres',
qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
return @lines;
}
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
- my ($archive, $directory, $compression_flags) = @_;
-
- my @files;
- opendir my $dh, $directory or die "opendir: $!";
- while (my $entry = readdir $dh) {
- # Skip '.' and '..'
- next if $entry eq '.' || $entry eq '..';
- push @files, $entry;
- }
- closedir $dh;
-
- @files = shuffle @files;
-
- # move into the WAL directory before archiving files
- my $cwd = getcwd;
- chdir($directory) || die "chdir: $!";
- command_ok([$tar, $compression_flags, $archive, @files]);
- chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
- {
- 'path' => $node->data_dir,
- 'is_archive' => 0,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar",
- 'compression_method' => 'none',
- 'compression_flags' => '-cf',
- 'is_archive' => 1,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar.gz",
- 'compression_method' => 'gzip',
- 'compression_flags' => '-czf',
- 'is_archive' => 1,
- 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
- });
-
-for my $scenario (@scenarios)
-{
- my $path = $scenario->{'path'};
-
- SKIP:
- {
- skip "tar command is not available", 3
- if !defined $tar;
- skip "$scenario->{'compression_method'} compression not supported by this build", 3
- if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
- # create pg_wal archive
- if ($scenario->{'is_archive'})
- {
- generate_archive($path,
- $node->data_dir . '/pg_wal',
- $scenario->{'compression_flags'});
- }
-
- command_fails_like(
- [ 'pg_waldump', '--path' => $path ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
- command_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
- command_fails_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
- command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $path,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
- test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
- my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
- is(@lines, 6, 'limit option observed');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
- is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
- is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
- is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
- is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
- is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
- # Cleanup.
- unlink $path if $scenario->{'is_archive'};
- }
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+
+command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+ $path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+ plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+ 'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->safe_psql(
+ 'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+ WITH random_json AS (
+ SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+ FROM unnest(array['a', 'b', 'c']) as u(key))
+ SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+ 'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+ my ($path, $startlsn, $endlsn) = @_;
+
+ my ($part1, $part2) = split qr{/}, $startlsn;
+ my $lsn2 = hex $part2;
+ $lsn2++;
+ my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $new_start,
+ '--end' => $endlsn,
+ '--path' => $path,
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "runs with start segment and start LSN specified");
+ like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
+ @opts
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "pg_waldump @opts: runs ok");
+ is($stderr, '', "pg_waldump @opts: no stderr");
+ my @lines = split /\n/, $stdout;
+ ok(@lines > 0, "pg_waldump @opts: some lines are output");
+ return @lines;
+}
+
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh)
+ {
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = shuffle @files;
+
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([ $tar, $compression_flags, $archive, @files ],
+ "create archive $archive");
+ chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'enabled' => 1,
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+ });
+
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
+ my $method = $scenario->{'compression_method'};
+
+ SKIP:
+ {
+ skip "$method compression not supported by this build", 1
+ if !$scenario->{'enabled'};
+
+ generate_archive(
+ $path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ "$method: path option requires start location");
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ "$method: runs with path option and start and end locations");
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ "$method: falling off the end of the WAL results in an error");
+
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ "$method: errors are shown with --quiet");
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: all output lines are rmgr lines");
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: contrecord - all output lines are rmgr lines");
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, "$method: limit option observed");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+ 0, "$method: all output lines are FPW");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/,
+ "$method: stats=record on stdout");
+ is(grep(/^rmgr:/, @lines),
+ 0, "$method: no rmgr lines with stats=record");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+ # Cleanup.
+ unlink $path;
+ }
+}
+
+done_testing();
[text/plain] cf5955-docs.patch.no-cfbot (2.4K, 4-cf5955-docs.patch.no-cfbot)
download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix documentation for pg_waldump tar archive support
Two documentation issues with the tar archive reading feature:
- pg_waldump.sgml: When reading WAL from a tar archive with
out-of-order segments, pg_waldump spills to temporary files. TMPDIR
controls where those files are created, but this was not documented
in the Environment section.
- pg_verifybackup.sgml: The --wal-path option description still only
said "directory" even though it now also accepts tar archives.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 7 ++++---
doc/src/sgml/ref/pg_waldump.sgml | 11 +++++++++++
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 16b50b5a4df..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,9 +261,10 @@ PostgreSQL documentation
<term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index b36323dde92..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -391,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
[text/plain] cf5955-fixes.patch.no-cfbot (3.6K, 5-cf5955-fixes.patch.no-cfbot)
download | inline diff:
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support
Fix several bugs introduced by the pg_waldump archive WAL reading
feature:
- pg_waldump.c: The error path for verify_directory() printed waldir
(which is NULL when --path is used) instead of walpath.
- archive_waldump.c: The error message for short reads had an operator
precedence bug: (long long int) count - nbytes cast only count, not
the subtraction result. Also reported nbytes (the requested amount)
instead of count (the total file size) for the "of" portion.
- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
fname (allocated via pnstrdup/palloc). Also changed the existing
free(fname) to pfree(fname) for consistency.
- pg_verifybackup.c: The rename from --wal-directory to --wal-path
didn't preserve the old spelling as a backward-compatible alias.
- pg_verifybackup.c: Fix double space before "Or" in --wal-path
error hint message.
---
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -376,7 +377,7 @@ main(int argc, char **argv)
else
{
pg_log_error("WAL archive not found");
- pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
"Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
exit(1);
}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
fname, privateInfo->archive_name,
- (long long int) count - nbytes,
- (long long int) nbytes);
+ (long long int) (count - nbytes),
+ (long long int) count);
}
}
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
{
- free(fname);
+ pfree(fname);
break;
}
}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
member->pathname, privateInfo->archive_name);
+ pfree(fname);
break;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
/* validate path points to directory */
else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
view thread (85+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox