public inbox for [email protected]
85+ messages / 13 participants
* pg_waldump: support decoding of WAL inside tarfile
@ 2025-08-07 14:17 Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-08-07 14:17 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>
Hi All,
Attached is a patch series adding a new feature that lets pg_waldump
decode WAL files directly from a tar archive. This work addresses a
limitation in pg_verifybackup[1], which cannot parse WAL files from
tar-formatted backups.
The implementation will align with pg_waldump's existing xlogreader
design, which uses three callback functions to manage WAL segments:
open, read, and close. For tar archives, however, the approach will be
simpler. Instead of using separate callbacks for opening and closing,
the tar archive will be opened once at the start and closed explicitly
at the end.
The core logic will be in the WAL page reading callback. When
xlogreader requests a new WAL page, this callback will be invoked. It
will then call the archive streamer routine to read the WAL data from
the tar archive into a buffer. This data will then be copied into
xlogreader's own buffer, completing the read.
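As a rough illustration of the copy step described above, here is a minimal standalone sketch, not the patch's code. It assumes the streamer's output is a byte window ending at a known stream position. The WalBuffer type and copy_from_window() are hypothetical names introduced only for this example; the patch itself works with StringInfo and XLogRecPtr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the archive streamer's output buffer. */
typedef struct WalBuffer
{
	const char *data;		/* bytes extracted from the archive so far */
	size_t		len;		/* number of valid bytes in data */
	uint64_t	end_pos;	/* stream position just past the last byte */
} WalBuffer;

/*
 * Copy up to 'count' bytes starting at stream position 'pos' into 'dst'.
 * Returns the number of bytes copied, or 0 if 'pos' lies outside the
 * window, in which case the caller would have to fetch more data.
 */
static size_t
copy_from_window(const WalBuffer *buf, uint64_t pos, char *dst, size_t count)
{
	uint64_t	start_pos;
	size_t		skip;
	size_t		avail;
	size_t		n;

	/* Assumption of this sketch: the window never extends before 0. */
	assert(buf->end_pos >= buf->len);
	start_pos = buf->end_pos - buf->len;

	if (pos < start_pos || pos >= buf->end_pos)
		return 0;

	/* Skip the bytes that precede the requested position. */
	skip = (size_t) (pos - start_pos);
	avail = buf->len - skip;
	n = (count < avail) ? count : avail;

	memcpy(dst, buf->data + skip, n);
	return n;
}
```

A page-read callback built on this idea would call the helper in a loop, asking the streamer for more data whenever it returns 0 or fewer bytes than requested.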
Essentially, this is plumbing work: the new code will be responsible
for getting WAL data from the tar archive and feeding it to the
existing xlogreader. All other WAL page and record decoding logic,
which is already robust within xlogreader, will be reused as is.
This feature is being implemented as a series of patches:
- Refactoring: The first few patches (0001-0004) are dedicated to
refactoring and minor code changes.
- 0005: This patch introduces the core functionality for pg_waldump to
read WAL from a tar archive using the same archive streamer
(fe_utils/astreamer.h) used in pg_verifybackup. This version requires
WAL files in the archive to be in sequential order.
- 0006: This patch removes the sequential order restriction. If
pg_waldump encounters an out-of-order WAL file, it writes the file to
a temporary directory. The utility then continues decoding and
reads from this temporary location later.
- 0007 and onwards: These patches update pg_verifybackup to remove the
restriction on WAL parsing for tar-formatted backups. Patch 0008 renames
the "--wal-directory" switch to "--wal-path" to make it more generic,
allowing it to accept either a directory path or a tar archive path.
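To make the ordering rules in the list above concrete, here is a small hypothetical sketch of the decision taken for each archive member once its segment number is known. The SegAction enum and classify_segment() are illustrative names only and do not appear in the patches. Treating a segment below the next expected one as already consumed is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome for one WAL segment found in the archive. */
typedef enum SegAction
{
	SEG_SKIP,				/* outside the requested range, or already read */
	SEG_DECODE_NOW,			/* the next expected segment: stream it directly */
	SEG_SPILL_TO_TEMP		/* arrived early: save it for later decoding */
} SegAction;

/*
 * Decide what to do with segment 'seg', given the requested range
 * [start_seg, end_seg] and the next segment the decoder expects.
 */
static SegAction
classify_segment(uint64_t seg, uint64_t start_seg, uint64_t end_seg,
				 uint64_t next_seg)
{
	assert(start_seg <= end_seg);

	if (seg < start_seg || seg > end_seg)
		return SEG_SKIP;
	if (seg < next_seg)
		return SEG_SKIP;	/* assumption: this segment was already decoded */
	if (seg == next_seg)
		return SEG_DECODE_NOW;

	return SEG_SPILL_TO_TEMP;
}
```

With only the sequential-order version applied, the third outcome would be an error instead of a spill to the temporary directory.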
-----------------------------------
Known Issues & Status:
-----------------------------------
- Timeline Switching: The current implementation in patch 0006 does not
correctly handle timeline switching. This is a known issue, especially
when a timeline change occurs on a WAL file that has been written to a
temporary location.
- Testing: Local regression tests on CentOS and macOS M4 are passing.
However, some tests on macOS Sonoma (specifically 008_untar.pl and
010_client_untar.pl) are failing in the GitHub workflow with a "WAL
parsing failed for timeline 1" error. This issue is currently being
investigated.
Please take a look at the attached patch and let me know your
thoughts. This is an initial version, and I am making incremental
improvements to address known issues and limitations.
[1] https://git.postgresql.org/pg/commitdiff/8dfd3129027969fdd2d9d294220c867d2efd84aa
--
Regards,
Amul Sul
EDB: http://www.enterprisedb.com
Attachments:
[application/x-patch] v1-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.2K, 2-v1-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 420ab4e05566f81fb15488ae7060b9d5648994b5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v1 1/9] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This is in preparation for adding a second source file to this
directory.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..cd9a36d7447
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v1-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v1-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 30a226b1ae5ce3d1460bb3359c96a4e9a93d6b31 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v1 2/9] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (private->endptr != InvalidXLogRecPtr)
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v1-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v1-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 9e9121433ff4394238698bd27b3411daace5fd86 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v1 3/9] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v1-0004-pg_waldump-Rename-directory-creation-routine-for-.patch (1.6K, 5-v1-0004-pg_waldump-Rename-directory-creation-routine-for-.patch)
From 58062914fb56aa4f8e005dbd24e072251f3150b6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 29 Jul 2025 14:59:01 +0530
Subject: [PATCH v1 4/9] pg_waldump: Rename directory creation routine for
generalized use.
The create_fullpage_directory() function, currently used only for
storing full-page images from WAL records, should be renamed to a more
generalized name. This would allow it to be reused in future patches
for creating other directories as needed.
---
src/bin/pg_waldump/pg_waldump.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..4775275c07a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -114,11 +114,11 @@ verify_directory(const char *directory)
}
/*
- * Create if necessary the directory storing the full-page images extracted
- * from the WAL records read.
+ * Create the directory if it doesn't exist. Report an error if creation fails
+ * or if an existing directory is not empty.
*/
static void
-create_fullpage_directory(char *path)
+create_directory(char *path)
{
int ret;
@@ -1112,8 +1112,12 @@ main(int argc, char **argv)
}
}
+ /*
+ * Create if necessary the directory storing the full-page images
+ * extracted from the WAL records read.
+ */
if (config.save_fullpage_path != NULL)
- create_fullpage_directory(config.save_fullpage_path);
+ create_directory(config.save_fullpage_path);
/* parse files as start/end boundaries, extract path if not specified */
if (optind < argc)
--
2.47.1
[application/x-patch] v1-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch (34.3K, 6-v1-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 21d9d604ca4b4ab08c5bc32decf1afc8d881c43c Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v1 5/9] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 378 +++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 361 +++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.h | 21 +-
src/bin/pg_waldump/t/001_basic.pl | 64 ++++-
src/tools/pgindent/typedefs.list | 1 +
8 files changed, 765 insertions(+), 79 deletions(-)
create mode 100644 src/bin/pg_waldump/astreamer_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ astreamer_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..d0ac903c54e
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,378 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+typedef struct astreamer_waldump
+{
+ /* These fields don't change once initialized. */
+ astreamer base;
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
+ XLogDumpPrivate *privateInfo;
+
+ /* These fields change with archive member. */
+ bool skipThisSeg;
+ XLogSegNo nextSegNo; /* Next expected segment to stream */
+} astreamer_waldump;
+
+static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_member *member,
+ TimeLineID startTimeLineID,
+ XLogSegNo startSegNo,
+ XLogSegNo endSegNo,
+ XLogSegNo nextSegNo,
+ XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+ XLogDumpPrivate *privateInfo)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+ while (nbytes > 0)
+ {
+ char *buf = astreamer_buf->data;
+ int len = astreamer_buf->len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr = privateInfo->archive_streamer_read_ptr;
+ XLogRecPtr startPtr = (endPtr > len) ? endPtr - len : 0;
+
+ /*
+ * Ignore existing data if the required target page has not yet been
+ * read.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Reset the buffer */
+ resetStringInfo(astreamer_buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the archive streamer
+ * buffer, so skip bytes until reaching the desired offset of the
+ * target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /*
+ * Ensure we are reading the correct page, unless we've received an
+ * invalid record pointer. In that specific case, it's acceptable
+ * to read any page.
+ */
+ Assert(XLogRecPtrIsInvalid(recptr) ||
+ (recptr >= startPtr && recptr < endPtr));
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /* Fetch more data */
+ if (astreamer_archive_read(privateInfo) == 0)
+ break; /* No data remaining */
+ }
+ }
+
+ return (count - nbytes) ? (count - nbytes) : -1;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ /* Read more data from the tar file */
+ rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+astreamer *
+astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
+ XLogRecPtr endPtr, XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->base.bbs_next = next;
+ initStringInfo(&streamer->base.bbs_buffer);
+
+ if (XLogRecPtrIsInvalid(startptr))
+ streamer->startSegNo = 0;
+ else
+ {
+ XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+ /*
+ * Initialize the record pointer to the beginning of the first
+ * segment; this pointer will track the WAL record reading status.
+ */
+ XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+ }
+
+ if (XLogRecPtrIsInvalid(endPtr))
+ streamer->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+ streamer->nextSegNo = streamer->startSegNo;
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segNo;
+ TimeLineID timeline;
+
+ pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+ mystreamer->skipThisSeg = false;
+
+ if (!member_is_relevant_wal(member,
+ privateInfo->timeline,
+ mystreamer->startSegNo,
+ mystreamer->endSegNo,
+ mystreamer->nextSegNo,
+ &segNo, &timeline))
+ {
+ mystreamer->skipThisSeg = true;
+ break;
+ }
+
+ /*
+ * If nextSegNo is 0, the check is skipped, and any WAL file
+ * can be read -- this typically occurs during initial
+ * verification.
+ */
+ if (mystreamer->nextSegNo == 0)
+ break;
+
+ /* WAL segments must be archived in order */
+ if (mystreamer->nextSegNo != segNo)
+ {
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ mystreamer->nextSegNo, segNo);
+ exit(1);
+ }
+
+ /*
+ * We track the reading of WAL segment records using a pointer
+ * that's continuously incremented by the length of the
+ * received data. This pointer is crucial for serving WAL page
+ * requests from the WAL decoding routine, so it must be
+ * accurate.
+ */
+#ifdef USE_ASSERT_CHECKING
+ if (mystreamer->nextSegNo != 0)
+ {
+ XLogRecPtr recPtr;
+
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+ Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+ }
+#endif
+
+ /* Save the timeline */
+ privateInfo->timeline = timeline;
+
+ /* Update the next expected segment number */
+ mystreamer->nextSegNo += 1;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ /* Skip this segment */
+ if (mystreamer->skipThisSeg)
+ break;
+
+ /* Or, copy contents to buffer */
+ privateInfo->archive_streamer_read_ptr += len;
+ astreamer_buffer_bytes(streamer, &data, &len, len);
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+
+ pfree(streamer->bbs_buffer.data);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
+ XLogSegNo startSegNo, XLogSegNo endSegNo,
+ XLogSegNo nextSegNo, XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* WAL file could be with full path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ /* Ignore the older timeline */
+ if (startTimeLineID > timeline)
+ return false;
+
+ /* Skip if the current segment is not the desired one */
+ if (startSegNo > segNo || endSegNo < segNo)
+ return false;
+
+ *curSegNo = segNo;
+ *curSegTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'astreamer_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4775275c07a..64f3a65b735 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -182,10 +182,9 @@ open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
char fpath[MAXPGPATH];
+ char *dir = directory ? (char *) directory : ".";
- Assert(directory != NULL);
-
- snprintf(fpath, MAXPGPATH, "%s/%s", directory, fname);
+ snprintf(fpath, MAXPGPATH, "%s/%s", dir, fname);
fd = open(fpath, O_RDONLY | PG_BINARY, 0);
if (fd < 0 && errno != ENOENT)
@@ -326,6 +325,160 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Now, check the compression type of the tar */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Creates an appropriate chain of archive streamers for reading the given
+ * tar archive.
+ */
+static void
+setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
+ XLogRecPtr startptr, XLogRecPtr endptr)
+{
+ astreamer *streamer = NULL;
+
+ streamer = astreamer_waldump_content_new(NULL, startptr, endptr, private);
+
+ /*
+ * Final extracted WAL data will reside in this streamer. However, since
+ * it sits at the bottom of the stack and isn't designed to propagate data
+ * upward, we need to hold a pointer to its data buffer in order to copy.
+ */
+ private->archive_streamer_buf = &streamer->bbs_buffer;
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ private->archive_streamer = streamer;
+}
+
+/*
+ * Initializes the archive reader for a tar file.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+
+ /* Now, open the tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ private->archive_fd = fd;
+
+ /* Set up the tar archive reading facility */
+ setup_astreamer(private, compression, private->startptr, private->endptr);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(private->archive_streamer);
+
+ /* Close the file. */
+ if (close(private->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ PGAlignedXLogBlock buf;
+ int r;
+
+ setup_astreamer(private, compression, InvalidXLogRecPtr, InvalidXLogRecPtr);
+
+ /* Now, open the tar archive and store its file descriptor */
+ private->archive_fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (private->archive_fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ /* Read a WAL page */
+ r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+ /* Set WalSegSz if WAL data is successfully read */
+ if (r == XLOG_BLCKSZ)
+ {
+ XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+ WalSegSz),
+ private->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+ }
+ else
+ pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+ private->archive_name, r, XLOG_BLCKSZ);
+
+ free_tar_archive_reader(private);
+}
+
/* Returns the size in bytes of the data to be read. */
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +559,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPagePtr, reqLen);
+ int count = required_read_len(private, targetPtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -436,6 +589,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return astreamer_wal_read(readBuff, targetPagePtr, count, private);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +964,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +997,11 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_tar = false;
+ XLogReaderRoutine *routine = NULL;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +1133,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1297,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_tar_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_tar = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
pg_log_error("could not open directory \"%s\": %m", waldir);
goto bad_argument;
@@ -1129,44 +1334,23 @@ main(int argc, char **argv)
split_path(argv[optind], &directory, &fname);
- if (waldir == NULL && directory != NULL)
+ if (walpath == NULL && directory != NULL)
{
- waldir = directory;
+ walpath = directory;
- if (!verify_directory(waldir))
+ if (!verify_directory(walpath))
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (XLogRecPtrIsInvalid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_tar_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ waldir = walpath;
+ is_tar = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
+ waldir = identify_target_directory(walpath, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
@@ -1174,32 +1358,67 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_tar)
+ waldir = identify_target_directory(walpath, NULL);
+
+ /* Verify that the archive contains valid WAL files */
+ if (is_tar)
+ verify_tar_archive(&private, waldir, compression);
/* we don't know what to print */
if (XLogRecPtrIsInvalid(private.startptr))
@@ -1211,11 +1430,26 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
+ if (is_tar)
+ {
+ /* Set up for reading tar file */
+ init_tar_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ routine = XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ routine = XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment);
+ }
+
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
+ XLogReaderAllocate(WalSegSz, waldir, routine,
&private);
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1325,6 +1559,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_tar)
+ free_tar_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index cd9a36d7447..d2c2307d6c2 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,8 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+#include "lib/stringinfo.h"
extern int WalSegSz;
@@ -22,6 +24,23 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
+ until this record pointer */
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+
+extern astreamer *astreamer_waldump_content_new(astreamer *next,
+ XLogRecPtr startptr,
+ XLogRecPtr endptr,
+ XLogDumpPrivate *privateInfo);
+extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
+ XLogDumpPrivate *privateInfo);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..80298d2a51d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,27 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +287,22 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($node->data_dir . '/pg_wal/') || die "chdir: $!";
+ command_ok(
+ [ $tar, $scenario->{'compression_flags'}, $path , '.' ]);
+ chdir($cwd) || die "chdir: $!";
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +334,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e6f2e93b2d6..d8428ce2352 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3445,6 +3445,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
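
For readers following along, the suffix check in is_tar_file() above can be tried outside the tree. The following is only a minimal standalone sketch, not code from the patch: the sketch_* names and the enum are hypothetical stand-ins for pg_compress_algorithm, and the table-driven loop is one possible arrangement of the same comparisons. As in the patch, a name consisting of nothing but the suffix (for example ".tar") is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the real type. */
typedef enum
{
	SKETCH_COMPRESSION_NONE,
	SKETCH_COMPRESSION_GZIP,
	SKETCH_COMPRESSION_LZ4,
	SKETCH_COMPRESSION_ZSTD
} sketch_compression;

/* One row per archive suffix recognized by the patch. */
static const struct
{
	const char *suffix;
	sketch_compression algo;
}			sketch_suffixes[] = {
	{".tar", SKETCH_COMPRESSION_NONE},
	{".tgz", SKETCH_COMPRESSION_GZIP},
	{".tar.gz", SKETCH_COMPRESSION_GZIP},
	{".tar.lz4", SKETCH_COMPRESSION_LZ4},
	{".tar.zst", SKETCH_COMPRESSION_ZSTD},
};

/*
 * Returns true and sets *compression if fname ends in a known tar suffix.
 * *compression is left untouched when false is returned.
 */
bool
sketch_is_tar_file(const char *fname, sketch_compression *compression)
{
	size_t		fname_len;
	size_t		nsuffixes = sizeof(sketch_suffixes) / sizeof(sketch_suffixes[0]);

	assert(fname != NULL && compression != NULL);
	fname_len = strlen(fname);

	for (size_t i = 0; i < nsuffixes; i++)
	{
		size_t		suffix_len = strlen(sketch_suffixes[i].suffix);

		/* Require at least one character before the suffix. */
		if (fname_len > suffix_len &&
			strcmp(fname + fname_len - suffix_len,
				   sketch_suffixes[i].suffix) == 0)
		{
			*compression = sketch_suffixes[i].algo;
			return true;
		}
	}

	return false;
}
```

Note that the check looks only at the file name; whether the file really is a tar archive is discovered later, when verify_tar_archive() tries to read a WAL page from it.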
[application/x-patch] v1-0006-WIP-pg_waldump-Remove-the-restriction-on-the-orde.patch (18.0K, 7-v1-0006-WIP-pg_waldump-Remove-the-restriction-on-the-orde.patch)
From 7469b7b6bf3dd84d092fd86f69bf5ab574ee4f85 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 7 Aug 2025 17:37:23 +0530
Subject: [PATCH v1 6/9] WIP-pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files were
not in the required sequence. With this patch, decoding will now
continue. Any WAL file that is out of order will be written to a
temporary location, from which it will be read later. Once a temporary
file has been read, it will be removed.
TODO:
Timeline switching is not handled correctly, especially when a
timeline change occurs on the next WAL file that was previously
written to a temporary location.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/astreamer_waldump.c | 188 +++++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.c | 99 ++++++++++++-
src/bin/pg_waldump/pg_waldump.h | 1 +
src/bin/pg_waldump/t/001_basic.pl | 40 +++++-
5 files changed, 301 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..8a28b4f0f91 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ named <filename>pg_waldump_tmp_dir/</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
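
The directory selection described in the paragraph above can be illustrated with a small standalone sketch. This is not the patch's setup_tmp_dir(): sketch_tmp_dir_path() is a hypothetical helper that only builds the path (it does not create the directory), and the fallback to "." when no archive directory is known mirrors what the patch appears to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Builds "<base>/pg_waldump_tmp_dir" in dst, where <base> is $TMPDIR if that
 * variable is set, else the directory holding the archive, else ".".
 * Returns 0 on success, -1 if the result does not fit in dstlen bytes.
 */
int
sketch_tmp_dir_path(char *dst, size_t dstlen, const char *archive_dir)
{
	const char *base = getenv("TMPDIR");
	int			n;

	assert(dst != NULL && dstlen > 0);

	/* TMPDIR takes precedence over the archive's own directory. */
	if (base == NULL)
		base = (archive_dir != NULL) ? archive_dir : ".";

	n = snprintf(dst, dstlen, "%s/pg_waldump_tmp_dir", base);

	return (n >= 0 && (size_t) n < dstlen) ? 0 : -1;
}
```

With TMPDIR unset and an archive at /backups/pg_wal.tar, this yields /backups/pg_waldump_tmp_dir; with TMPDIR=/tmp it yields /tmp/pg_waldump_tmp_dir.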
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
index d0ac903c54e..a088c33b16f 100644
--- a/src/bin/pg_waldump/astreamer_waldump.c
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -18,6 +18,7 @@
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "common/file_perm.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
#include "pg_waldump.h"
@@ -37,6 +38,9 @@ typedef struct astreamer_waldump
/* These fields change with archive member. */
bool skipThisSeg;
+ bool writeThisSeg;
+ FILE *segFp;
+ SimpleStringList exportedSegList; /* Temporary exported segment list */
XLogSegNo nextSegNo; /* Next expected segment to stream */
} astreamer_waldump;
@@ -53,8 +57,11 @@ static bool member_is_relevant_wal(astreamer_member *member,
XLogSegNo startSegNo,
XLogSegNo endSegNo,
XLogSegNo nextSegNo,
+ char **curFname,
XLogSegNo *curSegNo,
TimeLineID *curSegTimeline);
+static bool member_needs_temp_write(astreamer_waldump *mystreamer,
+ const char *fname);
static const astreamer_ops astreamer_waldump_ops = {
.content = astreamer_waldump_content,
@@ -189,17 +196,8 @@ astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
if (XLogRecPtrIsInvalid(startptr))
streamer->startSegNo = 0;
else
- {
XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
- /*
- * Initialize the record pointer to the beginning of the first
- * segment; this pointer will track the WAL record reading status.
- */
- XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
- privateInfo->archive_streamer_read_ptr);
- }
-
if (XLogRecPtrIsInvalid(endPtr))
streamer->endSegNo = UINT64_MAX;
else
@@ -228,19 +226,21 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
case ASTREAMER_MEMBER_HEADER:
{
+ char *fname;
XLogSegNo segNo;
TimeLineID timeline;
pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
mystreamer->skipThisSeg = false;
+ mystreamer->writeThisSeg = false;
if (!member_is_relevant_wal(member,
privateInfo->timeline,
mystreamer->startSegNo,
mystreamer->endSegNo,
mystreamer->nextSegNo,
- &segNo, &timeline))
+ &fname, &segNo, &timeline))
{
mystreamer->skipThisSeg = true;
break;
@@ -254,24 +254,37 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->nextSegNo == 0)
break;
- /* WAL segments must be archived in order */
- if (mystreamer->nextSegNo != segNo)
+ /*
+ * When WAL segments are not archived sequentially, it becomes
+ * necessary to write out (or preserve) segments that might be
+ * required at a later point.
+ */
+ if (mystreamer->nextSegNo != segNo &&
+ member_needs_temp_write(mystreamer, fname))
{
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- mystreamer->nextSegNo, segNo);
- exit(1);
+ mystreamer->writeThisSeg = true;
+ break;
}
/*
- * We track the reading of WAL segment records using a pointer
- * that's continuously incremented by the length of the
- * received data. This pointer is crucial for serving WAL page
- * requests from the WAL decoding routine, so it must be
- * accurate.
+ * We are now streaming segment contents.
+ *
+ * We need to track the reading of WAL segment records using a
+ * pointer that's typically incremented by the length of the
+ * data read. However, we sometimes export the WAL file to
+ * temporary storage, allowing the decoding routine to read
+ * directly from there. This makes continuous pointer
+ * incrementing challenging, as file reads can occur from any
+ * offset, leading to potential errors. Therefore, we now
+ * reset the pointer when reading from a file for streaming.
+ * Also, if there's any existing data in the buffer, the next
+ * WAL record should logically follow it.
*/
#ifdef USE_ASSERT_CHECKING
- if (mystreamer->nextSegNo != 0)
+ Assert(!mystreamer->skipThisSeg);
+ Assert(!mystreamer->writeThisSeg);
+
+ if (privateInfo->archive_streamer_buf->len != 0)
{
XLogRecPtr recPtr;
@@ -280,6 +293,13 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
}
#endif
+ /*
+ * Initialize the read pointer to the beginning of the segment that is
+ * now being streamed through the buffer.
+ */
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+
/* Save the timeline */
privateInfo->timeline = timeline;
@@ -293,12 +313,44 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->skipThisSeg)
break;
+ /* Or, write contents to file */
+ if (mystreamer->writeThisSeg)
+ {
+ Assert(mystreamer->segFp != NULL);
+
+ errno = 0;
+ if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1)
+ {
+ char *fname;
+ int pathlen = strlen(member->pathname);
+
+ Assert(pathlen >= XLOG_FNAME_LEN);
+
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+
+ /*
+ * If write didn't set errno, assume problem is no disk
+ * space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m",
+ privateInfo->tmpdir, fname);
+ }
+ break;
+ }
+
/* Or, copy contents to buffer */
privateInfo->archive_streamer_read_ptr += len;
astreamer_buffer_bytes(streamer, &data, &len, len);
break;
case ASTREAMER_MEMBER_TRAILER:
+ if (mystreamer->segFp != NULL)
+ {
+ fclose(mystreamer->segFp);
+ mystreamer->segFp = NULL;
+ }
break;
case ASTREAMER_ARCHIVE_TRAILER:
@@ -325,8 +377,14 @@ astreamer_waldump_finalize(astreamer *streamer)
static void
astreamer_waldump_free(astreamer *streamer)
{
+ astreamer_waldump *mystreamer;
+
Assert(streamer->bbs_next == NULL);
+ mystreamer = (astreamer_waldump *) streamer;
+ if (mystreamer->segFp != NULL)
+ fclose(mystreamer->segFp);
+
pfree(streamer->bbs_buffer.data);
pfree(streamer);
}
@@ -339,8 +397,8 @@ astreamer_waldump_free(astreamer *streamer)
static bool
member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
XLogSegNo startSegNo, XLogSegNo endSegNo,
- XLogSegNo nextSegNo, XLogSegNo *curSegNo,
- TimeLineID *curSegTimeline)
+ XLogSegNo nextSegNo, char **curFname,
+ XLogSegNo *curSegNo, TimeLineID *curSegTimeline)
{
int pathlen;
XLogSegNo segNo;
@@ -371,8 +429,90 @@ member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
if (startSegNo > segNo || endSegNo < segNo)
return false;
+ /*
+ * A corner case where we've already streamed the contents of an archived
+ * WAL segment with a similar name, so ignore this duplicate.
+ */
+ if (nextSegNo > segNo)
+ return false;
+
+ *curFname = fname;
*curSegNo = segNo;
*curSegTimeline = timeline;
return true;
}
+
+/*
+ * Returns true and creates a temporary file if the given WAL segment needs to
+ * be written to temporary space. This is required when the segment is not the
+ * one currently being decoded. Conversely, if a temporary file for the
+ * preceding segment already exists and the current segment is its direct
+ * successor, then writing to temporary space is not necessary, and false is
+ * returned.
+ */
+static bool
+member_needs_temp_write(astreamer_waldump *mystreamer, const char *fname)
+{
+ bool exists;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ /*
+ * If we find a file that was previously written to the temporary space,
+ * it indicates that the corresponding WAL segment request has already
+ * been fulfilled. In that case, we increment the nextSegNo counter and
+ * check again whether the current segment number matches the required WAL
+ * segment (i.e. nextSegNo). If it does, we allow it to stream normally
+ * through the buffer. Otherwise, we write it to the temporary space, from
+ * where the caller is expected to read it directly.
+ */
+ do
+ {
+ char segName[MAXFNAMELEN];
+
+ XLogFileName(segName, timeline, mystreamer->nextSegNo, WalSegSz);
+
+ /*
+ * If the WAL segment has already been exported, increment the counter
+ * and check for the next segment.
+ */
+ exists = false;
+ if (simple_string_list_member(&mystreamer->exportedSegList, segName))
+ {
+ mystreamer->nextSegNo += 1;
+ exists = true;
+ }
+ } while (exists);
+
+ /*
+ * Need to export this segment to disk; create an empty placeholder file
+ * to be written once its content is received.
+ */
+ if (mystreamer->nextSegNo != segNo)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", privateInfo->tmpdir, fname);
+
+ mystreamer->segFp = fopen(fpath, PG_BINARY_W);
+ if (mystreamer->segFp == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ /* Record this segment's export to temporary space */
+ simple_string_list_append(&mystreamer->exportedSegList, fname);
+ return true;
+ }
+
+ return false;
+}
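
The decision made for each archive member (skip it, stream it through the buffer, or spill it to the temporary directory) can be sketched in isolation. The code below is a simplified model under stated assumptions, not the patch's logic verbatim: all sketch_* names are hypothetical, segments are tracked by number in a fixed-size array rather than by file name in a SimpleStringList, timelines are ignored, and advancing next_segno after a streamed segment is assumed to happen here rather than elsewhere in the streamer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_SPILLED 64

typedef enum
{
	SKETCH_SKIP,				/* already consumed, or a duplicate member */
	SKETCH_STREAM,				/* feed to the decoder through the buffer */
	SKETCH_SPILL				/* write to temporary space for later */
} sketch_action;

typedef struct
{
	uint64_t	next_segno;		/* next segment the decoder expects */
	uint64_t	spilled[SKETCH_MAX_SPILLED];
	int			nspilled;
} sketch_state;

/* Has this segment number already been written to temporary space? */
bool
sketch_was_spilled(const sketch_state *st, uint64_t segno)
{
	for (int i = 0; i < st->nspilled; i++)
	{
		if (st->spilled[i] == segno)
			return true;
	}
	return false;
}

/* Decide what to do with the archive member holding segment 'segno'. */
sketch_action
sketch_classify(sketch_state *st, uint64_t segno)
{
	/* Segments below the expected one were already streamed. */
	if (segno < st->next_segno)
		return SKETCH_SKIP;

	/* Segments spilled earlier satisfy the decoder; step past them. */
	while (sketch_was_spilled(st, st->next_segno))
		st->next_segno++;

	if (segno == st->next_segno)
	{
		st->next_segno++;
		return SKETCH_STREAM;
	}

	/* Stepped past it, or it is already on disk: nothing more to do. */
	if (segno < st->next_segno || sketch_was_spilled(st, segno))
		return SKETCH_SKIP;

	assert(st->nspilled < SKETCH_MAX_SPILLED);
	st->spilled[st->nspilled++] = segno;
	return SKETCH_SPILL;
}
```

For an archive whose members arrive as segments 3, 1, 2, 4 with segment 1 expected first, this model spills 3, streams 1 and 2, then steps past the spilled 3 and streams 4; the page-read callback is what would pick segment 3 up from the temporary directory in between.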
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 64f3a65b735..54a3b2dacda 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -325,6 +325,51 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Set up a temporary directory in which to store WAL segments.
+ */
+static char *
+setup_tmp_dir(char *waldir)
+{
+ char *tmpdir = waldir != NULL ? pstrdup(waldir) : pstrdup(".");
+
+ canonicalize_path(tmpdir);
+ tmpdir = psprintf("%s/pg_waldump_tmp_dir",
+ getenv("TMPDIR") ? getenv("TMPDIR") : tmpdir);
+
+ create_directory(tmpdir);
+
+ return tmpdir;
+}
+
+/*
+ * Removes a directory along with its contents, if any.
+ */
+static void
+remove_tmp_dir(char *tmpdir)
+{
+ DIR *dir;
+ struct dirent *de;
+
+ dir = opendir(tmpdir);
+ while ((de = readdir(dir)) != NULL)
+ {
+ char path[MAXPGPATH];
+
+ if (strcmp(de->d_name, ".") == 0 ||
+ strcmp(de->d_name, "..") == 0)
+ continue;
+
+ snprintf(path, MAXPGPATH, "%s/%s", tmpdir, de->d_name);
+ unlink(path);
+ }
+ closedir(dir);
+
+ if (rmdir(tmpdir) < 0)
+ pg_log_error("could not remove directory \"%s\": %m",
+ tmpdir);
+}
+
/*
* Returns true if the given file is a tar archive and outputs its compression
* algorithm.
@@ -559,7 +604,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -618,12 +663,53 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ *
+ * XXX: Timeline change is not handled.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ char fname[MAXPGPATH];
+
+ if (state->seg.ws_file >= 0)
+ {
+ char fpath[MAXPGPATH];
+
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ snprintf(fpath, MAXPGPATH, "%s/%s", private->tmpdir, fname);
+ unlink(fpath);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, private->timeline, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(private->tmpdir, fname);
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return astreamer_wal_read(readBuff, targetPagePtr, count, private);
}
@@ -1435,6 +1521,9 @@ main(int argc, char **argv)
/* Set up for reading tar file */
init_tar_archive_reader(&private, waldir, compression);
+ /* Create temporary space for writing WAL segments. */
+ private.tmpdir = setup_tmp_dir(waldir);
+
/* Routine to decode WAL files in tar archive */
routine = XL_ROUTINE(.page_read = TarWALDumpReadPage,
.segment_open = TarWALDumpOpenSegment,
@@ -1549,6 +1638,10 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
XLogDumpDisplayStats(&config, &stats);
+ /* Remove temporary directory if any */
+ if (private.tmpdir != NULL)
+ remove_tmp_dir(private.tmpdir);
+
if (time_to_stop)
exit(0);
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index d2c2307d6c2..2644d847b47 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -33,6 +33,7 @@ typedef struct XLogDumpPrivate
StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
until this record pointer */
+ char *tmpdir;
} XLogDumpPrivate;
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 80298d2a51d..a3bf950db97 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,8 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use File::Path qw(rmtree);
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -258,6 +260,32 @@ sub test_pg_waldump
return @lines;
}
+# Create a tar archive, shuffling the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = shuffle @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+
+ # give necessary permission
+ chmod(0755, $archive) || die "chmod $archive: $!";
+}
+
my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
@@ -291,16 +319,16 @@ for my $scenario (@scenario)
if !defined $tar;
skip "$scenario->{'compression_method'} compression not supported by this build", 3
if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+ skip "unix-style permissions not supported on Windows", 3
+ if ($scenario->{'is_archive'}
+ && ($windows_os || $Config::Config{osname} eq 'cygwin'));
# create pg_wal archive
if ($scenario->{'is_archive'})
{
- # move into the WAL directory before archiving files
- my $cwd = getcwd;
- chdir($node->data_dir . '/pg_wal/') || die "chdir: $!";
- command_ok(
- [ $tar, $scenario->{'compression_flags'}, $path , '.' ]);
- chdir($cwd) || die "chdir: $!";
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
}
command_fails_like(
--
2.47.1
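The segment-switch check in the 0006 patch above relies on three pieces of WAL address arithmetic: mapping an LSN to a segment number (XLByteToSeg), testing whether an LSN lies in a given segment (XLByteInSeg), and deriving the segment's file name (XLogFileName). A minimal standalone sketch of that arithmetic follows. The type and function names (lsn_t, lsn_to_segno, wal_file_name, is_segment_file) are illustrative stand-ins, not PostgreSQL identifiers, and the code handles a single timeline only, matching the XXX note in the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for XLogRecPtr, XLogSegNo and TimeLineID. */
typedef uint64_t lsn_t;
typedef uint64_t segno_t;
typedef uint32_t tli_t;

/* Same arithmetic as XLByteToSeg(): which segment holds this LSN? */
static segno_t
lsn_to_segno(lsn_t lsn, uint32_t wal_segsz)
{
    return lsn / wal_segsz;
}

/* Same test as XLByteInSeg(): does the LSN fall inside segment segno? */
static bool
lsn_in_seg(lsn_t lsn, segno_t segno, uint32_t wal_segsz)
{
    return lsn_to_segno(lsn, wal_segsz) == segno;
}

/* Same layout as XLogFileName(): timeline, then segno split into two halves. */
static void
wal_file_name(char *buf, size_t buflen, tli_t tli, segno_t segno,
              uint32_t wal_segsz)
{
    uint64_t    segs_per_xlogid = UINT64_C(0x100000000) / wal_segsz;

    assert(buflen >= 25);       /* 24 hex digits plus terminating NUL */
    snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / segs_per_xlogid),
             (uint32_t) (segno % segs_per_xlogid));
}

/* Hypothetical helper: is this directory entry the file for that segment? */
static bool
is_segment_file(const char *entry_name, tli_t tli, segno_t segno,
                uint32_t wal_segsz)
{
    char        expected[32];

    wal_file_name(expected, sizeof(expected), tli, segno, wal_segsz);
    return strcmp(entry_name, expected) == 0;
}
```

With the default 16MB segment size, LSN 0/3000028 maps to segment 3, whose file on timeline 1 is 000000010000000000000003. A page request for 0/4000000 fails the in-segment test, which is the point at which the patch closes and unlinks the old temporary file.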
[application/x-patch] v1-0007-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 8-v1-0007-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From 10816f545e7f2f3df1fb9075321d2bd81df195d4 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v1 7/9] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v1-0008-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 9-v1-0008-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From c8187d4996df271117afb623db6c72d3033d4b06 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v1 8/9] pg_verifybackup: Rename the wal-directory switch to
wal-path
Future patches to pg_waldump will enable it to decode WAL directly
from tar files. This means you'll be able to specify a tar archive
path instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v1-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.3K, 10-v1-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From 923a767b076e04c75f6472d2800a22ca99a31d53 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v1 9/9] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 3 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 3 +-
6 files changed, 50 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("wal archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..0cfe1f9532c 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -123,8 +123,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..76269a73673 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -137,8 +137,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
--
2.47.1
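The default WAL location chosen by the 0009 patch above can be read as a small decision function: an explicit -w/--wal-path wins; otherwise a plain-format backup uses its pg_wal directory, and a tar-format backup prefers a separate pg_wal archive and falls back to the base archive. The following is a minimal sketch of that order, not the patch's code. The names archive_kind, classify_archive and choose_wal_path are hypothetical, and classification here looks only at the name prefix, whereas the patch goes on to validate the suffix.

```c
#include <stddef.h>
#include <string.h>

typedef enum
{
    ARCHIVE_BASE,               /* name starts with "base" */
    ARCHIVE_WAL,                /* name starts with "pg_wal" */
    ARCHIVE_OTHER               /* e.g. a <tablespaceoid>.tar candidate */
} archive_kind;

/* Prefix test only, mirroring the strncmp() calls in the patch. */
static archive_kind
classify_archive(const char *relpath)
{
    if (strncmp("base", relpath, 4) == 0)
        return ARCHIVE_BASE;
    if (strncmp("pg_wal", relpath, 6) == 0)
        return ARCHIVE_WAL;
    return ARCHIVE_OTHER;
}

/*
 * Pick the path handed to pg_waldump.  Returns NULL when nothing usable
 * was found, which is where the patch reports "wal archive not found".
 */
static const char *
choose_wal_path(char format,
                const char *user_wal_path,
                const char *plain_wal_dir,
                const char *wal_archive_path,
                const char *base_archive_path)
{
    if (user_wal_path != NULL)
        return user_wal_path;   /* explicit -w/--wal-path */
    if (format == 'p')
        return plain_wal_dir;   /* <backup_directory>/pg_wal */
    if (wal_archive_path != NULL)
        return wal_archive_path;    /* separate pg_wal archive */
    return base_archive_path;   /* may be NULL: no archive found */
}
```

Falling back to the base archive reflects the comment in the patch: when no separate WAL archive exists, the WAL files are most likely inside the main data directory archive.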
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-08-25 12:28 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-08-25 12:28 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>
On Thu, Aug 7, 2025 at 7:47 PM Amul Sul <[email protected]> wrote:
> [....]
> -----------------------------------
> Known Issues & Status:
> -----------------------------------
> - Timeline Switching: The current implementation in patch 006 does not
> correctly handle timeline switching. This is a known issue, especially
> when a timeline change occurs on a WAL file that has been written to a
> temporary location.
>
This is still pending and will be addressed in the next version.
Therefore, patch 0006 remains marked as WIP.
> - Testing: Local regression tests on CentOS and macOS M4 are passing.
> However, some tests on macOS Sonoma (specifically 008_untar.pl and
> 010_client_untar.pl) are failing in the GitHub workflow with a "WAL
> parsing failed for timeline 1" error. This issue is currently being
> investigated.
>
This has been fixed in the attached version; all GitHub workflow tests
are now fine.
Regards,
Amul
Attachments:
[application/x-patch] v2-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.2K, 2-v2-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 233cf0977b18100916b0204ad7e57445c420dae6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v2 1/9] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This is in preparation for adding a second source file to this
directory.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v2-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v2-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From b46d48d7cc9fc42257f3d6da6850ff20f9461ae9 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v2 2/9] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size is moved into a separate helper, required_read_len(),
so that it can be shared by the code paths that read normal WAL files
and WAL files located inside a tar archive.
---
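The read-size rule factored out by this patch can be tried in isolation.
What follows is a minimal standalone sketch, not the patch's code:
ReadLimit and sketch_read_len() are hypothetical names, and the
XLogRecPtr typedef and the 8192-byte XLOG_BLCKSZ are stand-ins assumed
here in place of PostgreSQL's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL definitions; assumed values for illustration. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define XLOG_BLCKSZ 8192

/* Hypothetical reduction of XLogDumpPrivate to the two fields used here. */
typedef struct ReadLimit
{
	XLogRecPtr	endptr;			/* stop decoding here; 0 means "no limit" */
	bool		endptr_reached; /* set once a request would pass endptr */
} ReadLimit;

/*
 * Returns how many bytes to read for the page starting at targetPagePtr,
 * or -1 if even the reqLen bytes the caller needs would run past endptr.
 */
int
sketch_read_len(ReadLimit *limit, XLogRecPtr targetPagePtr, int reqLen)
{
	assert(reqLen >= 0 && reqLen <= XLOG_BLCKSZ);

	/* No end location given: always read a whole page. */
	if (limit->endptr == InvalidXLogRecPtr)
		return XLOG_BLCKSZ;

	/* The whole page lies before the end location. */
	if (targetPagePtr + XLOG_BLCKSZ <= limit->endptr)
		return XLOG_BLCKSZ;

	/* Only part of the page is before the end location: read up to it. */
	if (targetPagePtr + reqLen <= limit->endptr)
		return (int) (limit->endptr - targetPagePtr);

	/* The requested bytes extend beyond the end location. */
	limit->endptr_reached = true;
	return -1;
}
```

With endptr = 20000, a request for the page at 16384 with reqLen = 100
yields 3616 bytes (20000 - 16384), while reqLen = 4000 would run past
endptr, so the sketch returns -1 and sets endptr_reached.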
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (private->endptr != InvalidXLogRecPtr)
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v2-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v2-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 8e3236330522fae0674508cebc7d2f1d8379271f Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v2 3/9] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v2-0004-pg_waldump-Rename-directory-creation-routine-for-.patch (1.6K, 5-v2-0004-pg_waldump-Rename-directory-creation-routine-for-.patch)
From 4baa0189d624cd4606dcab5f5417cf7d305a8223 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 29 Jul 2025 14:59:01 +0530
Subject: [PATCH v2 4/9] pg_waldump: Rename directory creation routine for
generalized use.
The create_fullpage_directory() function, currently used only for
storing full-page images from WAL records, is renamed to
create_directory(). The more general name allows it to be reused in
future patches for creating other directories as needed.
---
src/bin/pg_waldump/pg_waldump.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..4775275c07a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -114,11 +114,11 @@ verify_directory(const char *directory)
}
/*
- * Create if necessary the directory storing the full-page images extracted
- * from the WAL records read.
+ * Create the directory if it doesn't exist. Report an error if creation fails
+ * or if an existing directory is not empty.
*/
static void
-create_fullpage_directory(char *path)
+create_directory(char *path)
{
int ret;
@@ -1112,8 +1112,12 @@ main(int argc, char **argv)
}
}
+ /*
+ * Create if necessary the directory storing the full-page images
+ * extracted from the WAL records read.
+ */
if (config.save_fullpage_path != NULL)
- create_fullpage_directory(config.save_fullpage_path);
+ create_directory(config.save_fullpage_path);
/* parse files as start/end boundaries, extract path if not specified */
if (optind < argc)
--
2.47.1
[application/x-patch] v2-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch (34.7K, 6-v2-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 5f97922e34e7d5a523dbe718a1b61ebb08c7403e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v2 5/9] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
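The sequential-order requirement described above can be illustrated
outside the patch. The sketch below is a simplified assumption-laden
model, not the patch's implementation: parse_wal_name() and
first_out_of_order() are hypothetical helpers, a start segment is always
given, and the end-of-range and timeline checks are left out. It relies
on the standard WAL segment file name layout (24 uppercase hex digits:
timeline, log id, segment within log id).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 24		/* 8 hex digits each: timeline, log, seg */

/*
 * Hypothetical helper: extract timeline and segment number from an archive
 * member path whose last 24 characters form a WAL segment file name.
 */
bool
parse_wal_name(const char *path, uint64_t seg_size,
			   uint32_t *tli, uint64_t *segno)
{
	size_t		len = strlen(path);
	const char *fname;
	unsigned int t, log, seg;

	assert(seg_size > 0);

	if (len < WAL_FNAME_LEN)
		return false;

	/* The member may carry a directory prefix, e.g. "pg_wal/". */
	fname = path + (len - WAL_FNAME_LEN);
	if (strspn(fname, "0123456789ABCDEF") != WAL_FNAME_LEN)
		return false;
	if (sscanf(fname, "%8X%8X%8X", &t, &log, &seg) != 3)
		return false;

	*tli = t;
	/* Each log id spans 4 GB, i.e. 0x100000000 / seg_size segments. */
	*segno = (uint64_t) log * (UINT64_C(0x100000000) / seg_size) + seg;
	return true;
}

/*
 * Hypothetical check: returns the index of the first member that breaks
 * sequential order, or -1 if all relevant members are in order. Members
 * that are not WAL files, or that precede start_segno, are skipped.
 */
int
first_out_of_order(const char *const *members, int n,
				   uint64_t seg_size, uint64_t start_segno)
{
	uint64_t	next = start_segno;

	for (int i = 0; i < n; i++)
	{
		uint32_t	tli;
		uint64_t	segno;

		if (!parse_wal_name(members[i], seg_size, &tli, &segno))
			continue;
		if (segno < start_segno)
			continue;
		if (segno != next)
			return i;
		next++;
	}
	return -1;
}
```

For example, with 16 MB segments the member
"pg_wal/0000000100000001000000FF" parses to timeline 1 and segment
number 511 (1 * 256 + 255).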
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 378 +++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 361 +++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.h | 21 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++++-
src/tools/pgindent/typedefs.list | 1 +
8 files changed, 785 insertions(+), 79 deletions(-)
create mode 100644 src/bin/pg_waldump/astreamer_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ astreamer_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..916d388ef0c
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,378 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+typedef struct astreamer_waldump
+{
+ /* These fields don't change once initialized. */
+ astreamer base;
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
+ XLogDumpPrivate *privateInfo;
+
+ /* These fields change with archive member. */
+ bool skipThisSeg;
+ XLogSegNo nextSegNo; /* Next expected segment to stream */
+} astreamer_waldump;
+
+static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_member *member,
+ TimeLineID startTimeLineID,
+ XLogSegNo startSegNo,
+ XLogSegNo endSegNo,
+ XLogSegNo nextSegNo,
+ XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+ XLogDumpPrivate *privateInfo)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+ while (nbytes > 0)
+ {
+ char *buf = astreamer_buf->data;
+ int len = astreamer_buf->len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr = privateInfo->archive_streamer_read_ptr;
+ XLogRecPtr startPtr = (endPtr > len) ? endPtr - len : 0;
+
+ /*
+ * Ignore existing data if the required target page has not yet been
+ * read.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Reset the buffer */
+ resetStringInfo(astreamer_buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the archive streamer
+ * buffer, so skip bytes until reaching the desired offset of the
+ * target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /*
+ * Ensure we are reading the correct page, unless we've received
+ * an invalid record pointer. In that specific case, it's
+ * acceptable to read any page.
+ */
+ Assert(XLogRecPtrIsInvalid(recptr) ||
+ (recptr >= startPtr && recptr < endPtr));
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /* Fetch more data */
+ if (astreamer_archive_read(privateInfo) == 0)
+ break; /* No data remaining */
+ }
+ }
+
+ return (count - nbytes) ? (count - nbytes) : -1;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ /* Read more data from the tar file */
+ rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents of
+ * the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+astreamer *
+astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
+ XLogRecPtr endPtr, XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->base.bbs_next = next;
+ initStringInfo(&streamer->base.bbs_buffer);
+
+ if (XLogRecPtrIsInvalid(startptr))
+ streamer->startSegNo = 0;
+ else
+ {
+ XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+ /*
+ * Initialize the record pointer to the beginning of the first
+ * segment; this pointer will track the WAL record reading status.
+ */
+ XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+ }
+
+ if (XLogRecPtrIsInvalid(endPtr))
+ streamer->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+ streamer->nextSegNo = streamer->startSegNo;
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segNo;
+ TimeLineID timeline;
+
+ pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+ mystreamer->skipThisSeg = false;
+
+ if (!member_is_relevant_wal(member,
+ privateInfo->timeline,
+ mystreamer->startSegNo,
+ mystreamer->endSegNo,
+ mystreamer->nextSegNo,
+ &segNo, &timeline))
+ {
+ mystreamer->skipThisSeg = true;
+ break;
+ }
+
+ /*
+ * If nextSegNo is 0, the check is skipped, and any WAL file
+ * can be read -- this typically occurs during initial
+ * verification.
+ */
+ if (mystreamer->nextSegNo == 0)
+ break;
+
+ /* WAL segments must be archived in order */
+ if (mystreamer->nextSegNo != segNo)
+ {
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ mystreamer->nextSegNo, segNo);
+ exit(1);
+ }
+
+ /*
+ * We track the reading of WAL segment records using a pointer
+ * that's continuously incremented by the length of the
+ * received data. This pointer is crucial for serving WAL page
+ * requests from the WAL decoding routine, so it must be
+ * accurate.
+ */
+#ifdef USE_ASSERT_CHECKING
+ if (mystreamer->nextSegNo != 0)
+ {
+ XLogRecPtr recPtr;
+
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+ Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+ }
+#endif
+
+ /* Save the timeline */
+ privateInfo->timeline = timeline;
+
+ /* Update the next expected segment number */
+ mystreamer->nextSegNo += 1;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ /* Skip this segment */
+ if (mystreamer->skipThisSeg)
+ break;
+
+ /* Or, copy contents to buffer */
+ privateInfo->archive_streamer_read_ptr += len;
+ astreamer_buffer_bytes(streamer, &data, &len, len);
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+
+ pfree(streamer->bbs_buffer.data);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
+ XLogSegNo startSegNo, XLogSegNo endSegNo,
+ XLogSegNo nextSegNo, XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ /* Ignore the older timeline */
+ if (startTimeLineID > timeline)
+ return false;
+
+ /* Skip if the current segment is not the desired one */
+ if (startSegNo > segNo || endSegNo < segNo)
+ return false;
+
+ *curSegNo = segNo;
+ *curSegTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'astreamer_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4775275c07a..64f3a65b735 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -182,10 +182,9 @@ open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
char fpath[MAXPGPATH];
+ char *dir = directory ? (char *) directory : ".";
- Assert(directory != NULL);
-
- snprintf(fpath, MAXPGPATH, "%s/%s", directory, fname);
+ snprintf(fpath, MAXPGPATH, "%s/%s", dir, fname);
fd = open(fpath, O_RDONLY | PG_BINARY, 0);
if (fd < 0 && errno != ENOENT)
@@ -326,6 +325,160 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Creates an appropriate chain of archive streamers for reading the given
+ * tar archive.
+ */
+static void
+setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
+ XLogRecPtr startptr, XLogRecPtr endptr)
+{
+ astreamer *streamer = NULL;
+
+ streamer = astreamer_waldump_content_new(NULL, startptr, endptr, private);
+
+ /*
+ * Final extracted WAL data will reside in this streamer. However, since
+ * it sits at the bottom of the stack and isn't designed to propagate data
+ * upward, we need to hold a pointer to its data buffer in order to copy.
+ */
+ private->archive_streamer_buf = &streamer->bbs_buffer;
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ private->archive_streamer = streamer;
+}
+
+/*
+ * Initializes the archive reader for a tar file.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+
+ /* Open the tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ private->archive_fd = fd;
+
+ /* Setup tar archive reading facility */
+ setup_astreamer(private, compression, private->startptr, private->endptr);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(private->archive_streamer);
+
+ /* Close the file. */
+ if (close(private->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ PGAlignedXLogBlock buf;
+ int r;
+
+ setup_astreamer(private, compression, InvalidXLogRecPtr, InvalidXLogRecPtr);
+
+ /* Open the tar archive and store its file descriptor */
+ private->archive_fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (private->archive_fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ /* Read a WAL page */
+ r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+ /* Set WalSegSz if WAL data is successfully read */
+ if (r == XLOG_BLCKSZ)
+ {
+ XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+ WalSegSz),
+ private->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+ }
+ else
+ pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+ private->archive_name, r, XLOG_BLCKSZ);
+
+ free_tar_archive_reader(private);
+}
+
/* Returns the size in bytes of the data to be read. */
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +559,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPagePtr, reqLen);
+ int count = required_read_len(private, targetPtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -436,6 +589,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return astreamer_wal_read(readBuff, targetPagePtr, count, private);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +964,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +997,11 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_tar = false;
+ XLogReaderRoutine *routine = NULL;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +1133,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1297,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_tar_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_tar = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
pg_log_error("could not open directory \"%s\": %m", waldir);
goto bad_argument;
@@ -1129,44 +1334,23 @@ main(int argc, char **argv)
split_path(argv[optind], &directory, &fname);
- if (waldir == NULL && directory != NULL)
+ if (walpath == NULL && directory != NULL)
{
- waldir = directory;
+ walpath = directory;
- if (!verify_directory(waldir))
+ if (!verify_directory(walpath))
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (XLogRecPtrIsInvalid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_tar_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ waldir = walpath;
+ is_tar = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
+ waldir = identify_target_directory(walpath, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
@@ -1174,32 +1358,67 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_tar)
+ waldir = identify_target_directory(walpath, NULL);
+
+ /* Verify that the archive contains valid WAL files */
+ if (is_tar)
+ verify_tar_archive(&private, waldir, compression);
/* we don't know what to print */
if (XLogRecPtrIsInvalid(private.startptr))
@@ -1211,11 +1430,26 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
+ if (is_tar)
+ {
+ /* Set up for reading tar file */
+ init_tar_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ routine = XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ routine = XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment);
+ }
+
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
+ XLogReaderAllocate(WalSegSz, waldir, routine,
&private);
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1325,6 +1559,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_tar)
+ free_tar_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..b5d440500de 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,8 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+#include "lib/stringinfo.h"
extern int WalSegSz;
@@ -22,6 +24,23 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
+ until this record pointer */
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+
+extern astreamer *astreamer_waldump_content_new(astreamer *next,
+ XLogRecPtr startptr,
+ XLogRecPtr endptr,
+ XLogDumpPrivate *privateInfo);
+extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
+ XLogDumpPrivate *privateInfo);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..b406ca041ec 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3444,6 +3444,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v2-0006-WIP-pg_waldump-Remove-the-restriction-on-the-orde.patch (17.8K, 7-v2-0006-WIP-pg_waldump-Remove-the-restriction-on-the-orde.patch)
From dbacb4b3d19c579eb0e3b8aa2dbbff04c7273584 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Mon, 25 Aug 2025 17:26:29 +0530
Subject: [PATCH v2 6/9] WIP-pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files were
not in the required sequence. With this patch, decoding will now
continue. Any WAL file that is out of order will be written to a
temporary location, from which it will be read later. Once a temporary
file has been read, it will be removed.
TODO:
Timeline switching is not handled correctly, especially when a
timeline change occurs on the next WAL file that was previously
written to a temporary location.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/astreamer_waldump.c | 189 +++++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.c | 77 +++++++++-
src/bin/pg_waldump/pg_waldump.h | 26 +++-
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 269 insertions(+), 34 deletions(-)
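A rough sketch of the "which segment do we still need from the archive" step described in the commit message above may help when reading member_next_segno() in the diff below. It is not the patch's code: ExportedSet, exported_contains() and next_needed_segno() are hypothetical names, and a plain array of segment numbers stands in for the patch's list of exported file names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the list of segments already spilled to disk. */
typedef struct ExportedSet
{
	const uint64_t *segnos;
	size_t		count;
} ExportedSet;

/* Linear membership test; adequate for a short list of spilled segments. */
static bool
exported_contains(const ExportedSet *set, uint64_t segno)
{
	for (size_t i = 0; i < set->count; i++)
	{
		if (set->segnos[i] == segno)
			return true;
	}
	return false;
}

/*
 * Return the next segment number that still has to come from the archive
 * stream: start at cur + 1 and skip every segment that was already exported
 * to temporary storage.
 */
static uint64_t
next_needed_segno(const ExportedSet *set, uint64_t cur)
{
	uint64_t	next = cur + 1;

	while (exported_contains(set, next))
		next++;
	return next;
}
```

With segments 5, 6 and 9 already spilled, finishing segment 4 means the streamer should next wait for segment 7, since 5 and 6 can be read back from the temporary files.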
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..8a28b4f0f91 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ named <filename>pg_waldump_tmp_dir/</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
index 916d388ef0c..acfbace7502 100644
--- a/src/bin/pg_waldump/astreamer_waldump.c
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -18,8 +18,8 @@
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "common/file_perm.h"
#include "common/logging.h"
-#include "fe_utils/simple_list.h"
#include "pg_waldump.h"
/*
@@ -37,6 +37,8 @@ typedef struct astreamer_waldump
/* These fields change with archive member. */
bool skipThisSeg;
+ bool writeThisSeg;
+ FILE *segFp;
XLogSegNo nextSegNo; /* Next expected segment to stream */
} astreamer_waldump;
@@ -53,8 +55,15 @@ static bool member_is_relevant_wal(astreamer_member *member,
XLogSegNo startSegNo,
XLogSegNo endSegNo,
XLogSegNo nextSegNo,
+ char **curFname,
XLogSegNo *curSegNo,
TimeLineID *curSegTimeline);
+static FILE *member_prepare_tmp_write(XLogSegNo curSegNo,
+ const char *fname,
+ XLogDumpPrivate *privateInfo);
+static XLogSegNo member_next_segno(XLogSegNo curSegNo,
+ TimeLineID timeline,
+ XLogDumpPrivate *privateInfo);
static const astreamer_ops astreamer_waldump_ops = {
.content = astreamer_waldump_content,
@@ -189,17 +198,8 @@ astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
if (XLogRecPtrIsInvalid(startptr))
streamer->startSegNo = 0;
else
- {
XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
- /*
- * Initialize the record pointer to the beginning of the first
- * segment; this pointer will track the WAL record reading status.
- */
- XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
- privateInfo->archive_streamer_read_ptr);
- }
-
if (XLogRecPtrIsInvalid(endPtr))
streamer->endSegNo = UINT64_MAX;
else
@@ -228,19 +228,21 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
case ASTREAMER_MEMBER_HEADER:
{
+ char *fname;
XLogSegNo segNo;
TimeLineID timeline;
pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
mystreamer->skipThisSeg = false;
+ mystreamer->writeThisSeg = false;
if (!member_is_relevant_wal(member,
privateInfo->timeline,
mystreamer->startSegNo,
mystreamer->endSegNo,
mystreamer->nextSegNo,
- &segNo, &timeline))
+ &fname, &segNo, &timeline))
{
mystreamer->skipThisSeg = true;
break;
@@ -254,24 +256,38 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->nextSegNo == 0)
break;
- /* WAL segments must be archived in order */
+ /*
+ * When WAL segments are not archived sequentially, it becomes
+ * necessary to write out (or preserve) segments that might be
+ * required at a later point.
+ */
if (mystreamer->nextSegNo != segNo)
{
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- mystreamer->nextSegNo, segNo);
- exit(1);
+ mystreamer->writeThisSeg = true;
+ mystreamer->segFp =
+ member_prepare_tmp_write(segNo, fname, privateInfo);
+ break;
}
/*
- * We track the reading of WAL segment records using a pointer
- * that's continuously incremented by the length of the
- * received data. This pointer is crucial for serving WAL page
- * requests from the WAL decoding routine, so it must be
- * accurate.
+ * We are now streaming segment content.
+ *
+ * We need to track the reading of WAL segment records using a
+ * pointer that's typically incremented by the length of the
+ * data read. However, we sometimes export the WAL file to
+ * temporary storage, allowing the decoding routine to read
+ * directly from there. This makes continuous pointer
+ * incrementing challenging, as file reads can occur from any
+ * offset, leading to potential errors. Therefore, we now
+ * reset the pointer when reading from a file for streaming.
+ * Also, if there's any existing data in the buffer, the next
+ * WAL record should logically follow it.
*/
#ifdef USE_ASSERT_CHECKING
- if (mystreamer->nextSegNo != 0)
+ Assert(!mystreamer->skipThisSeg);
+ Assert(!mystreamer->writeThisSeg);
+
+ if (privateInfo->archive_streamer_buf->len != 0)
{
XLogRecPtr recPtr;
@@ -280,11 +296,19 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
}
#endif
+ /*
+ * Initialize the read pointer to the beginning of the current segment
+ * being streamed through the buffer.
+ */
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+
/* Save the timeline */
privateInfo->timeline = timeline;
/* Update the next expected segment number */
- mystreamer->nextSegNo += 1;
+ mystreamer->nextSegNo =
+ member_next_segno(segNo, timeline, privateInfo);
}
break;
@@ -293,12 +317,44 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->skipThisSeg)
break;
+ /* Or, write contents to file */
+ if (mystreamer->writeThisSeg)
+ {
+ Assert(mystreamer->segFp != NULL);
+
+ errno = 0;
+ if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1)
+ {
+ char *fname;
+ int pathlen = strlen(member->pathname);
+
+ Assert(pathlen >= XLOG_FNAME_LEN);
+
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+
+ /*
+ * If write didn't set errno, assume problem is no disk
+ * space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m",
+ privateInfo->tmpdir, fname);
+ }
+ break;
+ }
+
/* Or, copy contents to buffer */
privateInfo->archive_streamer_read_ptr += len;
astreamer_buffer_bytes(streamer, &data, &len, len);
break;
case ASTREAMER_MEMBER_TRAILER:
+ if (mystreamer->segFp != NULL)
+ {
+ fclose(mystreamer->segFp);
+ mystreamer->segFp = NULL;
+ }
break;
case ASTREAMER_ARCHIVE_TRAILER:
@@ -325,8 +381,14 @@ astreamer_waldump_finalize(astreamer *streamer)
static void
astreamer_waldump_free(astreamer *streamer)
{
+ astreamer_waldump *mystreamer;
+
Assert(streamer->bbs_next == NULL);
+ mystreamer = (astreamer_waldump *) streamer;
+ if (mystreamer->segFp != NULL)
+ fclose(mystreamer->segFp);
+
pfree(streamer->bbs_buffer.data);
pfree(streamer);
}
@@ -339,8 +401,8 @@ astreamer_waldump_free(astreamer *streamer)
static bool
member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
XLogSegNo startSegNo, XLogSegNo endSegNo,
- XLogSegNo nextSegNo, XLogSegNo *curSegNo,
- TimeLineID *curSegTimeline)
+ XLogSegNo nextSegNo, char **curFname,
+ XLogSegNo *curSegNo, TimeLineID *curSegTimeline)
{
int pathlen;
XLogSegNo segNo;
@@ -371,8 +433,85 @@ member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
if (startSegNo > segNo || endSegNo < segNo)
return false;
+ *curFname = fname;
*curSegNo = segNo;
*curSegTimeline = timeline;
return true;
}
+
+/*
+ * Create an empty placeholder file and return its handle. The file is also
+ * added to an exported list for future management, e.g. access, deletion, and
+ * existence checks.
+ */
+static FILE *
+member_prepare_tmp_write(XLogSegNo curSegNo, const char *fname,
+ XLogDumpPrivate *privateInfo)
+{
+ FILE *file;
+ char *fpath = get_tmp_wal_file_path(privateInfo, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ /* Record this segment's export */
+ simple_string_list_append(&privateInfo->exportedSegList, fname);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Get next WAL segment that needs to be retrieved from the archive.
+ *
+ * The function checks for the presence of a previously read and extracted WAL
+ * segment in the temporary storage. If a temporary file is found for that
+ * segment, it indicates the segment has already been successfully retrieved
+ * from the archive. In this case, the function increments the segment number
+ * and repeats the check. This process continues until a segment that has not
+ * yet been retrieved is found, at which point the function returns its number.
+ */
+static XLogSegNo
+member_next_segno(XLogSegNo curSegNo, TimeLineID timeline,
+ XLogDumpPrivate *privateInfo)
+{
+ XLogSegNo nextSegNo = curSegNo + 1;
+ bool exists;
+
+ /*
+ * If we find a file that was previously written to the temporary space,
+ * it indicates that the corresponding WAL segment request has already
+ * been fulfilled. In that case, we increment the nextSegNo counter and
+ * check again for the new segment number. This repeats until we reach a
+ * segment that has not been exported; that segment number is returned as
+ * the one needed next from the archive.
+ */
+ do
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, timeline, nextSegNo, WalSegSz);
+
+ /*
+ * If the WAL segment has already been exported, increment the counter
+ * and check for the next segment.
+ */
+ exists = false;
+ if (simple_string_list_member(&privateInfo->exportedSegList, fname))
+ {
+ nextSegNo += 1;
+ exists = true;
+ }
+ } while (exists);
+
+ return nextSegNo;
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 64f3a65b735..d456adce59c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -393,13 +393,14 @@ setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
}
/*
- * Initializes the archive reader for a tar file.
+ * Initializes the tar archive reader and a temporary directory for WAL files.
*/
static void
init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
pg_compress_algorithm compression)
{
int fd;
+ char *tmpdir;
/* Now, the tar archive and store its file descriptor */
fd = open_file_in_directory(waldir, private->archive_name);
@@ -411,6 +412,15 @@ init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
/* Setup tar archive reading facility */
setup_astreamer(private, compression, private->startptr, private->endptr);
+
+ /* Temporary space for writing WAL segments */
+ if (getenv("TMPDIR"))
+ tmpdir = pstrdup(getenv("TMPDIR"));
+ else
+ tmpdir = waldir != NULL ? pstrdup(waldir) : pstrdup(".");
+ canonicalize_path(tmpdir);
+
+ private->tmpdir = tmpdir;
}
/*
@@ -419,6 +429,8 @@ init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
static void
free_tar_archive_reader(XLogDumpPrivate *private)
{
+ SimpleStringListCell *cell;
+
/*
* NB: Normally, astreamer_finalize() is called before astreamer_free() to
* flush any remaining buffered data or to ensure the end of the tar
@@ -432,6 +444,15 @@ free_tar_archive_reader(XLogDumpPrivate *private)
if (close(private->archive_fd) != 0)
pg_log_error("could not close file \"%s\": %m",
private->archive_name);
+
+ /* Clear out any existing temporary files */
+ for (cell = private->exportedSegList.head; cell; cell = cell->next)
+ {
+ char *fpath = get_tmp_wal_file_path(private, cell->val);
+
+ unlink(fpath);
+ pfree(fpath);
+ }
}
/*
@@ -559,7 +580,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -618,12 +639,60 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ *
+ * XXX: Timeline change is not handled.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ char fname[MAXPGPATH];
+ char *fpath;
+
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ fpath = get_tmp_wal_file_path(private, fname);
+ unlink(fpath);
+ pfree(fpath);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, private->timeline, nextSegNo, WalSegSz);
+ if (simple_string_list_member(&private->exportedSegList, fname))
+ {
+ fpath = get_tmp_wal_file_path(private, fname);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+
+ if (state->seg.ws_file < 0)
+ pg_fatal("could not open file \"%s\": %m", fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return astreamer_wal_read(readBuff, targetPagePtr, count, private);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index b5d440500de..614e679cb96 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -13,8 +13,11 @@
#include "access/xlogdefs.h"
#include "fe_utils/astreamer.h"
+#include "fe_utils/simple_list.h"
#include "lib/stringinfo.h"
+#define TEMP_FILE_EXT "waldump.tmp"
+
extern int WalSegSz;
/* Contains the necessary information to drive WAL decoding */
@@ -31,11 +34,30 @@ typedef struct XLogDumpPrivate
astreamer *archive_streamer;
StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
- XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
- until this record pointer */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with
+ * records until this record
+ * pointer */
+ char *tmpdir; /* Temporary directory for exported WAL files */
+ SimpleStringList exportedSegList; /* Temporary exported WAL file list */
} XLogDumpPrivate;
+/*
+ * Generate the temporary WAL file path.
+ *
+ * The caller is responsible for pfree'ing the returned path.
+ */
+static inline char *
+get_tmp_wal_file_path(XLogDumpPrivate *privateInfo, const char *fname)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ snprintf(fpath, MAXPGPATH, "%s/%s.%s", privateInfo->tmpdir, fname,
+ TEMP_FILE_EXT);
+
+ return fpath;
+}
+
extern astreamer *astreamer_waldump_content_new(astreamer *next,
XLogRecPtr startptr,
XLogRecPtr endptr,
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
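For readers skimming the 0006 patch above, the temporary-file naming can be sketched in plain C as follows. This is an illustration only: choose_tmpdir() and tmp_wal_file_path() are hypothetical helpers, the caller is assumed to pass in the result of getenv("TMPDIR"), and a caller-supplied buffer replaces the patch's palloc'd MAXPGPATH buffer.

```c
#include <stdio.h>
#include <stdlib.h>

#define TEMP_FILE_EXT "waldump.tmp"

/*
 * Pick the spill directory: the TMPDIR value if one was given, otherwise the
 * directory holding the archive, otherwise the current directory.
 */
static const char *
choose_tmpdir(const char *tmpdir_env, const char *archive_dir)
{
	if (tmpdir_env != NULL)
		return tmpdir_env;
	return archive_dir != NULL ? archive_dir : ".";
}

/*
 * Build "<tmpdir>/<segment>.waldump.tmp" into dst.  Returns the snprintf()
 * result, i.e. the length the full path needs, so a caller can detect
 * truncation by comparing it against dstlen.
 */
static int
tmp_wal_file_path(char *dst, size_t dstlen, const char *tmpdir,
				  const char *segname)
{
	return snprintf(dst, dstlen, "%s/%s.%s", tmpdir, segname, TEMP_FILE_EXT);
}
```

A typical call would be tmp_wal_file_path(buf, sizeof(buf), choose_tmpdir(getenv("TMPDIR"), waldir), fname), where buf, waldir and fname belong to the caller.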
[application/x-patch] v2-0007-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 8-v2-0007-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From 64dbdfaa575749b76ebdd3fd235a8186b6eb19fc Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v2 7/9] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v2-0008-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 9-v2-0008-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From 49d74dca63e15c300a8ccf317d17003f6f9412e8 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v2 8/9] pg_verifybackup: Rename the wal-directory switch to
wal-path
Future patches to pg_waldump will enable it to decode WAL directly
from tar files. This means you'll be able to specify a tar archive
path instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
will also align it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
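As a small illustration of what the rename amounts to at the option-parsing level, here is a minimal getopt_long() sketch. It assumes only what the diff below shows: the long name changes to wal-path while the short option stays -w. The function parse_wal_path() and the reduced option table are hypothetical and leave out every other pg_verifybackup switch.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Parse argv and return the argument given to -w / --wal-path, or NULL if the
 * switch is absent.  The value may name a directory or a tar archive; this
 * sketch does not inspect it.
 */
static const char *
parse_wal_path(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"wal-path", required_argument, NULL, 'w'},
		{NULL, 0, NULL, 0}
	};
	const char *wal_path = NULL;
	int			c;

	optind = 1;					/* restart scanning so the function can be reused */
	while ((c = getopt_long(argc, argv, "w:", long_options, NULL)) != -1)
	{
		if (c == 'w')
			wal_path = optarg;
	}
	return wal_path;
}
```

Because the short option is unchanged, scripts that already use -w should keep working; only callers spelling out the old long name would need updating.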
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
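For readers following the rename, here is a minimal standalone sketch of the option-table change, assuming a platform that provides getopt_long() in <getopt.h>. The helper name parse_wal_path() is hypothetical and is not part of pg_verifybackup; the sketch only shows that the long name changes while the short switch -w stays the same.

```c
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical helper (not in pg_verifybackup): returns the argument given
 * to -w/--wal-path, or NULL if the switch is absent or an unknown switch
 * such as the old --wal-directory is seen.
 */
static const char *
parse_wal_path(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"wal-path", required_argument, NULL, 'w'},
		{NULL, 0, NULL, 0}
	};
	const char *wal_path = NULL;
	int			c;

	optind = 1;					/* restart scanning on every call */
	opterr = 0;					/* keep the sketch quiet on bad input */

	while ((c = getopt_long(argc, argv, "w:", long_options, NULL)) != -1)
	{
		if (c == 'w')
			wal_path = optarg;
		else
			return NULL;		/* unrecognized switch */
	}

	return wal_path;
}
```

Because only the long name in the table changes, existing invocations that use -w keep working, while scripts that spell out --wal-directory would need updating.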
[application/x-patch] v2-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.3K, 10-v2-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From 877dd072349fbfeb4a39f2f3cca13ba4b68d0912 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v2 9/9] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 3 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 3 +-
6 files changed, 50 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..0cfe1f9532c 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -123,8 +123,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..76269a73673 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -137,8 +137,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
--
2.47.1
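The default-path selection that patch 9/9 adds to main() can be read as a small decision function. The sketch below restates it outside PostgreSQL, assuming plain malloc() in place of psprintf()/pstrdup(); the names choose_wal_path() and dup_string() are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: malloc'd copy of a string (stand-in for pstrdup). */
static char *
dup_string(const char *s)
{
	size_t		len = strlen(s) + 1;
	char	   *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Hypothetical restatement of the fallback order used when -w/--wal-path
 * is not given.  Returns a malloc'd path, or NULL when no WAL source can
 * be determined (the patch reports an error in that case).
 */
static char *
choose_wal_path(char format, const char *backup_directory,
				const char *wal_archive_path, const char *base_archive_path)
{
	if (format == 'p')
	{
		/* Plain format: WAL lives in <backup>/pg_wal. */
		size_t		len = strlen(backup_directory) + sizeof("/pg_wal");
		char	   *path = malloc(len);

		if (path != NULL)
			snprintf(path, len, "%s/pg_wal", backup_directory);
		return path;
	}

	/* Tar format: prefer a separate WAL archive, else the base archive. */
	if (wal_archive_path != NULL)
		return dup_string(wal_archive_path);
	if (base_archive_path != NULL)
		return dup_string(base_archive_path);

	return NULL;
}
```

For a tar-format backup directory that contains both a base archive and a separate WAL archive, the sketch picks the WAL archive; with only a base archive, it falls back to that one.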
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-08-26 11:52 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Amul Sul @ 2025-08-26 11:52 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>
On Mon, Aug 25, 2025 at 5:58 PM Amul Sul <[email protected]> wrote:
>
> On Thu, Aug 7, 2025 at 7:47 PM Amul Sul <[email protected]> wrote:
> > [....]
> > -----------------------------------
> > Known Issues & Status:
> > -----------------------------------
> > - Timeline Switching: The current implementation in patch 006 does not
> > correctly handle timeline switching. This is a known issue, especially
> > when a timeline change occurs on a WAL file that has been written to a
> > temporary location.
> >
>
> This is still pending and will be addressed in the next version.
> Therefore, patch 0006 remains marked as WIP.
>
After testing pg_waldump, I have realised that my previous
understanding of its timeline handling was incorrect. From reading the
xlogreader code, I had mistakenly assumed that pg_waldump would use the
timeline-switching logic found in xlogreader, without first verifying
this behavior. In testing, I found that pg_waldump does not follow
timeline switches. Instead, it expects all WAL files to be from a
single timeline, which is either specified by the user, determined
from the starting segment, or defaults to 1.
This is a positive finding, as it means we don't need to make
significant changes to align with pg_waldump's current behavior. The
attached patches are now complete and no longer work in progress;
they are ready for review. Additionally, I've dropped patch v2-0004 because it is
no longer necessary. The primary patches that implement the proposed
feature are now 0004 and 0005 in the attached set.
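To illustrate "determined from the starting segment": a WAL segment file name is 24 uppercase hex digits, and the first 8 encode the timeline ID. The following is a minimal sketch under that assumption; timeline_from_segment_name() is a hypothetical name, not a pg_waldump function, and how pg_waldump combines this value with a user-specified timeline is deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the timeline ID from a WAL segment file
 * name such as "000000010000000000000003".  Returns 0 if the name does
 * not look like a segment name (0 is not a valid timeline ID).
 */
static unsigned int
timeline_from_segment_name(const char *fname)
{
	unsigned int tli;

	/* Expect exactly 24 uppercase hex digits. */
	if (strlen(fname) != 24 ||
		strspn(fname, "0123456789ABCDEF") != 24)
		return 0;

	/* The first 8 digits are the timeline ID. */
	if (sscanf(fname, "%8X", &tli) != 1)
		return 0;

	return tli;
}
```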
Regards,
Amul
Attachments:
[application/x-patch] v3-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.2K, 2-v3-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From b48d5a7ed121c694273ad8cf2c3c78aa4ae23b1d Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v3 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This is in preparation for adding a second source file to this
directory.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v3-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v3-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From c42fad18faa0016eee4e2eee2e4d0d465156a787 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v3 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (private->endptr != InvalidXLogRecPtr)
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
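The read-size calculation factored out in patch 2/8 can be tried in isolation. This is a standalone sketch of the same three-way decision, assuming an 8192-byte WAL block and using plain integer types; ReadLimit, BLOCK_SIZE, INVALID_LSN and read_len_sketch() are stand-ins invented here for XLogDumpPrivate, XLOG_BLCKSZ, InvalidXLogRecPtr and required_read_len().

```c
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE	8192				/* stand-in for XLOG_BLCKSZ */
#define INVALID_LSN ((uint64_t) 0)		/* stand-in for InvalidXLogRecPtr */

/* Stand-in for the two XLogDumpPrivate fields the calculation uses. */
typedef struct ReadLimit
{
	uint64_t	endptr;
	bool		endptr_reached;
} ReadLimit;

/* Returns the number of bytes to read, or -1 once the end LSN is passed. */
static int
read_len_sketch(ReadLimit *limit, uint64_t target_page_ptr, int req_len)
{
	/* No end LSN given: always read a full page. */
	if (limit->endptr == INVALID_LSN)
		return BLOCK_SIZE;

	/* The whole page lies before the end LSN. */
	if (target_page_ptr + BLOCK_SIZE <= limit->endptr)
		return BLOCK_SIZE;

	/* Only part of the page is needed, but the request still fits. */
	if (target_page_ptr + req_len <= limit->endptr)
		return (int) (limit->endptr - target_page_ptr);

	/* The request extends past the end LSN. */
	limit->endptr_reached = true;
	return -1;
}
```

With an end LSN of 20000, a request for the page at 8192 yields a full 8192 bytes, a 100-byte request on the page at 16384 yields 3616 bytes, and a 4000-byte request on that same page returns -1 and sets endptr_reached.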
[application/x-patch] v3-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v3-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 31c4a7d8d6a24892e5c8bb476ea5665e15d93aec Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v3 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v3-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (34.5K, 5-v3-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From f15b7e6d33e107cb141586af6072b41266eff0eb Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v3 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
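Reader's note on the ordering restriction described above: the check amounts
to parsing each tar member name into a segment number and comparing it with
the next expected one. Below is a minimal standalone sketch of that idea, not
the patch's code. parse_wal_member() and first_out_of_order() are hypothetical
names; the parsing mirrors what XLogFromFileName() does for a 24-character WAL
file name, and the special case in the patch where nextSegNo is 0 (the order
check is skipped) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 24    /* 8 hex digits each: timeline, log, seg */

/*
 * Hypothetical stand-in for XLogFromFileName(): parse the trailing 24
 * characters of a tar member path into a timeline and a segment number.
 * Returns false if the name does not look like a WAL segment file.
 */
static bool
parse_wal_member(const char *path, uint64_t wal_segsz,
                 uint32_t *tli, uint64_t *segno)
{
    size_t      len;
    const char *fname;
    unsigned int t, log, seg;

    assert(path != NULL && wal_segsz > 0);

    len = strlen(path);
    if (len < WAL_FNAME_LEN)
        return false;

    /* the member may carry a leading directory, so look at the tail only */
    fname = path + (len - WAL_FNAME_LEN);

    /* all 24 characters must be upper-case hex digits */
    if (strspn(fname, "0123456789ABCDEF") != WAL_FNAME_LEN)
        return false;

    if (sscanf(fname, "%08X%08X%08X", &t, &log, &seg) != 3)
        return false;

    *tli = t;
    *segno = (uint64_t) log * (UINT64_C(0x100000000) / wal_segsz) + seg;
    return true;
}

/*
 * Walk member names in archive order and return the index of the first WAL
 * segment that breaks the sequence, or -1 if the order is fine.  Members
 * that are not WAL files, are on another timeline, or fall outside
 * [start_seg, end_seg] are skipped.
 */
static int
first_out_of_order(const char *const *members, int nmembers,
                   uint64_t wal_segsz, uint32_t want_tli,
                   uint64_t start_seg, uint64_t end_seg)
{
    uint64_t    next_seg = start_seg;

    for (int i = 0; i < nmembers; i++)
    {
        uint32_t    tli;
        uint64_t    segno;

        if (!parse_wal_member(members[i], wal_segsz, &tli, &segno))
            continue;
        if (tli != want_tli || segno < start_seg || segno > end_seg)
            continue;
        if (segno != next_seg)
            return i;       /* this is where the patch reports an error */
        next_seg++;
    }
    return -1;
}
```

With 16 MB segments, a member list of segments 1, 2, 3 (with unrelated files
in between) passes, while a list that starts with segment 2 when segment 1 is
expected is rejected at the first member.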
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 378 +++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 362 +++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.h | 21 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++++-
src/tools/pgindent/typedefs.list | 1 +
8 files changed, 787 insertions(+), 78 deletions(-)
create mode 100644 src/bin/pg_waldump/astreamer_waldump.c
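Reader's note on the new astreamer_waldump.c: its read routine serves page
requests from a buffer that holds the most recently streamed WAL bytes and
ends at a tracked read pointer. The window arithmetic can be sketched on its
own as follows. This is a simplified model under stated assumptions, not the
patch's code: WalWindow and window_copy() are hypothetical names, WAL
positions are plain 64-bit integers, and a request that falls before the
window simply returns 0 here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified model of the streamer buffer: 'data' holds 'len'
 * bytes of WAL, and 'end_pos' is the WAL position just past the last byte.
 */
typedef struct WalWindow
{
    const char *data;
    size_t      len;
    uint64_t    end_pos;
} WalWindow;

/*
 * Copy up to 'count' bytes starting at WAL position 'pos' into 'dst'.
 * Returns the number of bytes copied; 0 means the window holds nothing at
 * 'pos' and the caller has to fetch more data from the archive.
 */
static size_t
window_copy(const WalWindow *w, uint64_t pos, char *dst, size_t count)
{
    uint64_t    start_pos;
    size_t      skip;
    size_t      avail;

    assert(w != NULL && dst != NULL);

    /* WAL position of the first buffered byte */
    start_pos = (w->end_pos > w->len) ? w->end_pos - w->len : 0;

    if (pos < start_pos || pos >= w->end_pos)
        return 0;

    /* skip buffered bytes that precede the requested position */
    skip = (size_t) (pos - start_pos);
    avail = w->len - skip;
    if (avail > count)
        avail = count;

    memcpy(dst, w->data + skip, avail);
    return avail;
}
```

A caller would loop on window_copy(), advancing the position by the returned
byte count and refilling the window whenever it returns 0, until the requested
page is complete or the archive is exhausted.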
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ astreamer_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..61876e834a9
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,378 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+typedef struct astreamer_waldump
+{
+ /* These fields don't change once initialized. */
+ astreamer base;
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
+ XLogDumpPrivate *privateInfo;
+
+ /* These fields change with archive member. */
+ bool skipThisSeg;
+ XLogSegNo nextSegNo; /* Next expected segment to stream */
+} astreamer_waldump;
+
+static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_member *member,
+ TimeLineID startTimeLineID,
+ XLogSegNo startSegNo,
+ XLogSegNo endSegNo,
+ XLogSegNo nextSegNo,
+ XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+ XLogDumpPrivate *privateInfo)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+ while (nbytes > 0)
+ {
+ char *buf = astreamer_buf->data;
+ int len = astreamer_buf->len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr = privateInfo->archive_streamer_read_ptr;
+ XLogRecPtr startPtr = (endPtr > len) ? endPtr - len : 0;
+
+ /*
+ * Ignore existing data if the required target page has not yet been
+ * read.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Reset the buffer */
+ resetStringInfo(astreamer_buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the archive streamer
+ * buffer, so skip bytes until reaching the desired offset of the
+ * target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /*
+ * Ensure we are reading the correct page, unless we've received
+ * an invalid record pointer. In that specific case, it's
+ * acceptable to read any page.
+ */
+ Assert(XLogRecPtrIsInvalid(recptr) ||
+ (recptr >= startPtr && recptr < endPtr));
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /* Fetch more data */
+ if (astreamer_archive_read(privateInfo) == 0)
+ break; /* No data remaining */
+ }
+ }
+
+ return (count - nbytes) ? (count - nbytes) : -1;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ /* Read more data from the tar file */
+ rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents of
+ * the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+astreamer *
+astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
+ XLogRecPtr endPtr, XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->base.bbs_next = next;
+ initStringInfo(&streamer->base.bbs_buffer);
+
+ if (XLogRecPtrIsInvalid(startptr))
+ streamer->startSegNo = 0;
+ else
+ {
+ XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+ /*
+ * Initialize the record pointer to the beginning of the first
+ * segment; this pointer will track the WAL record reading status.
+ */
+ XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+ }
+
+ if (XLogRecPtrIsInvalid(endPtr))
+ streamer->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+ streamer->nextSegNo = streamer->startSegNo;
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segNo;
+ TimeLineID timeline;
+
+ pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+ mystreamer->skipThisSeg = false;
+
+ if (!member_is_relevant_wal(member,
+ privateInfo->timeline,
+ mystreamer->startSegNo,
+ mystreamer->endSegNo,
+ mystreamer->nextSegNo,
+ &segNo, &timeline))
+ {
+ mystreamer->skipThisSeg = true;
+ break;
+ }
+
+ /*
+ * If nextSegNo is 0, the check is skipped, and any WAL file
+ * can be read -- this typically occurs during initial
+ * verification.
+ */
+ if (mystreamer->nextSegNo == 0)
+ break;
+
+ /* WAL segments must be archived in order */
+ if (mystreamer->nextSegNo != segNo)
+ {
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ mystreamer->nextSegNo, segNo);
+ exit(1);
+ }
+
+ /*
+ * We track the reading of WAL segment records using a pointer
+ * that's continuously incremented by the length of the
+ * received data. This pointer is crucial for serving WAL page
+ * requests from the WAL decoding routine, so it must be
+ * accurate.
+ */
+#ifdef USE_ASSERT_CHECKING
+ if (mystreamer->nextSegNo != 0)
+ {
+ XLogRecPtr recPtr;
+
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+ Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+ }
+#endif
+
+ /* Save the timeline */
+ privateInfo->timeline = timeline;
+
+ /* Update the next expected segment number */
+ mystreamer->nextSegNo += 1;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ /* Skip this segment */
+ if (mystreamer->skipThisSeg)
+ break;
+
+ /* Or, copy contents to buffer */
+ privateInfo->archive_streamer_read_ptr += len;
+ astreamer_buffer_bytes(streamer, &data, &len, len);
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+
+ pfree(streamer->bbs_buffer.data);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
+ XLogSegNo startSegNo, XLogSegNo endSegNo,
+ XLogSegNo nextSegNo, XLogSegNo *curSegNo,
+ TimeLineID *curSegTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The member name may include a leading directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ /* Ignore if the timeline is different */
+ if (startTimeLineID != timeline)
+ return false;
+
+ /* Skip if the current segment is not the desired one */
+ if (startSegNo > segNo || endSegNo < segNo)
+ return false;
+
+ *curSegNo = segNo;
+ *curSegTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'astreamer_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..d136f8f038e 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,160 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Now, check the compression type of the tar */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Creates an appropriate chain of archive streamers for reading the given
+ * tar archive.
+ */
+static void
+setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
+ XLogRecPtr startptr, XLogRecPtr endptr)
+{
+ astreamer *streamer = NULL;
+
+ streamer = astreamer_waldump_content_new(NULL, startptr, endptr, private);
+
+ /*
+ * Final extracted WAL data will reside in this streamer. However, since
+ * it sits at the bottom of the stack and isn't designed to propagate data
+ * upward, we need to hold a pointer to its data buffer in order to copy.
+ */
+ private->archive_streamer_buf = &streamer->bbs_buffer;
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ private->archive_streamer = streamer;
+}
+
+/*
+ * Initializes the archive reader for a tar file.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+
+ /* Now, open the tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ private->archive_fd = fd;
+
+ /* Setup tar archive reading facility */
+ setup_astreamer(private, compression, private->startptr, private->endptr);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(private->archive_streamer);
+
+ /* Close the file. */
+ if (close(private->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ PGAlignedXLogBlock buf;
+ int r;
+
+ setup_astreamer(private, compression, InvalidXLogRecPtr, InvalidXLogRecPtr);
+
+ /* Now, open the tar archive and store its file descriptor */
+ private->archive_fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (private->archive_fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ /* Read a WAL page */
+ r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+ /* Set WalSegSz if WAL data is successfully read */
+ if (r == XLOG_BLCKSZ)
+ {
+ XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+ WalSegSz),
+ private->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+ }
+ else
+ pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+ private->archive_name, r, XLOG_BLCKSZ);
+
+ free_tar_archive_reader(private);
+}
+
/* Returns the size in bytes of the data to be read. */
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +560,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPagePtr, reqLen);
+ int count = required_read_len(private, targetPtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -436,6 +590,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return astreamer_wal_read(readBuff, targetPagePtr, count, private);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +965,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +998,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_tar = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +1133,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1297,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_tar_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_tar = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1125,44 +1330,23 @@ main(int argc, char **argv)
split_path(argv[optind], &directory, &fname);
- if (waldir == NULL && directory != NULL)
+ if (walpath == NULL && directory != NULL)
{
- waldir = directory;
+ walpath = directory;
- if (!verify_directory(waldir))
+ if (!verify_directory(walpath))
- pg_fatal("could not open directory \"%s\": %m", waldir);
+ pg_fatal("could not open directory \"%s\": %m", walpath);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (XLogRecPtrIsInvalid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_tar_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
+ is_tar = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
+ waldir = identify_target_directory(walpath, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
@@ -1170,32 +1354,67 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_tar)
+ waldir = identify_target_directory(walpath, NULL);
+
+ /* Verify that the archive contains valid WAL files */
+ if (is_tar)
+ verify_tar_archive(&private, waldir, compression);
/* we don't know what to print */
if (XLogRecPtrIsInvalid(private.startptr))
@@ -1207,12 +1426,30 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_tar)
+ {
+ /* Set up for reading tar file */
+ init_tar_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1558,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_tar)
+ free_tar_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..b5d440500de 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,8 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+#include "lib/stringinfo.h"
extern int WalSegSz;
@@ -22,6 +24,23 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
+ until this record pointer */
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+
+extern astreamer *astreamer_waldump_content_new(astreamer *next,
+ XLogRecPtr startptr,
+ XLogRecPtr endptr,
+ XLogDumpPrivate *privateInfo);
+extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
+ XLogDumpPrivate *privateInfo);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..b406ca041ec 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3444,6 +3444,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v3-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (17.5K, 6-v3-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From cf0201067a8627049838e85eecbe5da9aa9c8ef0 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Mon, 25 Aug 2025 17:26:29 +0530
Subject: [PATCH v3 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL
files in the archive were not in sequential order. With this patch,
decoding continues: any WAL file that arrives out of order is written
to a temporary location, from which it is read later. Once a temporary
file has been read, it is removed.
---
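A rough sketch of the bookkeeping described in the commit message above, for readers who want to follow the control flow without the astreamer plumbing. It is not the patch's code: SpilledSet, handle_member() and next_needed_segno() are hypothetical names, a fixed-size array stands in for the SimpleStringList of exported file names, and the special case where the first expected segment is not yet known (nextSegNo == 0) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's list of exported (spilled) segments. */
typedef struct SpilledSet
{
	uint64_t	segnos[64];
	size_t		count;
} SpilledSet;

static bool
spilled_contains(const SpilledSet *set, uint64_t segno)
{
	for (size_t i = 0; i < set->count; i++)
	{
		if (set->segnos[i] == segno)
			return true;
	}
	return false;
}

static void
spilled_add(SpilledSet *set, uint64_t segno)
{
	assert(set->count < 64);	/* sketch only: no dynamic growth */
	set->segnos[set->count++] = segno;
}

/*
 * Same idea as member_next_segno(): starting after the segment just
 * streamed, skip every segment that was already spilled to temporary
 * storage, and return the first one still needed from the archive.
 */
static uint64_t
next_needed_segno(const SpilledSet *set, uint64_t cur)
{
	uint64_t	next = cur + 1;

	while (spilled_contains(set, next))
		next++;
	return next;
}

/*
 * Decide what to do with an archive member holding segment 'segno'.
 * Returns true if it is streamed through the buffer, false if it is
 * out of order and therefore spilled to a temporary file.
 */
static bool
handle_member(SpilledSet *set, uint64_t *expected, uint64_t segno)
{
	if (segno != *expected)
	{
		spilled_add(set, segno);
		return false;
	}
	*expected = next_needed_segno(set, segno);
	return true;
}
```

With members arriving in the order 3, 2, 1, 4 and segment 1 expected first, segments 3 and 2 are spilled, segment 1 is streamed and the expected segment jumps to 4, because 2 and 3 can then be served from temporary files.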
doc/src/sgml/ref/pg_waldump.sgml | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 189 +++++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.c | 75 +++++++++-
src/bin/pg_waldump/pg_waldump.h | 26 +++-
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 266 insertions(+), 34 deletions(-)
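For reference, the temporary-file naming and directory selection used by this patch can be sketched in isolation as below. This is a sketch under assumptions, not the patch's code: choose_tmpdir() and build_tmp_wal_path() are hypothetical names, and the truncation check is an addition that get_tmp_wal_file_path() in the patch does not perform. Only the TMPDIR lookup, the fallback to the archive's directory (or "."), and the "waldump.tmp" suffix are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEMP_FILE_EXT "waldump.tmp"

/*
 * Pick the directory for temporary WAL files: TMPDIR if it is set,
 * otherwise the directory holding the archive, otherwise ".".
 */
static const char *
choose_tmpdir(const char *waldir)
{
	const char *env = getenv("TMPDIR");

	if (env != NULL)
		return env;
	return waldir != NULL ? waldir : ".";
}

/*
 * Build "<tmpdir>/<fname>.waldump.tmp" into dst.
 * Returns 0 on success, -1 if the result did not fit (an extra check
 * that this sketch adds).
 */
static int
build_tmp_wal_path(char *dst, size_t dstlen,
				   const char *tmpdir, const char *fname)
{
	int			n;

	assert(dst != NULL && tmpdir != NULL && fname != NULL);

	n = snprintf(dst, dstlen, "%s/%s.%s", tmpdir, fname, TEMP_FILE_EXT);
	return (n >= 0 && (size_t) n < dstlen) ? 0 : -1;
}
```

A caller would combine the two, for example build_tmp_wal_path(buf, sizeof(buf), choose_tmpdir(waldir), "000000010000000000000003").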
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..c1afb4097b5 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,11 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, the out-of-order files are written to temporary files
+ and read back later. The temporary files are created in the directory
+ specified by the <envar>TMPDIR</envar> environment variable if it is
+ set; otherwise, they are created in the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
index 61876e834a9..183f389d3f1 100644
--- a/src/bin/pg_waldump/astreamer_waldump.c
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -18,8 +18,8 @@
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "common/file_perm.h"
#include "common/logging.h"
-#include "fe_utils/simple_list.h"
#include "pg_waldump.h"
/*
@@ -37,6 +37,8 @@ typedef struct astreamer_waldump
/* These fields change with archive member. */
bool skipThisSeg;
+ bool writeThisSeg;
+ FILE *segFp;
XLogSegNo nextSegNo; /* Next expected segment to stream */
} astreamer_waldump;
@@ -53,8 +55,15 @@ static bool member_is_relevant_wal(astreamer_member *member,
XLogSegNo startSegNo,
XLogSegNo endSegNo,
XLogSegNo nextSegNo,
+ char **curFname,
XLogSegNo *curSegNo,
TimeLineID *curSegTimeline);
+static FILE *member_prepare_tmp_write(XLogSegNo curSegNo,
+ const char *fname,
+ XLogDumpPrivate *privateInfo);
+static XLogSegNo member_next_segno(XLogSegNo curSegNo,
+ TimeLineID timeline,
+ XLogDumpPrivate *privateInfo);
static const astreamer_ops astreamer_waldump_ops = {
.content = astreamer_waldump_content,
@@ -189,17 +198,8 @@ astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
if (XLogRecPtrIsInvalid(startptr))
streamer->startSegNo = 0;
else
- {
XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
- /*
- * Initialize the record pointer to the beginning of the first
- * segment; this pointer will track the WAL record reading status.
- */
- XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
- privateInfo->archive_streamer_read_ptr);
- }
-
if (XLogRecPtrIsInvalid(endPtr))
streamer->endSegNo = UINT64_MAX;
else
@@ -228,19 +228,21 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
case ASTREAMER_MEMBER_HEADER:
{
+ char *fname;
XLogSegNo segNo;
TimeLineID timeline;
pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
mystreamer->skipThisSeg = false;
+ mystreamer->writeThisSeg = false;
if (!member_is_relevant_wal(member,
privateInfo->timeline,
mystreamer->startSegNo,
mystreamer->endSegNo,
mystreamer->nextSegNo,
- &segNo, &timeline))
+ &fname, &segNo, &timeline))
{
mystreamer->skipThisSeg = true;
break;
@@ -254,24 +256,38 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->nextSegNo == 0)
break;
- /* WAL segments must be archived in order */
+ /*
+ * When WAL segments are not archived sequentially, it becomes
+ * necessary to write out (or preserve) segments that might be
+ * required at a later point.
+ */
if (mystreamer->nextSegNo != segNo)
{
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- mystreamer->nextSegNo, segNo);
- exit(1);
+ mystreamer->writeThisSeg = true;
+ mystreamer->segFp =
+ member_prepare_tmp_write(segNo, fname, privateInfo);
+ break;
}
/*
- * We track the reading of WAL segment records using a pointer
- * that's continuously incremented by the length of the
- * received data. This pointer is crucial for serving WAL page
- * requests from the WAL decoding routine, so it must be
- * accurate.
+ * We are now streaming segment content.
+ *
+ * We need to track the reading of WAL segment records using a
+ * pointer that's typically incremented by the length of the
+ * data read. However, we sometimes export the WAL file to
+ * temporary storage, allowing the decoding routine to read
+ * directly from there. This makes continuous pointer
+ * incrementing challenging, as file reads can occur from any
+ * offset, leading to potential errors. Therefore, we now
+ * reset the pointer when reading from a file for streaming.
+ * Also, if there's any existing data in the buffer, the next
+ * WAL record should logically follow it.
*/
#ifdef USE_ASSERT_CHECKING
- if (mystreamer->nextSegNo != 0)
+ Assert(!mystreamer->skipThisSeg);
+ Assert(!mystreamer->writeThisSeg);
+
+ if (privateInfo->archive_streamer_buf->len != 0)
{
XLogRecPtr recPtr;
@@ -280,11 +296,19 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
}
#endif
+ /*
+ * Initialize the read pointer to the beginning of the segment that
+ * is now being streamed through the buffer.
+ */
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+
/* Save the timeline */
privateInfo->timeline = timeline;
/* Update the next expected segment number */
- mystreamer->nextSegNo += 1;
+ mystreamer->nextSegNo =
+ member_next_segno(segNo, timeline, privateInfo);
}
break;
@@ -293,12 +317,44 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->skipThisSeg)
break;
+ /* Or, write contents to file */
+ if (mystreamer->writeThisSeg)
+ {
+ Assert(mystreamer->segFp != NULL);
+
+ errno = 0;
+ if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1)
+ {
+ char *fname;
+ int pathlen = strlen(member->pathname);
+
+ Assert(pathlen >= XLOG_FNAME_LEN);
+
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+
+ /*
+ * If write didn't set errno, assume problem is no disk
+ * space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m",
+ privateInfo->tmpdir, fname);
+ }
+ break;
+ }
+
/* Or, copy contents to buffer */
privateInfo->archive_streamer_read_ptr += len;
astreamer_buffer_bytes(streamer, &data, &len, len);
break;
case ASTREAMER_MEMBER_TRAILER:
+ if (mystreamer->segFp != NULL)
+ {
+ fclose(mystreamer->segFp);
+ mystreamer->segFp = NULL;
+ }
break;
case ASTREAMER_ARCHIVE_TRAILER:
@@ -325,8 +381,14 @@ astreamer_waldump_finalize(astreamer *streamer)
static void
astreamer_waldump_free(astreamer *streamer)
{
+ astreamer_waldump *mystreamer;
+
Assert(streamer->bbs_next == NULL);
+ mystreamer = (astreamer_waldump *) streamer;
+ if (mystreamer->segFp != NULL)
+ fclose(mystreamer->segFp);
+
pfree(streamer->bbs_buffer.data);
pfree(streamer);
}
@@ -339,8 +401,8 @@ astreamer_waldump_free(astreamer *streamer)
static bool
member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
XLogSegNo startSegNo, XLogSegNo endSegNo,
- XLogSegNo nextSegNo, XLogSegNo *curSegNo,
- TimeLineID *curSegTimeline)
+ XLogSegNo nextSegNo, char **curFname,
+ XLogSegNo *curSegNo, TimeLineID *curSegTimeline)
{
int pathlen;
XLogSegNo segNo;
@@ -371,8 +433,85 @@ member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
if (startSegNo > segNo || endSegNo < segNo)
return false;
+ *curFname = fname;
*curSegNo = segNo;
*curSegTimeline = timeline;
return true;
}
+
+/*
+ * Create an empty placeholder file and return its handle. The file is also
+ * added to an exported list for future management, e.g. access, deletion, and
+ * existence checks.
+ */
+static FILE *
+member_prepare_tmp_write(XLogSegNo curSegNo, const char *fname,
+ XLogDumpPrivate *privateInfo)
+{
+ FILE *file;
+ char *fpath = get_tmp_wal_file_path(privateInfo, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ /* Record this segment's export */
+ simple_string_list_append(&privateInfo->exportedSegList, fname);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Get next WAL segment that needs to be retrieved from the archive.
+ *
+ * The function checks for the presence of a previously read and extracted WAL
+ * segment in the temporary storage. If a temporary file is found for that
+ * segment, it indicates the segment has already been successfully retrieved
+ * from the archive. In this case, the function increments the segment number
+ * and repeats the check. This process continues until a segment that has not
+ * yet been retrieved is found, at which point the function returns its number.
+ */
+static XLogSegNo
+member_next_segno(XLogSegNo curSegNo, TimeLineID timeline,
+ XLogDumpPrivate *privateInfo)
+{
+ XLogSegNo nextSegNo = curSegNo + 1;
+ bool exists;
+
+ /*
+ * If a file for the next segment was previously written to the temporary
+ * space, that segment has already been retrieved from the archive. In
+ * that case, we increment nextSegNo and check again. This repeats until
+ * we reach a segment that has not been exported; that segment number is
+ * returned, since it is the next one that must be read from the
+ * archive.
+ */
+ do
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, timeline, nextSegNo, WalSegSz);
+
+ /*
+ * If the WAL segment has already been exported, increment the counter
+ * and check for the next segment.
+ */
+ exists = false;
+ if (simple_string_list_member(&privateInfo->exportedSegList, fname))
+ {
+ nextSegNo += 1;
+ exists = true;
+ }
+ } while (exists);
+
+ return nextSegNo;
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index d136f8f038e..d57458d3148 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -394,13 +394,14 @@ setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
}
/*
- * Initializes the archive reader for a tar file.
+ * Initializes the tar archive reader and a temporary directory for WAL files.
*/
static void
init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
pg_compress_algorithm compression)
{
int fd;
+ char *tmpdir;
/* Now, the tar archive and store its file descriptor */
fd = open_file_in_directory(waldir, private->archive_name);
@@ -412,6 +413,15 @@ init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
/* Setup tar archive reading facility */
setup_astreamer(private, compression, private->startptr, private->endptr);
+
+ /* Temporary space for writing WAL segments */
+ if (getenv("TMPDIR"))
+ tmpdir = pg_strdup(getenv("TMPDIR"));
+ else
+ tmpdir = waldir != NULL ? pg_strdup(waldir) : pg_strdup(".");
+ canonicalize_path(tmpdir);
+
+ private->tmpdir = tmpdir;
}
/*
@@ -420,6 +430,8 @@ init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
static void
free_tar_archive_reader(XLogDumpPrivate *private)
{
+ SimpleStringListCell *cell;
+
/*
* NB: Normally, astreamer_finalize() is called before astreamer_free() to
* flush any remaining buffered data or to ensure the end of the tar
@@ -433,6 +445,15 @@ free_tar_archive_reader(XLogDumpPrivate *private)
if (close(private->archive_fd) != 0)
pg_log_error("could not close file \"%s\": %m",
private->archive_name);
+
+ /* Clear out any existing temporary files */
+ for (cell = private->exportedSegList.head; cell; cell = cell->next)
+ {
+ char *fpath = get_tmp_wal_file_path(private, cell->val);
+
+ unlink(fpath);
+ pfree(fpath);
+ }
}
/*
@@ -560,7 +581,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -619,12 +640,58 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ char fname[MAXPGPATH];
+ char *fpath;
+
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ fpath = get_tmp_wal_file_path(private, fname);
+ unlink(fpath);
+ pfree(fpath);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, private->timeline, nextSegNo, WalSegSz);
+ if (simple_string_list_member(&private->exportedSegList, fname))
+ {
+ fpath = get_tmp_wal_file_path(private, fname);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+
+ if (state->seg.ws_file < 0)
+ pg_fatal("could not open file \"%s\": %m", fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return astreamer_wal_read(readBuff, targetPagePtr, count, private);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index b5d440500de..614e679cb96 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -13,8 +13,11 @@
#include "access/xlogdefs.h"
#include "fe_utils/astreamer.h"
+#include "fe_utils/simple_list.h"
#include "lib/stringinfo.h"
+#define TEMP_FILE_EXT "waldump.tmp"
+
extern int WalSegSz;
/* Contains the necessary information to drive WAL decoding */
@@ -31,11 +34,30 @@ typedef struct XLogDumpPrivate
astreamer *archive_streamer;
StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
- XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
- until this record pointer */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with
+ * records until this record
+ * pointer */
+ char *tmpdir; /* Temporary directory for exported files */
+ SimpleStringList exportedSegList; /* Temporary exported WAL file list */
} XLogDumpPrivate;
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+static inline char *
+get_tmp_wal_file_path(XLogDumpPrivate *privateInfo, const char *fname)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ snprintf(fpath, MAXPGPATH, "%s/%s.%s", privateInfo->tmpdir, fname,
+ TEMP_FILE_EXT);
+
+ return fpath;
+}
+
extern astreamer *astreamer_waldump_content_new(astreamer *next,
XLogRecPtr startptr,
XLogRecPtr endptr,
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
[application/x-patch] v3-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v3-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From 60c2ecfbe80203c73fb35d763eb63a9fad7fff45 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v3 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v3-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v3-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From 18ce61a331aa5800cd3e42b13faaa3e9b39fdc7e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v3 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
Future patches to pg_waldump will enable it to decode WAL directly
from tar files. This means you'll be able to specify a tar archive
path instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL files. This
change also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
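One visible effect of the rename, sketched with plain getopt_long(): the short switch -w keeps working, the new long spelling --wal-path is accepted, and the old --wal-directory spelling no longer matches any long option. This is a minimal illustration, not pg_verifybackup's parser: parse_wal_path() is a hypothetical helper, the option table contains only this one switch, and the real program's option string and error handling are more involved.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the argument given to -w / --wal-path, or NULL if the switch
 * is absent or an unrecognized option is seen.
 */
static const char *
parse_wal_path(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"wal-path", required_argument, NULL, 'w'},
		{NULL, 0, NULL, 0}
	};
	const char *wal_path = NULL;
	int			c;

	optind = 1;					/* restart scanning on every call */
	opterr = 0;					/* keep the sketch quiet on errors */

	while ((c = getopt_long(argc, argv, "w:", long_options, NULL)) != -1)
	{
		if (c == 'w')
			wal_path = optarg;
		else
			return NULL;		/* unknown option, e.g. the old spelling */
	}

	assert(wal_path == NULL || strlen(wal_path) > 0);
	return wal_path;
}
```

Scripts that still pass --wal-directory would therefore need updating, while those using -w are unaffected.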
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v3-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.3K, 9-v3-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From b3bd75e44fa50c89b52f650d4978fcb69d768652 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v3 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-backed formats.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 3 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 3 +-
6 files changed, 50 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("wal archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..0cfe1f9532c 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -123,8 +123,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..76269a73673 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -137,8 +137,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-08 13:37 Jakub Wartak <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Jakub Wartak @ 2025-09-08 13:37 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Tue, Aug 26, 2025 at 1:53 PM Amul Sul <[email protected]> wrote:
>
[..patch]
Hi Amul!
0001: LGTM, maybe I would just slightly enhance the commit message
("This is in preparation for adding a second source file to this
directory.") -- maybe make it a bit more verbose, or use a message from
0002?
0002: LGTM
0003: LGTM
Tested here (after applying the patches partially; the test suite worked fine).
0004:
a. Why should it be necessary to provide startLSN (-s) ? Couldn't
it autodetect the first WAL (tar file) inside and just use that with
some info message?
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
pg_waldump: error: no start WAL location given
b. Why does it try to open a "blah" directory when I wanted the "blah"
segment from the archive? Shouldn't it say that it looked in the
archive and couldn't find it inside?
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar blah
pg_waldump: error: could not open file "blah": Not a directory
c. It doesn't work when using SEGSTART, even though the segment is there:
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
000000010000000000000059
pg_waldump: error: could not open file "000000010000000000000059":
Not a directory
$ tar tf /tmp/base/pg_wal.tar | head -1
000000010000000000000059
d. I've later noticed that follow-up patches seem to use the
-s switch and there it seems to work OK. The above SEGSTART issue was
not detected, probably because the tests need to be extended to cover a
segment name rather than just --start LSN (see test_pg_waldump):
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats
-s 0/59000358
pg_waldump: first record is after 0/59000358, at 0/590003E8,
skipping over 144 bytes
WAL statistics between 0/590003E8 and 0/61000000:
[..]
e. Code around `if (walpath == NULL && directory != NULL)` needs
some comments.
f. Code around `if (fname != NULL && is_tar_file(fname,
&compression))`: if fname is a WAL segment name here
(00000001000000000000005A), why do we check again whether it is a
tar file (is_tar_file())?
g. Just a question: the commit message says `Note that this patch
requires that the WAL files within the archive be in sequential order;
an error will be reported otherwise`. I'm wondering whether such
occurrences are known to happen in the wild, or whether it is just an
assumption that someone might modify the tar somehow? (Either way,
we could add the reason why we need to handle such a case, if we
know it -- is manual alteration the only source of such a state?) For the
record, I've tested crafting custom archives with out of sequence WAL
archives and the code seems to work (it was done using: tar --append
-f pg_wal.tar --format=ustar ..)
h. Anyway, in case of a typo or wrong LSN, 0004 emits the wrong error
message, I think:
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats
-s 0/50000358
pg_waldump: error: WAL files are not archived in sequential order
pg_waldump: detail: Expecting segment number 80 but found 89.
It's just that LSN 0/50000358 above is below the minimal LSN
present in the WAL segments (the first segment there is
000000010000000000000059; I intentionally provided a bad value, 50..,
as a typo, and it causes the wrong message). This might not be an issue,
as with the 0005 patch the same test behaves OK (`pg_waldump: error:
could not find a valid record after 0/50000358`). It is only relevant if
this is not all committed at once.
i. If I give a wrong --timeline=999 to pg_waldump, it fails with a
misleading error message: could not read WAL data from "pg_wal.tar"
archive: read -1 of 8192
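[Editorial illustration] The segment numbers in the error message under (h) follow directly from the LSNs. The sketch below shows the arithmetic, assuming the default 16 MB WAL segment size. The helper names are made up for this example and are not pg_waldump's own macros.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: default WAL segment size of 16 MB. */
#define WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Hypothetical helper: number of the segment that contains the given LSN. */
static uint64_t
lsn_to_segno(uint64_t lsn)
{
	return lsn / WAL_SEGMENT_SIZE;
}

/*
 * Hypothetical helper: build the 24-character WAL file name
 * (timeline, high part, low part; 8 hex digits each) for a segment.
 */
static void
segno_to_filename(char *buf, size_t buflen, uint32_t tli, uint64_t segno)
{
	/* Number of segments in each 4 GB range of LSN space. */
	uint64_t	segs_per_xlogid = UINT64_C(0x100000000) / WAL_SEGMENT_SIZE;

	assert(buflen >= 25);
	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 tli,
			 (uint32_t) (segno / segs_per_xlogid),
			 (uint32_t) (segno % segs_per_xlogid));
}
```

With these assumptions, LSN 0/50000358 falls in segment 0x50 = 80 and the first archived file, 000000010000000000000059, is segment 0x59 = 89, which matches "Expecting segment number 80 but found 89".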
0005:
a. I'm wondering if we shouldn't log (to stderr?) some kind of
notification message (just once) that non-sequential WAL files were
discovered and that pg_waldump is starting to write to $somewhere, as
that may cause more I/O than anticipated when running the
command. This would help when troubleshooting why it is not fast;
also, TMPDIR is usually set to /tmp, which can be slow or too small.
b. IMHO member_prepare_tmp_write() / get_tmp_wal_file_path() with
TMPDIR can be prone to symlink attack. Consider setting TMPDIR=/tmp .
We are writing to e.g. /tmp/<WALsegment>.waldump.tmp in 0004 , but
that path is completely guessable. If an attacker prepares some
symlinks and links those to some other places, I think the code will
happily open and overwrite the contents of the rogue symlink. I think
using mkstemp(3)/tmpfile(3) would be a safer choice if TMPDIR needs to
be in play. Consider that pg_waldump can be run as root (there's no
mechanism preventing it from being used that way).
c. IMHO that unlink() is not guaranteed to always remove the
files: in case of any trouble followed by exit(), those files might be
left over. I think we need some atexit() handlers. This can be
triggered with a combination of non-sequential files in the tar and a
wrong LSN:
$ tar tf pg_wal.tar
00000001000000000000005A
00000001000000000000005B
00000001000000000000005C
[..]
000000010000000000000060
000000010000000000000059 <-- out of order, appended last
$ ls -lh 0*
ls: cannot access '0*': No such file or directory
$ /usr/pgsql19/bin/pg_waldump --path=/tmp/ble/pg_wal.tar --stats
-s 0/10000358 #wrong LSN
pg_waldump: error: could not find a valid record after 0/10000358
$ ls -lh 0*
-rw------- 1 postgres postgres 16M Sep 8 14:44
000000010000000000000059.waldump.tmp
-rw------- 1 postgres postgres 16M Sep 8 14:44
00000001000000000000005A.waldump.tmp
[..]
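[Editorial illustration] Points (b) and (c) above can be combined in one small sketch: create the spill file with mkstemp(3), so the name is not guessable and an existing symlink is never followed, and register an atexit() handler so the files are removed on any exit() path. This is only a sketch under stated assumptions: create_tmp_wal_file(), cleanup_tmp_files(), the fixed-size path table, and the ".waldump.XXXXXX" naming are hypothetical and not taken from the patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_TMP_FILES 64

static char *tmp_paths[MAX_TMP_FILES];
static int	ntmp_paths = 0;
static int	cleanup_registered = 0;

/* atexit() callback: remove every temporary file created so far. */
static void
cleanup_tmp_files(void)
{
	for (int i = 0; i < ntmp_paths; i++)
	{
		unlink(tmp_paths[i]);
		free(tmp_paths[i]);
		tmp_paths[i] = NULL;
	}
	ntmp_paths = 0;
}

/*
 * Hypothetical helper: create a temporary file for one WAL segment.
 * mkstemp() picks an unpredictable suffix and opens the file with
 * O_CREAT | O_EXCL, so a pre-planted symlink makes it retry with another
 * name instead of being followed. Returns an open fd, or -1 on failure.
 */
static int
create_tmp_wal_file(const char *segname)
{
	const char *tmpdir = getenv("TMPDIR");
	char		path[1024];
	char	   *saved;
	int			len;
	int			fd;

	assert(segname != NULL);

	if (tmpdir == NULL || tmpdir[0] == '\0')
		tmpdir = "/tmp";
	if (ntmp_paths >= MAX_TMP_FILES)
		return -1;

	len = snprintf(path, sizeof(path), "%s/%s.waldump.XXXXXX",
				   tmpdir, segname);
	if (len < 0 || (size_t) len >= sizeof(path))
		return -1;

	/* Register the handler before creating anything that needs cleanup. */
	if (!cleanup_registered)
	{
		if (atexit(cleanup_tmp_files) != 0)
			return -1;
		cleanup_registered = 1;
	}

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	saved = strdup(path);
	if (saved == NULL)
	{
		close(fd);
		unlink(path);
		return -1;
	}
	tmp_paths[ntmp_paths++] = saved;
	return fd;
}
```

Note that atexit() handlers run on exit() and on return from main(), but not on _exit() or when the process is killed by a signal, so this narrows the leftover-file window rather than closing it entirely.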
0006: LGTM
0007:
a. Commit message says `Future patches to pg_waldump will enable
it to decode WAL directly`, but those pg_waldump patches come earlier
in the series, right?
b. pg_verifybackup should print some info with --progress that it
is spawning pg_waldump (pg_verifybackup --progress mode does not
display anything related to verifying WALs, but it could)
c. I'm wondering about this: pg_waldump does not seem to complain if
--end=LSN is so far in the future that it doesn't exist. E.g. if
the latest WAL segment is 60 (with end LSN 0/60A77A59), but I run
pg_waldump `--end=0/7000000`, it will return code 0 and nothing on
stderr. So how sure are we that the necessary WAL segments (as per
backup_manifest) are actually inside the tar? It's supposed to be
verified, but it isn't for this use case? The same happens if I craft a
special tar and remove just one WAL segment from pg_wal.tar (to simulate
a missing WAL segment), but ask pg_verifybackup/pg_waldump to verify
it up to the exact last LSN, e.g.:
$ /usr/pgsql19/bin/pg_waldump --quiet
--path=/tmp/missing/pg_wal.tar --timeline=1 --start=0/59000028
--end=0/60A77A58 && echo OK # but it is not OK
OK
$ /usr/pgsql19/bin/pg_waldump --stats
--path=/tmp/missing/pg_wal.tar --timeline=1 --start=0/59000028
--end=0/60A77A58
WAL statistics between 0/59000028 and 0/5CFFFFD0: # <-- 0/5C LSN
maximum detected
[..]
Notice it has read only up to 0/5C (but I asked for up to 0/60), because
I removed segment 5D:
$ tar tf /tmp/missing/pg_wal.tar| grep ^0
000000010000000000000059
00000001000000000000005A
00000001000000000000005B
00000001000000000000005C
00000001000000000000005E <-- missing 5D
Yet it reported no errors.
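[Editorial illustration] A check of the kind asked for in (c) could look roughly like the sketch below: given the WAL file names found in the archive, report the first segment of the requested range that is absent. It assumes 16 MB segments, and the helper names (parse_wal_filename, first_missing_segment) are hypothetical, not functions from pg_waldump or pg_verifybackup.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 16 MB segments, i.e. 256 segments per 4 GB of LSN space. */
#define SEGS_PER_XLOGID UINT64_C(256)

/*
 * Hypothetical helper: parse a 24-hex-digit WAL file name into timeline
 * and segment number. Returns 0 on success, -1 if the name is not valid.
 */
static int
parse_wal_filename(const char *name, uint32_t *tli, uint64_t *segno)
{
	uint32_t	log;
	uint32_t	seg;

	if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
		return -1;
	if (sscanf(name, "%8" SCNx32 "%8" SCNx32 "%8" SCNx32,
			   tli, &log, &seg) != 3)
		return -1;
	*segno = (uint64_t) log * SEGS_PER_XLOGID + seg;
	return 0;
}

/*
 * Return the first segment number in [first_segno, last_segno] on the
 * wanted timeline that does not appear in names[], or UINT64_MAX if every
 * segment is present. The names may be in any order.
 */
static uint64_t
first_missing_segment(const char *const names[], int nnames,
					  uint32_t want_tli,
					  uint64_t first_segno, uint64_t last_segno)
{
	assert(last_segno < UINT64_MAX);

	for (uint64_t segno = first_segno; segno <= last_segno; segno++)
	{
		int			found = 0;

		for (int i = 0; i < nnames && !found; i++)
		{
			uint32_t	tli;
			uint64_t	s;

			if (parse_wal_filename(names[i], &tli, &s) == 0 &&
				tli == want_tli && s == segno)
				found = 1;
		}
		if (!found)
			return segno;
	}
	return UINT64_MAX;
}
```

For the listing above (59, 5A, 5B, 5C, 5E) and the range --start=0/59000028 to --end=0/60A77A58, which covers segments 0x59 through 0x60, this reports 0x5D as the first missing segment.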
0008:
LGTM
Another open question I have is this: shouldn't backup_manifest come
with CRC checksums for the archived WALs? Or is the guarantee that the
backup_manifest WAL-Ranges are present in pg_wal.tar good enough,
because the individual WAL files are CRC-protected themselves?
-J.
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-12 10:55 Amul Sul <[email protected]>
parent: Jakub Wartak <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-09-12 10:55 UTC (permalink / raw)
To: Jakub Wartak <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Mon, Sep 8, 2025 at 7:07 PM Jakub Wartak
<[email protected]> wrote:
>
> On Tue, Aug 26, 2025 at 1:53 PM Amul Sul <[email protected]> wrote:
> >
> [..patch]
>
> Hi Amul!
>
Thanks for your review. I'm replying to a few of your comments now,
but for the rest, I need to think about them. I'm kind of in agreement
with some of them for the fix, but I won't be able to spend time on
that next week due to official travel. I'll try to get back as soon as
possible after that.
> a. Why should it be necessary to provide startLSN (-s) ? Couldn't
> it autodetect the first WAL (tar file) inside and just use that with
> some info message?
> $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
> pg_waldump: error: no start WAL location given
>
There are two reasons. First, the existing pg_waldump
--path=some_directory results in the same error. Second, it would
force us to read the archive twice just to locate the first WAL
segment, which is inefficient.
> c. It doesn't work when using SEGSTART, even though the segment is there:
> $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar
> 000000010000000000000059
> pg_waldump: error: could not open file "000000010000000000000059":
> Not a directory
> $ tar tf /tmp/base/pg_wal.tar | head -1
> 000000010000000000000059
>
I don't believe this is the correct use case. The WAL files are inside
a tar archive, and the requirement is to use a starting LSN and a
timeline (if not the default).
> d. I've later noticed that follow-up patches seem to use the
> -s switch and there it seems to work OK. The above SEGSTART issue was
> not detected, probably because the tests need to be extended to cover a
> segment name rather than just --start LSN (see test_pg_waldump):
> $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar --stats
> -s 0/59000358
> pg_waldump: first record is after 0/59000358, at 0/590003E8,
> skipping over 144 bytes
> WAL statistics between 0/590003E8 and 0/61000000:
> [..]
>
I hope the previous reasoning makes sense to you.
> e. Code around `if (walpath == NULL && directory != NULL)` needs
> some comments.
>
I think this is an existing one.
> f. Code around `if (fname != NULL && is_tar_file(fname,
> &compression))`: if fname is a WAL segment name here
> (00000001000000000000005A), why do we check again whether it is a
> tar file (is_tar_file())?
>
Again, how?
> g. Just a question: the commit message says `Note that this patch
> requires that the WAL files within the archive be in sequential order;
> an error will be reported otherwise`. I'm wondering whether such
> occurrences are known to happen in the wild, or whether it is just an
> assumption that someone might modify the tar somehow? (Either way,
> we could add the reason why we need to handle such a case, if we
> know it -- is manual alteration the only source of such a state?) For the
> record, I've tested crafting custom archives with out of sequence WAL
> archives and the code seems to work (it was done using: tar --append
> -f pg_wal.tar --format=ustar ..)
>
This is an almost nonexistent occurrence. While pg_basebackup archives
WAL files in sequential order, we don't have explicit code to
enforce that order within it. Furthermore, since we can't control how
external tools might handle the files, this extra precaution is
necessary.
> Another open question I have is this: shouldn't backup_manifest come
> with CRC checksums for the archived WALs? Or is the guarantee that the
> backup_manifest WAL-Ranges are present in pg_wal.tar good enough,
> because the individual WAL files are CRC-protected themselves?
>
I don't know, I have to check pg_verifybackup.
Regards,
Amul
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-12 18:28 Robert Haas <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 2 replies; 85+ messages in thread
From: Robert Haas @ 2025-09-12 18:28 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
Here are some review comments on v3-0004:
In general, I think this looks pretty nice, but I think it needs more
cleanup and polishing.
There doesn't seem to be any reason for
astreamer_waldump_content_new() to take an astreamer *next argument.
If you look at astreamer.h, you'll see that some astreamer_BLAH_new()
functions take such an argument, and others don't. The ones that do
forward their input to another astreamer; the ones that don't, like
astreamer_plain_writer_new(), send it somewhere else. AFAICT, this
astreamer is never going to send its output to another astreamer, so
there's no reason for this argument.
I'm also a little confused by the choice of the name
astreamer_waldump_content_new(). I would have thought this would be
something like astreamer_waldump_new() or astreamer_xlogreader_new().
The word "content" doesn't seem to me to be adding much here, and it
invites confusion with the "content" callback.
I think you can merge setup_astreamer() into
init_tar_archive_reader(). The only other caller is
verify_tar_archive(), but that does exactly the same additional steps
as init_tar_archive_reader(), as far as I can see.
The return statement for astreamer_wal_read is really odd:
+ return (count - nbytes) ? (count - nbytes) : -1;
Since 0 is false in C, this is equivalent to: count != nbytes ? count
- nbytes : -1, but it's a strange way to write it. What makes it even
stranger is that it seems as though the intention here is to count the
number of bytes read, but you do that by taking the number of bytes
requested (count) and subtracting the number of bytes we didn't manage
to read (nbytes); and then you just up and return -1 instead of 0
whenever the answer would have been zero. This is all lacking in
comments and seems a bit more confusing than it needs to be. So my
suggestions are:
1. Consider redefining nbytes to be the number of bytes that you have
read instead of the number of bytes you haven't read. So the loop in
this function would be while (nbytes < count) instead of while (nbytes
> 0).
2. If you need to map 0 to -1, consider having the caller do this
instead of putting that inside this function.
3. Add a comment saying what the return value is supposed to be.
If you do both 1 and 2, then the return statement can just say "return
nbytes;" and the comment can say "Returns the number of bytes
successfully read."
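[Editorial illustration] Suggestions 1 to 3 could be sketched as follows. This is not the patch's code: MemSource and source_fetch() are hypothetical stand-ins for the archive streamer, and wal_read() / read_page() only mirror the roles of astreamer_wal_read() and its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the archive streamer: an in-memory source. */
typedef struct MemSource
{
	const char *data;
	size_t		len;
	size_t		pos;
	size_t		chunk;			/* max bytes handed out per fetch */
} MemSource;

/* Fetch up to 'max' bytes; returns 0 once the source is exhausted. */
static size_t
source_fetch(MemSource *src, char *dst, size_t max)
{
	size_t		n;

	assert(src->pos <= src->len);
	n = src->len - src->pos;
	if (n > max)
		n = max;
	if (n > src->chunk)
		n = src->chunk;
	memcpy(dst, src->data + src->pos, n);
	src->pos += n;
	return n;
}

/*
 * Returns the number of bytes successfully read, which is less than
 * 'count' (possibly 0) only if the source ends early.
 */
static size_t
wal_read(MemSource *src, char *buf, size_t count)
{
	size_t		nbytes = 0;		/* bytes read so far, not bytes remaining */

	while (nbytes < count)
	{
		size_t		n = source_fetch(src, buf + nbytes, count - nbytes);

		if (n == 0)
			break;
		nbytes += n;
	}
	return nbytes;
}

/* Caller side: the 0 => -1 mapping lives here, not in wal_read(). */
static int
read_page(MemSource *src, char *buf, size_t count)
{
	size_t		nbytes = wal_read(src, buf, count);

	return (nbytes == 0) ? -1 : (int) nbytes;
}
```

Keeping the -1 convention in the caller means the reading function has a single, easily documented contract, and only the page-read callback needs to know what its own caller expects on failure.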
I would suggest changing the name of the variable from "readBuff" to
"readBuf". There are no existing uses of readBuff in the code base.
I think this comment also needs improvement:
+ /*
+ * Ignore existing data if the required target page has not yet been
+ * read.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Reset the buffer */
+ resetStringInfo(astreamer_buf);
+ }
This comment is problematic for a few reasons. First, we're not
ignoring the existing data: we're throwing it out. Second, the comment
doesn't say why we're doing what we're doing, only that we're doing
it. Here's my guess at the actual explanation -- please correct me if
I'm wrong: "pg_waldump never reads the same WAL bytes more than once,
so if we're now being asked for data beyond the end of what we've
already read, that means none of the data we currently have in the
buffer will ever be consulted again. So, we can discard the existing
buffer contents and start over." By the way, if this explanation is
correct, it might be nice to add an assertion someplace that verifies
it, like asserting that we're always reading from an LSN greater than
or equal to (or exactly equal to?) the LSN immediately following the
last data we read.
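[Editorial illustration] If the guessed explanation above is right, the discard step and an accompanying assertion might be modelled like this. The WalBuffer struct and buffer_prepare_read() are hypothetical; the assertion used here is the weaker "never read from before the start of the buffer" form, since whether the stricter "exactly equal" form holds depends on how xlogreader issues its reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the streamer's buffer, tracked by LSN. */
typedef struct WalBuffer
{
	uint64_t	start_lsn;		/* LSN of the first buffered byte */
	size_t		len;			/* number of buffered bytes */
} WalBuffer;

/*
 * Prepare the buffer for a read starting at recptr. Returns how many
 * already-buffered bytes can serve that read (0 if none).
 */
static size_t
buffer_prepare_read(WalBuffer *buf, uint64_t recptr)
{
	uint64_t	end_lsn = buf->start_lsn + buf->len;

	/* Assumption: the reader never asks for data before the buffer. */
	assert(recptr >= buf->start_lsn);

	if (recptr >= end_lsn)
	{
		/*
		 * The request lies beyond everything read so far, so none of the
		 * buffered bytes will be consulted again: discard them and start
		 * over at recptr.
		 */
		buf->start_lsn = recptr;
		buf->len = 0;
		return 0;
	}

	return (size_t) (end_lsn - recptr);
}
```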
In general, I wonder whether there's a way to make the separation of
concerns between astreamer_wal_read() and TarWALDumpReadPage()
cleaner. Right now, the latter is basically a stub, but I'm not sure
that is the best thing here. I already mentioned one example of how to
do this: make the responsibility for 0 => -1 translation the job of
TarWALDumpReadPage() rather than astreamer_wal_read(). But I think
there might be a little more we can do. In particular, I wonder
whether we could say that astreamer_wal_read() is only responsible for
filling the buffer, and the caller, TarWALDumpReadPage() in this case,
needs to empty it. That seems like it might produce a cleaner
separation of duties.
Another thing that isn't so nice right now is that
verify_tar_archive() has to open and close the archive only for
init_tar_archive_reader() to be called to reopen it again just moments
later. It would be nicer to open the file just once and then keep it
open. Here again, I wonder if the separation of duties could be a bit
cleaner.
Is there a real need to pass XLogDumpPrivate to astreamer_wal_read or
astreamer_archive_read? The only things that they need are archive_fd,
archive_name, archive_streamer, archive_streamer_buf, and
archive_streamer_read_ptr. In other words, they really don't care
about any of the *existing* things that are in XLogDumpPrivate. This
makes me wonder whether we should actually try to make this new
astreamer completely independent of xlogreader. In other words,
instead of calling it astreamer_waldump() or astreamer_xlogreader() as
I proposed above, maybe it could be a completely generic astreamer,
say astreamer_stringinfo_new(StringInfo *buf) that just appends to the
buffer. That would require also moving the stuff out of
astreamer_wal_read() that knows about XLogRecPtr, but why does that
function need to know about XLogRecPtr? Couldn't the caller figure out
that part and just tell this function how many bytes are needed?
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-12 20:27 Robert Haas <[email protected]>
parent: Robert Haas <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Robert Haas @ 2025-09-12 20:27 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, Sep 12, 2025 at 2:28 PM Robert Haas <[email protected]> wrote:
> Is there a real need to pass XLogDumpPrivate to astreamer_wal_read or
> astreamer_archive_read? The only things that they need are archive_fd,
> archive_name, archive_streamer, archive_streamer_buf, and
> archive_streamer_read_ptr. In other words, they really don't care
> about any of the *existing* things that are in XLogDumpPrivate. This
> makes me wonder whether we should actually try to make this new
> astreamer completely independent of xlogreader. In other words,
> instead of calling it astreamer_waldump() or astreamer_xlogreader() as
> I proposed above, maybe it could be a completely generic astreamer,
> say astreamer_stringinfo_new(StringInfo *buf) that just appends to the
> buffer. That would require also moving the stuff out of
> astreamer_wal_read() that knows about XLogRecPtr, but why does that
> function need to know about XLogRecPtr? Couldn't the caller figure out
> that part and just tell this function how many bytes are needed?
Hmm, on further thought, I think this was a silly idea. Part of the
intended function of this astreamer is to make sure we're only reading
WAL files from the archive, and eventually reordering them if
required, so obviously something completely generic isn't going to
work. Maybe there's a way to make this look a little cleaner and
tidier but this isn't it...
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-25 08:18 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Amul Sul @ 2025-09-25 08:18 UTC (permalink / raw)
To: Jakub Wartak <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, Sep 12, 2025 at 4:25 PM Amul Sul <[email protected]> wrote:
>
> On Mon, Sep 8, 2025 at 7:07 PM Jakub Wartak
> <[email protected]> wrote:
> >
> > On Tue, Aug 26, 2025 at 1:53 PM Amul Sul <[email protected]> wrote:
> > >
> > [..patch]
> >
> > Hi Amul!
> >
>
> Thanks for your review. I'm replying to a few of your comments now,
> but for the rest, I need to think about them. I'm kind of in agreement
> with some of them for the fix, but I won't be able to spend time on
> that next week due to official travel. I'll try to get back as soon as
> possible after that.
>
Replying to the rest of the review comments:
> 0001: LGTM, maybe I would just slightly enhance the commit message
> ("This is in preparation for adding a second source file to this
> directory.") -- maybe a bit more verbose or use a message from
> 0002?
Done.
> b. Why would it like to open "blah" dir if I wanted that "blah"
> segment from the archive? Shouldn't it tell that it was looking in the
> archive and couldn't find it inside?
> $ /usr/pgsql19/bin/pg_waldump --path=/tmp/base/pg_wal.tar blah
> pg_waldump: error: could not open file "blah": Not a directory
Now, an error will be thrown if any additional command-line
arguments are provided when an archive is specified, similar to how
existing extra arguments are handled.
> i. If I give wrong --timeline=999 to pg_waldump it fails with
> misleading error message: could not read WAL data from "pg_wal.tar"
> archive: read -1 of 8192
Added a much better error message for that case.
> a. I'm wondering if we shouldn't log (to stderr?) some kind of
> notification message (just once) that non-sequential WAL files were
> discovered and that pg_waldump is starting to write to $somewhere as
> it may be causing bigger I/O than anticipated when running the
> command. This can easily help when troubleshooting why it is not fast,
> and also having set TMPDIR to usually /tmp can be slow or too small.
Now emitting info messages, though I'm not sure whether they should be
at info or debug level.
> b. IMHO member_prepare_tmp_write() / get_tmp_wal_file_path() with
> TMPDIR can be prone to symlink attack. Consider setting TMPDIR=/tmp .
> We are writing to e.g. /tmp/<WALsegment>.waldump.tmp in 0004 , but
> that path is completely guessable. If an attacker prepares some
> symlinks and links those to some other places, I think the code will
> happily open and overwrite the contents of the rogue symlink. I think
> using mkstemp(3)/tmpfile(3) would be a safer choice if TMPDIR needs to
> be in play. Consider that pg_waldump can be run as root (there's no
> mechanism preventing it from being used that way).
I am not sure what the worst-case scenario would be or what a good
alternative is.
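For reference, a minimal sketch of the mkstemp(3) route suggested in the
review comment above. The helper name open_private_tmp_file() and the
"waldump.XXXXXX" template are made up for illustration and are not part
of the patch set. mkstemp() picks an unpredictable name and creates the
file exclusively with mode 0600, so a symlink planted under a guessed
name is not followed:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in the patch): create a uniquely named temporary
 * file under "dir".  On success the descriptor is returned and "path" holds
 * the generated name; on failure -1 is returned.
 */
int
open_private_tmp_file(const char *dir, char *path, size_t pathlen)
{
	int			n;

	assert(dir != NULL && path != NULL);

	/* Build the template; mkstemp() needs the trailing "XXXXXX". */
	n = snprintf(path, pathlen, "%s/waldump.XXXXXX", dir);
	if (n < 0 || (size_t) n >= pathlen)
		return -1;				/* template would have been truncated */

	/* Replaces XXXXXX in place and opens the file as if with O_EXCL. */
	return mkstemp(path);
}
```

A caller would keep the generated path so the file can be unlinked once
the segment has been consumed.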
> c. IMHO that unlink() might not be guaranteed to always remove
> files, as in case of any trouble and exit(), those files might be
> left over. I think we need some atexit() handlers. This can be
> triggered with combo of options of nonsequential files in tar + wrong
> LSN given:
Done.
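A rough sketch of what an atexit()-based cleanup can look like, for
readers following along. The names remember_tmp_file() and
cleanup_tmp_files() and the fixed-size table are assumptions made for
this example, not the code in the attached patches. Note that atexit()
handlers run on exit() and on return from main(), but not on _exit() or
when the process is killed by a signal:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_TMP_FILES 16		/* arbitrary limit for the sketch */

static char tmp_paths[MAX_TMP_FILES][1024];
static int	n_tmp_paths = 0;
static int	cleanup_registered = 0;

/* Runs at exit(): remove every temporary file we still know about. */
void
cleanup_tmp_files(void)
{
	for (int i = 0; i < n_tmp_paths; i++)
		(void) unlink(tmp_paths[i]);	/* best effort, ignore errors */
	n_tmp_paths = 0;
}

/* Track a temporary file; returns 0 on success, -1 if it can't be tracked. */
int
remember_tmp_file(const char *path)
{
	int			n;

	assert(path != NULL);

	if (n_tmp_paths >= MAX_TMP_FILES)
		return -1;

	n = snprintf(tmp_paths[n_tmp_paths], sizeof(tmp_paths[0]), "%s", path);
	if (n < 0 || (size_t) n >= sizeof(tmp_paths[0]))
		return -1;				/* path too long to store */

	/* Register the handler once, on first use. */
	if (!cleanup_registered)
	{
		if (atexit(cleanup_tmp_files) != 0)
			return -1;
		cleanup_registered = 1;
	}

	n_tmp_paths++;
	return 0;
}
```

With this shape, any error path that ends in exit(1) removes the
leftover files without each call site having to unlink them itself.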
> 0007:
> a. Commit message says `Future patches to pg_waldump will enable
> it to decode WAL directly` , but those pg_waldump are earlier patches,
> right?
Right, fixed.
> b. pg_verifybackup should print some info with --progress that it
> is spawning pg_waldump (pg_verifybackup --progress mode does not
> display anything related to verifying WALs, but it could)
If we decide to do that, it could be a separate project, IMHO.
> c. I'm wondering, but pg_waldump seems to be not complaining if
> --end=LSN is made into such a future that it doesn't exist.
The behavior is kept the same as when a directory is provided with a
start and end LSN.
Thanks again for the review. I'll post the new patches in my next reply.
Regards,
Amul
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-25 08:24 Amul Sul <[email protected]>
parent: Robert Haas <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-09-25 08:24 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, Sep 12, 2025 at 11:58 PM Robert Haas <[email protected]> wrote:
>
> Here are some review comments on v3-0004:
>
Thanks for the review. My replies are below.
> There doesn't seem to be any reason for
> astreamer_waldump_content_new() to take an astreamer *next argument.
> If you look at astreamer.h, you'll see that some astreamer_BLAH_new()
> functions take such an argument, and others don't. The ones that do
> forward their input to another astreamer; the ones that don't, like
> astreamer_plain_writer_new(), send it somewhere else. AFAICT, this
> astreamer is never going to send its output to another astreamer, so
> there's no reason for this argument.
>
Done.
> I'm also a little confused by the choice of the name
> astreamer_waldump_content_new(). I would have thought this would be
> something like astreamer_waldump_new() or astreamer_xlogreader_new().
> The word "content" doesn't seem to me to be adding much here, and it
> invites confusion with the "content" callback.
>
Done -- renamed to astreamer_waldump_new().
> I think you can merge setup_astreamer() into
> init_tar_archive_reader(). The only other caller is
> verify_tar_archive(), but that does exactly the same additional steps
> as init_tar_archive_reader(), as far as I can see.
>
Done.
> The return statement for astreamer_wal_read is really odd:
>
> + return (count - nbytes) ? (count - nbytes) : -1;
>
Agreed, that's a bit odd. This seems to be leftover code from the experimental
patch. The astreamer_wal_read() function should behave like WALRead():
it should either successfully read all the requested bytes or throw an
error. Corrected in the attached version.
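To illustrate the "all requested bytes or an error" contract described
above, here is a small standalone sketch over a plain file descriptor.
read_exactly() is a hypothetical name; the real astreamer_wal_read()
pulls data through the archive streamer and reports failures with
pg_fatal(), which this sketch replaces with fprintf() plus exit():

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly "count" bytes from fd into buf.  There is no short-read
 * return value: the function either fills the whole buffer or exits.
 */
void
read_exactly(int fd, char *buf, size_t count)
{
	size_t		done = 0;

	while (done < count)
	{
		ssize_t		rc = read(fd, buf + done, count - done);

		if (rc < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted, just retry */
			fprintf(stderr, "could not read: %s\n", strerror(errno));
			exit(1);
		}
		if (rc == 0)
		{
			fprintf(stderr, "unexpected end of data: read %zu of %zu\n",
					done, count);
			exit(1);
		}
		done += (size_t) rc;
	}

	/* Mirrors the Assert(nbytes == 0) at the end of astreamer_wal_read(). */
	assert(done == count);
}
```

The caller never has to interpret a partial count, which is what made
the old "(count - nbytes) ? (count - nbytes) : -1" return value odd.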
>
> I would suggest changing the name of the variable from "readBuff" to
> "readBuf". There are no existing uses of readBuff in the code base.
>
The existing WALDumpReadPage() function has a "readBuff" argument, and
I've used it that way for consistency.
> I think this comment also needs improvement:
>
> + /*
> + * Ignore existing data if the required target page has not yet been
> + * read.
> + */
> + if (recptr >= endPtr)
> + {
> + len = 0;
> +
> + /* Reset the buffer */
> + resetStringInfo(astreamer_buf);
> + }
>
> This comment is problematic for a few reasons. First, we're not
> ignoring the existing data: we're throwing it out. Second, the comment
> doesn't say why we're doing what we're doing, only that we're doing
> it. Here's my guess at the actual explanation -- please correct me if
> I'm wrong: "pg_waldump never reads the same WAL bytes more than once,
> so if we're now being asked for data beyond the end of what we've
> already read, that means none of the data we currently have in the
> buffer will ever be consulted again. So, we can discard the existing
> buffer contents and start over." By the way, if this explanation is
> correct, it might be nice to add an assertion someplace that verifies
> it, like asserting that we're always reading from an LSN greater than
> or equal to (or exactly equal to?) the LSN immediately following the
> last data we read.
>
Updated the comment. A similar assertion already exists right before
the copy into readBuff.
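The buffer handling discussed above can be modelled in isolation. In the
sketch below, WalWindow, Lsn and window_lookup() are stand-ins invented
for the example (the patch uses a StringInfo plus
archive_streamer_read_ptr); it only assumes what the thread states, that
the reader moves forward and never asks for bytes before the buffered
range:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;			/* stand-in for XLogRecPtr */

/* The buffer holds the bytes for the LSN range [end - len, end). */
typedef struct WalWindow
{
	Lsn			end;			/* LSN just past the last buffered byte */
	size_t		len;			/* number of buffered bytes */
} WalWindow;

/*
 * Returns how many buffered bytes are usable for a read starting at "req",
 * and sets *offset to where they begin in the buffer.  Returns 0 when more
 * data must be fetched; the buffer is discarded in that case, because a
 * forward-only reader will never need the old bytes again.
 */
size_t
window_lookup(WalWindow *w, Lsn req, size_t *offset)
{
	Lsn			start = (w->end > w->len) ? w->end - w->len : 0;

	if (req >= w->end)
	{
		w->len = 0;				/* throw away the buffered data */
		return 0;
	}

	/* Forward-only reads: the request must not precede the buffer. */
	assert(req >= start);

	*offset = (size_t) (req - start);
	return w->len - *offset;
}
```

For example, with 100 bytes buffered up to LSN 1000, a request at 950
yields offset 50 with 50 usable bytes, while a request at 1000 empties
the buffer and asks the caller to fetch more.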
>
> Another thing that isn't so nice right now is that
> verify_tar_archive() has to open and close the archive only for
> init_tar_archive_reader() to be called to reopen it again just moments
> later. It would be nicer to open the file just once and then keep it
> open. Here again, I wonder if the separation of duties could be a bit
> cleaner.
>
I'd prefer to keep those separate, assuming that reopening the file
won't cause any significant harm. Let me know if you think otherwise.
The updated version is attached; kindly have a look.
Regards,
Amul
Attachments:
[application/x-patch] v4-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.3K, 2-v4-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 8eb84b553d856bbbffda254e419152c236346848 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v4 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v4-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v4-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 9f719d5744c293a91aca5a933b357296180281ff Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v4 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (private->endptr != InvalidXLogRecPtr)
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
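As a worked example of the three cases handled by required_read_len() in
the patch above, here is a standalone restatement with stand-in types.
read_len_sketch(), Lsn and BLCKSZ_SKETCH are names chosen for this
example only; 8192 is the default XLOG_BLCKSZ and 0 plays the role of
InvalidXLogRecPtr:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ_SKETCH 8192		/* stand-in for XLOG_BLCKSZ (default) */

typedef uint64_t Lsn;			/* stand-in for XLogRecPtr; 0 = invalid */

/*
 * Returns the number of bytes to read for the page at "page", or -1 (and
 * sets *end_reached) when the request would go past the end LSN.
 */
int
read_len_sketch(Lsn endptr, Lsn page, int reqLen, bool *end_reached)
{
	if (endptr == 0)
		return BLCKSZ_SKETCH;	/* no end LSN given: always a full page */

	if (page + BLCKSZ_SKETCH <= endptr)
		return BLCKSZ_SKETCH;	/* whole page lies before the end LSN */

	if (page + (Lsn) reqLen <= endptr)
		return (int) (endptr - page);	/* partial page up to the end LSN */

	*end_reached = true;		/* even the minimum request is too far */
	return -1;
}
```

With a page at LSN 16384 and reqLen 100: an end LSN of 32768 gives a
full 8192-byte read, an end LSN of 20000 gives 3616 bytes, and an end
LSN of 16400 returns -1 with the end-reached flag set.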
[application/x-patch] v4-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v4-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From f51ddb6d02ef6e3383beebdd4486d10191499955 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v4 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v4-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (34.9K, 5-v4-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From be0cf9b0d4ff99630298e499f93d09a17eeae141 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v4 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 388 +++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 365 +++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.h | 20 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++++-
src/tools/pgindent/typedefs.list | 1 +
8 files changed, 799 insertions(+), 78 deletions(-)
create mode 100644 src/bin/pg_waldump/astreamer_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ astreamer_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..caf7da6ccb8
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,388 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * When nextSegNo is 0, read from any available WAL file.
+ */
+#define READ_ANY_WAL(mystreamer) ((mystreamer)->nextSegNo == 0)
+
+typedef struct astreamer_waldump
+{
+ /* These fields don't change once initialized. */
+ astreamer base;
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
+ XLogDumpPrivate *privateInfo;
+
+ /* These fields change with archive member. */
+ bool skipThisSeg;
+ XLogSegNo nextSegNo; /* Next expected segment to stream */
+} astreamer_waldump;
+
+static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ TimeLineID startTimeLineID,
+ XLogSegNo *curSegNo);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+ XLogDumpPrivate *privateInfo)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+ while (nbytes > 0)
+ {
+ char *buf = astreamer_buf->data;
+ int len = astreamer_buf->len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr = privateInfo->archive_streamer_read_ptr;
+ XLogRecPtr startPtr = (endPtr > len) ? endPtr - len : 0;
+
+ /*
+ * pg_waldump never asks for the same WAL bytes more than once, so if we're
+ * now being asked for data beyond the end of what we've already read,
+ * that means none of the data we currently have in the buffer will
+ * ever be consulted again. So, we can discard the existing buffer
+ * contents and start over.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Discard the buffered data */
+ resetStringInfo(astreamer_buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the archive streamer
+ * buffer, so skip bytes until reaching the desired offset of the
+ * target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /*
+ * Ensure we are reading the correct page, unless we've received
+ * an invalid record pointer. In that specific case, it's
+ * acceptable to read any page.
+ */
+ Assert(XLogRecPtrIsInvalid(recptr) ||
+ (recptr >= startPtr && recptr < endPtr));
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /* Fetch more data */
+ if (astreamer_archive_read(privateInfo) == 0)
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo segno;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+
+ pg_fatal("could not find file \"%s\" in \"%s\" archive",
+ fname, privateInfo->archive_name);
+ }
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ return count;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ /* Read more data from the tar file */
+ rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents of
+ * the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+astreamer *
+astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endPtr,
+ XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ initStringInfo(&streamer->base.bbs_buffer);
+
+ if (XLogRecPtrIsInvalid(startptr))
+ streamer->startSegNo = 0;
+ else
+ {
+ XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+ /*
+ * Initialize the record pointer to the beginning of the first
+ * segment; this pointer will track the WAL record reading status.
+ */
+ XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+ }
+
+ if (XLogRecPtrIsInvalid(endPtr))
+ streamer->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+ streamer->nextSegNo = streamer->startSegNo;
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segNo;
+
+ pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+ mystreamer->skipThisSeg = false;
+
+ if (!member_is_relevant_wal(mystreamer, member,
+ privateInfo->timeline, &segNo))
+ {
+ mystreamer->skipThisSeg = true;
+ break;
+ }
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (READ_ANY_WAL(mystreamer))
+ break;
+
+ /* WAL segments must be archived in order */
+ if (mystreamer->nextSegNo != segNo)
+ {
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ mystreamer->nextSegNo, segNo);
+ exit(1);
+ }
+
+ /*
+ * We track the reading of WAL segment records using a pointer
+ * that's continuously incremented by the length of the
+ * received data. This pointer is crucial for serving WAL page
+ * requests from the WAL decoding routine, so it must be
+ * accurate.
+ */
+#ifdef USE_ASSERT_CHECKING
+ if (mystreamer->nextSegNo != 0)
+ {
+ XLogRecPtr recPtr;
+
+ XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+ Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+ }
+#endif
+ /* Update the next expected segment number */
+ mystreamer->nextSegNo += 1;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ /* Skip this segment */
+ if (mystreamer->skipThisSeg)
+ break;
+
+ /* Or, copy contents to buffer */
+ privateInfo->archive_streamer_read_ptr += len;
+ astreamer_buffer_bytes(streamer, &data, &len, len);
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+
+ pfree(streamer->bbs_buffer.data);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member,
+ TimeLineID startTimeLineID, XLogSegNo *curSegNo)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The member name may include a leading directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ /* No further checks are needed if any WAL file may be read */
+ if (!READ_ANY_WAL(mystreamer))
+ {
+ /* Ignore if the timeline is different */
+ if (startTimeLineID != timeline)
+ return false;
+
+ /* Skip if the current segment is not the desired one */
+ if (mystreamer->startSegNo > segNo || mystreamer->endSegNo < segNo)
+ return false;
+ }
+
+ *curSegNo = segNo;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'astreamer_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..393d6bfa9ef 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,148 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader and a temporary directory for WAL files.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, const char *waldir,
+ XLogRecPtr startptr, XLogRecPtr endptr,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, private->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", private->archive_name);
+
+ private->archive_fd = fd;
+
+ /*
+ * Create an appropriate chain of archive streamers for reading the given
+ * tar archive.
+ */
+ streamer = astreamer_waldump_new(startptr, endptr, private);
+
+ /*
+ * Final extracted WAL data will reside in this streamer. However, since
+ * it sits at the bottom of the stack and isn't designed to propagate data
+ * upward, we need to hold a pointer to its data buffer in order to copy.
+ */
+ private->archive_streamer_buf = &streamer->bbs_buffer;
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ private->archive_streamer = streamer;
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(private->archive_streamer);
+
+ /* Close the file. */
+ if (close(private->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ PGAlignedXLogBlock buf;
+ int r;
+
+ /* Initialize the reader to stream WAL data from a tar file */
+ init_tar_archive_reader(private, waldir, InvalidXLogRecPtr,
+ InvalidXLogRecPtr, compression);
+
+ /* Read a WAL page */
+ r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+ /* Set WalSegSz if WAL data is successfully read */
+ if (r == XLOG_BLCKSZ)
+ {
+ XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+ WalSegSz),
+ private->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+ }
+ else
+ pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+ private->archive_name, r, XLOG_BLCKSZ);
+
+ free_tar_archive_reader(private);
+}
+
/* Returns the size in bytes of the data to be read. */
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +548,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPagePtr, reqLen);
+ int count = required_read_len(private, targetPtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -436,6 +578,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return astreamer_wal_read(readBuff, targetPagePtr, count, private);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +953,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +986,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_tar = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +1121,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1285,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_tar_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_tar = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1123,46 +1316,36 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_tar)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
- if (waldir == NULL && directory != NULL)
+ if (walpath == NULL && directory != NULL)
{
- waldir = directory;
+ walpath = directory;
- if (!verify_directory(waldir))
+ if (!verify_directory(walpath))
- pg_fatal("could not open directory \"%s\": %m", waldir);
+ pg_fatal("could not open directory \"%s\": %m", walpath);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (XLogRecPtrIsInvalid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_tar_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
+ is_tar = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
+ waldir = identify_target_directory(walpath, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
@@ -1170,32 +1353,70 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
+
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_tar)
+ waldir = identify_target_directory(walpath, NULL);
+
+ /* Verify that the archive contains valid WAL files */
+ if (is_tar)
+ {
+ waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
+ verify_tar_archive(&private, waldir, compression);
+ }
/* we don't know what to print */
if (XLogRecPtrIsInvalid(private.startptr))
@@ -1207,12 +1428,31 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_tar)
+ {
+ /* Set up for reading tar file */
+ init_tar_archive_reader(&private, waldir, private.startptr,
+ private.endptr, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1561,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_tar)
+ free_tar_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..4205e0ef597 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,8 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+#include "lib/stringinfo.h"
extern int WalSegSz;
@@ -22,6 +24,22 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
+ until this record pointer */
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+
+extern astreamer *astreamer_waldump_new(XLogRecPtr startptr,
+ XLogRecPtr endptr,
+ XLogDumpPrivate *privateInfo);
+extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
+ XLogDumpPrivate *privateInfo);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar || $tar eq '';
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..b406ca041ec 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3444,6 +3444,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
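
For readers following the suffix checks in is_tar_file() above, here is a
minimal standalone sketch of the same idea written as a lookup table. It is
not part of the patch: the sketch_* names and the enum are hypothetical
stand-ins (the patch itself uses pg_compress_algorithm), and it only assumes
the five suffixes the patch checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the real enum. */
typedef enum
{
	SKETCH_COMPRESSION_NONE,
	SKETCH_COMPRESSION_GZIP,
	SKETCH_COMPRESSION_LZ4,
	SKETCH_COMPRESSION_ZSTD
} sketch_compression;

/* Suffixes recognized by the patch, paired with the implied compression. */
static const struct
{
	const char *suffix;
	sketch_compression algo;
}			sketch_suffixes[] = {
	{".tar", SKETCH_COMPRESSION_NONE},
	{".tgz", SKETCH_COMPRESSION_GZIP},
	{".tar.gz", SKETCH_COMPRESSION_GZIP},
	{".tar.lz4", SKETCH_COMPRESSION_LZ4},
	{".tar.zst", SKETCH_COMPRESSION_ZSTD},
};

/*
 * Returns true and sets *algo if fname ends in a known tar suffix.
 * As in the patch, at least one character must precede the suffix.
 */
static bool
sketch_is_tar_file(const char *fname, sketch_compression *algo)
{
	size_t		len = strlen(fname);
	size_t		n = sizeof(sketch_suffixes) / sizeof(sketch_suffixes[0]);

	for (size_t i = 0; i < n; i++)
	{
		size_t		slen = strlen(sketch_suffixes[i].suffix);

		if (len > slen &&
			strcmp(fname + len - slen, sketch_suffixes[i].suffix) == 0)
		{
			*algo = sketch_suffixes[i].algo;
			return true;
		}
	}
	return false;
}
```

A table keeps the suffix list in one place, which may make it easier to add
a suffix later; the if/else chain in the patch behaves the same way for
these inputs.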
[application/x-patch] v4-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (19.3K, 6-v4-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
download | inline diff:
From d79a505af9532baae675557de0efb1bcc35d72f5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Mon, 25 Aug 2025 17:26:29 +0530
Subject: [PATCH v4 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files
were not in the required sequence. With this patch, decoding continues:
any WAL file that is out of order is written to a temporary location,
from which it is read later. Once a temporary file has been read, it is
removed.
---
doc/src/sgml/ref/pg_waldump.sgml | 7 +-
src/bin/pg_waldump/astreamer_waldump.c | 214 +++++++++++++++++++++----
src/bin/pg_waldump/pg_waldump.c | 112 ++++++++++++-
src/bin/pg_waldump/pg_waldump.h | 30 +++-
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 323 insertions(+), 43 deletions(-)
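
As a reading aid for the out-of-order handling described in the commit
message above, here is a minimal standalone sketch of how the "next expected
segment" can be advanced past segments that were already written to
temporary files. It is not the patch's code: the sketch_* names are
hypothetical, and a plain array of segment numbers stands in for the list of
exported file names the patch keeps.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of segments already spilled to temporary files. */
typedef struct
{
	const uint64_t *segnos;
	size_t		count;
} sketch_spilled_set;

/* Linear membership test; fine for a sketch with a handful of entries. */
static bool
sketch_is_spilled(const sketch_spilled_set *set, uint64_t segno)
{
	for (size_t i = 0; i < set->count; i++)
	{
		if (set->segnos[i] == segno)
			return true;
	}
	return false;
}

/*
 * Return the first segment after cur that still has to be read from the
 * archive, skipping every segment that already has a temporary copy.
 */
static uint64_t
sketch_next_needed_segno(const sketch_spilled_set *set, uint64_t cur)
{
	uint64_t	next = cur + 1;

	while (sketch_is_spilled(set, next))
		next++;
	return next;
}
```

For example, if segments 5, 6 and 9 have been spilled and segment 4 was just
streamed, the next segment needed from the archive is 7, because 5 and 6 can
be served from their temporary files.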
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..c1afb4097b5 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,11 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written temporarily. These files
+ will be created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, the temporary files will
+ be created within the same directory as the tar archive itself.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
index caf7da6ccb8..40876c77f6c 100644
--- a/src/bin/pg_waldump/astreamer_waldump.c
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -18,8 +18,8 @@
#include "access/xlog_internal.h"
#include "access/xlogdefs.h"
+#include "common/file_perm.h"
#include "common/logging.h"
-#include "fe_utils/simple_list.h"
#include "pg_waldump.h"
/*
@@ -42,10 +42,11 @@ typedef struct astreamer_waldump
/* These fields change with archive member. */
bool skipThisSeg;
+ bool writeThisSeg;
+ FILE *segFp;
XLogSegNo nextSegNo; /* Next expected segment to stream */
} astreamer_waldump;
-static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
astreamer_member *member,
const char *data, int len,
@@ -56,7 +57,12 @@ static void astreamer_waldump_free(astreamer *streamer);
static bool member_is_relevant_wal(astreamer_waldump *mystreamer,
astreamer_member *member,
TimeLineID startTimeLineID,
+ char **curFname,
XLogSegNo *curSegNo);
+static FILE *member_prepare_tmp_write(XLogSegNo curSegNo,
+ const char *fname);
+static XLogSegNo member_next_segno(XLogSegNo curSegNo,
+ TimeLineID timeline);
static const astreamer_ops astreamer_waldump_ops = {
.content = astreamer_waldump_content,
@@ -164,7 +170,7 @@ astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
/*
* Reads the archive and passes it to the archive streamer for decompression.
*/
-static int
+int
astreamer_archive_read(XLogDumpPrivate *privateInfo)
{
int rc;
@@ -208,17 +214,8 @@ astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endPtr,
if (XLogRecPtrIsInvalid(startptr))
streamer->startSegNo = 0;
else
- {
XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
- /*
- * Initialize the record pointer to the beginning of the first
- * segment; this pointer will track the WAL record reading status.
- */
- XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
- privateInfo->archive_streamer_read_ptr);
- }
-
if (XLogRecPtrIsInvalid(endPtr))
streamer->endSegNo = UINT64_MAX;
else
@@ -247,14 +244,16 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
case ASTREAMER_MEMBER_HEADER:
{
- XLogSegNo segNo;
+ char *fname;
pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
mystreamer->skipThisSeg = false;
+ mystreamer->writeThisSeg = false;
if (!member_is_relevant_wal(mystreamer, member,
- privateInfo->timeline, &segNo))
+ privateInfo->timeline,
+ &fname, &privateInfo->curSegNo))
{
mystreamer->skipThisSeg = true;
break;
@@ -267,33 +266,67 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (READ_ANY_WAL(mystreamer))
break;
- /* WAL segments must be archived in order */
- if (mystreamer->nextSegNo != segNo)
+ /*
+ * When WAL segments are not archived sequentially, it becomes
+ * necessary to write out (or preserve) segments that might be
+ * required at a later point.
+ */
+ if (mystreamer->nextSegNo != privateInfo->curSegNo)
{
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- mystreamer->nextSegNo, segNo);
- exit(1);
+ mystreamer->writeThisSeg = true;
+ mystreamer->segFp =
+ member_prepare_tmp_write(privateInfo->curSegNo, fname);
+ break;
}
/*
- * We track the reading of WAL segment records using a pointer
- * that's continuously incremented by the length of the
- * received data. This pointer is crucial for serving WAL page
- * requests from the WAL decoding routine, so it must be
- * accurate.
+ * If the buffer contains data, the next WAL record must
+ * logically follow it. Otherwise, this file isn't the one we
+ * need, and we must export it.
*/
-#ifdef USE_ASSERT_CHECKING
- if (mystreamer->nextSegNo != 0)
+ else if (privateInfo->archive_streamer_buf->len != 0)
{
XLogRecPtr recPtr;
- XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
- Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+ XLogSegNoOffsetToRecPtr(privateInfo->curSegNo, 0, WalSegSz,
+ recPtr);
+
+ if (privateInfo->archive_streamer_read_ptr != recPtr)
+ {
+ mystreamer->writeThisSeg = true;
+ mystreamer->segFp =
+ member_prepare_tmp_write(privateInfo->curSegNo, fname);
+
+ /* Update the next expected segment number after this */
+ mystreamer->nextSegNo =
+ member_next_segno(privateInfo->curSegNo + 1,
+ privateInfo->timeline);
+ break;
+ }
}
-#endif
+
+ Assert(!mystreamer->skipThisSeg);
+ Assert(!mystreamer->writeThisSeg);
+
+ /*
+ * We are now streaming segment content.
+ *
+ * We need to track the reading of WAL segment records using a
+ * pointer that's typically incremented by the length of the
+ * data read. However, we sometimes export the WAL file to
+ * temporary storage, allowing the decoding routine to read
+ * directly from there. This makes continuous pointer
+ * incrementing challenging, as file reads can occur from any
+ * offset, leading to potential errors. Therefore, we now
+ * reset the pointer when reading from a file for streaming.
+ */
+ XLogSegNoOffsetToRecPtr(privateInfo->curSegNo, 0, WalSegSz,
+ privateInfo->archive_streamer_read_ptr);
+
/* Update the next expected segment number */
- mystreamer->nextSegNo += 1;
+ mystreamer->nextSegNo =
+ member_next_segno(privateInfo->curSegNo,
+ privateInfo->timeline);
}
break;
@@ -302,12 +335,45 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
if (mystreamer->skipThisSeg)
break;
+ /* Or, write contents to file */
+ if (mystreamer->writeThisSeg)
+ {
+ Assert(mystreamer->segFp != NULL);
+
+ errno = 0;
+ if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1)
+ {
+ char *fname;
+ int pathlen = strlen(member->pathname);
+
+ Assert(pathlen >= XLOG_FNAME_LEN);
+
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+
+ /*
+ * If write didn't set errno, assume problem is no disk
+ * space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_wal_file_path(fname));
+ }
+ break;
+ }
+
/* Or, copy contents to buffer */
privateInfo->archive_streamer_read_ptr += len;
astreamer_buffer_bytes(streamer, &data, &len, len);
break;
case ASTREAMER_MEMBER_TRAILER:
+ if (mystreamer->segFp != NULL)
+ {
+ fclose(mystreamer->segFp);
+ mystreamer->segFp = NULL;
+ }
+ privateInfo->curSegNo = 0;
break;
case ASTREAMER_ARCHIVE_TRAILER:
@@ -334,8 +400,14 @@ astreamer_waldump_finalize(astreamer *streamer)
static void
astreamer_waldump_free(astreamer *streamer)
{
+ astreamer_waldump *mystreamer;
+
Assert(streamer->bbs_next == NULL);
+ mystreamer = (astreamer_waldump *) streamer;
+ if (mystreamer->segFp != NULL)
+ fclose(mystreamer->segFp);
+
pfree(streamer->bbs_buffer.data);
pfree(streamer);
}
@@ -347,7 +419,8 @@ astreamer_waldump_free(astreamer *streamer)
*/
static bool
member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member,
- TimeLineID startTimeLineID, XLogSegNo *curSegNo)
+ TimeLineID startTimeLineID, char **curFname,
+ XLogSegNo *curSegNo)
{
int pathlen;
XLogSegNo segNo;
@@ -382,7 +455,84 @@ member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member,
return false;
}
+ *curFname = fname;
*curSegNo = segNo;
return true;
}
+
+/*
+ * Create an empty placeholder file and return its handle. The file is also
+ * added to an exported list for future management, e.g. access, deletion, and
+ * existence checks.
+ */
+static FILE *
+member_prepare_tmp_write(XLogSegNo curSegNo, const char *fname)
+{
+ FILE *file;
+ char *fpath = get_tmp_wal_file_path(fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_info("temporarily exporting file \"%s\"", fpath);
+
+ /* Record this segment's export */
+ simple_string_list_append(&TmpWalSegList, fname);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Get next WAL segment that needs to be retrieved from the archive.
+ *
+ * The function checks for the presence of a previously read and extracted WAL
+ * segment in the temporary storage. If a temporary file is found for that
+ * segment, it indicates the segment has already been successfully retrieved
+ * from the archive. In this case, the function increments the segment number
+ * and repeats the check. This process continues until a segment that has not
+ * yet been retrieved is found, at which point the function returns its number.
+ */
+static XLogSegNo
+member_next_segno(XLogSegNo curSegNo, TimeLineID timeline)
+{
+ XLogSegNo nextSegNo = curSegNo + 1;
+ bool exists;
+
+ /*
+ * If we find a file that was previously written to the temporary space,
+ * the corresponding WAL segment request has already been fulfilled. In
+ * that case, we increment nextSegNo and check again for the following
+ * segment. We repeat this until we reach a segment that has not been
+ * exported yet, and return that segment number as the next one needed
+ * from the archive.
+ */
+ do
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, timeline, nextSegNo, WalSegSz);
+
+ /*
+ * If the WAL segment has already been exported, increment the counter
+ * and check for the next segment.
+ */
+ exists = false;
+ if (simple_string_list_member(&TmpWalSegList, fname))
+ {
+ nextSegNo += 1;
+ exists = true;
+ }
+ } while (exists);
+
+ return nextSegNo;
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 393d6bfa9ef..615227b691c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -43,6 +43,10 @@ static const char *progname;
int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
+/* Temporary exported WAL file directory and the list */
+char *TmpWalSegDir = NULL;
+SimpleStringList TmpWalSegList = {NULL, NULL};
+
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
typedef struct XLogDumpConfig
@@ -360,6 +364,41 @@ is_tar_file(const char *fname, pg_compress_algorithm *compression)
return true;
}
+/*
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmp_walseg_dir(const char *waldir)
+{
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory.
+ */
+ TmpWalSegDir = getenv("TMPDIR") ?
+ pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir);
+ canonicalize_path(TmpWalSegDir);
+}
+
+/*
+ * Remove the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+remove_tmp_walseg_dir_atexit(void)
+{
+ SimpleStringListCell *cell;
+
+ /* Clear out any existing temporary files */
+ for (cell = TmpWalSegList.head; cell; cell = cell->next)
+ {
+ char *fpath = get_tmp_wal_file_path(cell->val);
+
+ if (unlink(fpath) == 0)
+ pg_log_info("removed file \"%s\"", fpath);
+ pfree(fpath);
+ }
+}
+
+
/*
* Initializes the tar archive reader and a temporary directory for WAL files.
*/
@@ -404,6 +443,7 @@ init_tar_archive_reader(XLogDumpPrivate *private, const char *waldir,
streamer = astreamer_zstd_decompressor_new(streamer);
private->archive_streamer = streamer;
+ private->curSegNo = 0;
}
/*
@@ -548,7 +588,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
if (private->endptr_reached)
@@ -607,12 +647,70 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = required_read_len(private, targetPtr, reqLen);
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ char *fpath;
+
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ fpath = get_tmp_wal_file_path(fname);
+ pg_log_info("removing file \"%s\"", fpath);
+ unlink(fpath);
+ pfree(fpath);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, private->timeline, nextSegNo, WalSegSz);
+ if (simple_string_list_member(&TmpWalSegList, fname))
+ {
+ fpath = get_tmp_wal_file_path(fname);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+
+ if (state->seg.ws_file < 0)
+ pg_fatal("could not open file \"%s\": %m", fpath);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ {
+ /*
+ * To prevent a race condition where the archive streamer is still
+ * exporting a file that we are trying to read, we invoke the streamer
+ * to ensure enough data is available.
+ */
+ if (private->curSegNo == state->seg.ws_segno)
+ astreamer_archive_read(private);
+
+ return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
+ readBuff);
+ }
+
+ /* Otherwise, read the WAL page from the archive streamer */
return astreamer_wal_read(readBuff, targetPagePtr, count, private);
}
@@ -1340,7 +1438,6 @@ main(int argc, char **argv)
if (fname != NULL && is_tar_file(fname, &compression))
{
private.archive_name = fname;
- waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
is_tar = true;
}
else
@@ -1434,6 +1531,13 @@ main(int argc, char **argv)
init_tar_archive_reader(&private, waldir, private.startptr,
private.endptr, compression);
+ /*
+ * Set up the temporary directory used to store WAL segments, and register
+ * an exit callback to remove the exported segments upon completion.
+ */
+ setup_tmp_walseg_dir(waldir);
+ atexit(remove_tmp_walseg_dir_atexit);
+
/* Routine to decode WAL files in tar archive */
xlogreader_state =
XLogReaderAllocate(WalSegSz, waldir,
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 4205e0ef597..1a1cf35e6f3 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -13,9 +13,14 @@
#include "access/xlogdefs.h"
#include "fe_utils/astreamer.h"
+#include "fe_utils/simple_list.h"
#include "lib/stringinfo.h"
+#define TEMP_FILE_EXT "waldump.tmp"
+
extern int WalSegSz;
+extern char *TmpWalSegDir;
+extern SimpleStringList TmpWalSegList;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -31,15 +36,32 @@ typedef struct XLogDumpPrivate
astreamer *archive_streamer;
StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */
- XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records
- until this record pointer */
+ XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with
+ * records until this record
+ * pointer */
+ XLogSegNo curSegNo; /* Current segment being read */
} XLogDumpPrivate;
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+static inline char *
+get_tmp_wal_file_path(const char *fname)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
-extern astreamer *astreamer_waldump_new(XLogRecPtr startptr,
- XLogRecPtr endptr,
+ snprintf(fpath, MAXPGPATH, "%s/%s.%s", TmpWalSegDir, fname,
+ TEMP_FILE_EXT);
+
+ return fpath;
+}
+
+extern astreamer *astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endptr,
XLogDumpPrivate *privateInfo);
extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
XLogDumpPrivate *privateInfo);
+extern int astreamer_archive_read(XLogDumpPrivate *privateInfo);
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
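
For readers following the patch above, the segment-skipping loop in member_next_segno() can be tried outside the tree. The following is only a minimal standalone sketch, not the patch's code: the `sketch_*` names and the `SketchSegNo` type are hypothetical stand-ins for XLogSegNo, XLogFileName() and simple_string_list_member(), and it assumes the default 16MB segment size (256 segments per xlogid).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's XLogSegNo. */
typedef uint64_t SketchSegNo;

/* 24 hex digits plus the terminating NUL. */
#define SKETCH_FNAME_LEN 25

/*
 * Build a WAL segment file name in the TTTTTTTTXXXXXXXXYYYYYYYY layout.
 * Assumption: default 16MB segments, i.e. 256 segments per xlogid.
 */
static void
sketch_wal_file_name(char *buf, size_t buflen, uint32_t tli, SketchSegNo segno)
{
	const SketchSegNo segs_per_xlogid = 256;

	snprintf(buf, buflen, "%08X%08X%08X",
			 (unsigned int) tli,
			 (unsigned int) (segno / segs_per_xlogid),
			 (unsigned int) (segno % segs_per_xlogid));
}

/* Linear search, standing in for simple_string_list_member(). */
static bool
sketch_is_exported(const char *const *exported, size_t n, const char *fname)
{
	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(exported[i], fname) == 0)
			return true;
	}
	return false;
}

/*
 * Return the first segment after 'cur' whose file name is not in the
 * exported list, i.e. the next segment still needed from the archive.
 */
static SketchSegNo
sketch_next_segno(SketchSegNo cur, uint32_t tli,
				  const char *const *exported, size_t n)
{
	SketchSegNo next = cur + 1;
	char		fname[SKETCH_FNAME_LEN];

	for (;;)
	{
		sketch_wal_file_name(fname, sizeof(fname), tli, next);
		if (!sketch_is_exported(exported, n, fname))
			return next;
		next++;
	}
}
```

With segments 3 and 4 already exported on timeline 1, asking for the segment after 2 yields 5. The sketch returns from inside the loop rather than carrying an "exists" flag; that is purely a stylistic choice here and does not change the result.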
[application/x-patch] v4-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v4-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From 9c768466e35384d3366abd6ce0d04b6932116256 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v4 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v4-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v4-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From 43c598c482171e1c5d764ead6614c95104207aa4 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v4 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path instead of a
traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v4-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.3K, 9-v4-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From ca9b3ccec8143a53c2dbcae3a11a0edf09f5b96b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v4 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 3 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 3 +-
6 files changed, 50 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..0cfe1f9532c 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -123,8 +123,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..76269a73673 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -137,8 +137,7 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-29 15:15 Robert Haas <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Robert Haas @ 2025-09-29 15:15 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Thu, Sep 25, 2025 at 4:25 AM Amul Sul <[email protected]> wrote:
> > Another thing that isn't so nice right now is that
> > verify_tar_archive() has to open and close the archive only for
> > init_tar_archive_reader() to be called to reopen it again just moments
> > later. It would be nicer to open the file just once and then keep it
> > open. Here again, I wonder if the separation of duties could be a bit
> > cleaner.
>
> Prefer to keep those separate, assuming that reopening the file won't
> cause any significant harm. Let me know if you think otherwise.
Well, I guess I'd like to know why we can't do better. I'm not really
worried about performance, but reopening the file means that you can
never make it work with reading from a pipe.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-09-29 16:17 Amul Sul <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-09-29 16:17 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Mon, Sep 29, 2025 at 8:45 PM Robert Haas <[email protected]> wrote:
>
> On Thu, Sep 25, 2025 at 4:25 AM Amul Sul <[email protected]> wrote:
> > > Another thing that isn't so nice right now is that
> > > verify_tar_archive() has to open and close the archive only for
> > > init_tar_archive_reader() to be called to reopen it again just moments
> > > later. It would be nicer to open the file just once and then keep it
> > > open. Here again, I wonder if the separation of duties could be a bit
> > > cleaner.
> >
> > Prefer to keep those separate, assuming that reopening the file won't
> > cause any significant harm. Let me know if you think otherwise.
>
> Well, I guess I'd like to know why we can't do better. I'm not really
> worried about performance, but reopening the file means that you can
> never make it work with reading from a pipe.
I am somewhat skeptical about the extra code this might introduce,
since performance is not my primary concern here. If we aim to open
the file only once, that logic should be implemented before calling
verify_tar_archive(), not inside it. Implementing the open and close
logic within verify_tar_archive() and free_tar_archive_reader() would
create a confusing and scattered pattern, especially since these
separate operations require only two lines of code each (open and
close if it's a tar file). My second concern is that after
verify_tar_archive(), we might need to reset the file reader offset to
the beginning. While reusing the buffered data from the first
iteration is technically possible, that only works if the desired
start LSN is at the absolute beginning of the archive, or later in the
sequence, which cannot be reliably guaranteed. Therefore, for
simplicity and to avoid the complexity of managing that offset reset
code, I am thinking of a simpler approach.
Regards,
Amul
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-10-10 18:01 Robert Haas <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Robert Haas @ 2025-10-10 18:01 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Mon, Sep 29, 2025 at 12:17 PM Amul Sul <[email protected]> wrote:
> While reusing the buffered data
> from the first iteration is technically possible, that only works if
> the desired start LSN is at the absolute beginning of the archive, or
> later in the sequence, which cannot be reliably guaranteed.
I spent a bunch of time studying this code today and I think that the
problem you're talking about here is evidence of a design problem with
astreamer_wal_read() and some of the other code in
astreamer_waldump.c. Your code calls astreamer_wal_read() when it
wants to peek at the first xlog block to determine the WAL segment
size, and it also calls astreamer_wal_read() when it wants to read WAL
sequentially beginning at the start LSN and continuing until it
reaches the end LSN. However, these two cases have very different
requirements. verify_tar_archive(), which is misleadingly named and
really exists to determine the WAL segment size, just wants to read
the first xlog block that physically appears in the archive. Every
xlog block will have the same WAL segment size, so it does not matter
which one we read. On the other hand, TarWALDumpReadPage wants to read
WAL in sequential order. In other words, one call to
astreamer_wal_read() really wants to read a block without any block
reordering, and the other call wants to read a block with block
reordering.
To me, it looks like the problem here is that the block reordering
functionality should live on top of the astreamer, not inside of it.
Imagine that astreamer just spits out the bytes in the order in which
they physically appear in the archive, and then there's another
component that consumes and reorders those bytes. So, you read data
and push it into the astreamer until the number of bytes in the output
buffer is at least XLOG_BLCKSZ, and then from there you extract the
WAL segment size. Then, you call XLogReaderAllocate() and enter the
main loop. The reordering logic lives inside of TarWALDumpReadPage().
Each time it gets data from the astreamer's buffer, it either returns
it to the caller if it's in order or buffers it using temporary files
if not.
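As a rough illustration of the layering described above, the decision
made by the component sitting on top of the astreamer could look like
the sketch below. This is not code from the patch: DemoAction and
demo_route_segment() are hypothetical names, and discarding segments
outside the requested range is an assumption added for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome for a chunk of WAL handed up by the archive reader. */
typedef enum DemoAction
{
    DEMO_DELIVER,               /* in order: return it to the caller */
    DEMO_SPILL,                 /* needed later: buffer it in a temporary file */
    DEMO_DISCARD                /* outside the requested range (assumption) */
} DemoAction;

/*
 * Decide what to do with data belonging to WAL segment 'segno', given the
 * segment the reader wants next and the last segment it will ever want.
 * uint64_t stands in for PostgreSQL's XLogSegNo.
 */
DemoAction
demo_route_segment(uint64_t segno, uint64_t next_wanted, uint64_t end_segno)
{
    assert(next_wanted <= end_segno);

    if (segno < next_wanted || segno > end_segno)
        return DEMO_DISCARD;
    if (segno == next_wanted)
        return DEMO_DELIVER;
    return DEMO_SPILL;
}
```

With this split, the lower layer never needs to know the start or end
LSN; it only emits bytes in physical order.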
I found it's actually quite easy to write a patch that avoids
reopening the file. Here it is, on top of your v4:
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 2c42df46d43..c4346a5e211 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -368,17 +368,8 @@ init_tar_archive_reader(XLogDumpPrivate *private,
const char *waldir,
XLogRecPtr startptr, XLogRecPtr endptr,
pg_compress_algorithm compression)
{
- int fd;
astreamer *streamer;
- /* Open tar archive and store its file descriptor */
- fd = open_file_in_directory(waldir, private->archive_name);
-
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", private->archive_name);
-
- private->archive_fd = fd;
-
/*
* Create an appropriate chain of archive streamers for reading the given
* tar archive.
@@ -1416,12 +1407,22 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
if (is_tar)
{
+ /* Open tar archive and store its file descriptor */
+ private.archive_fd =
+ open_file_in_directory(waldir, private.archive_name);
+ if (private.archive_fd < 0)
+ pg_fatal("could not open file \"%s\"", private.archive_name);
+
/* Verify that the archive contains valid WAL files */
waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
init_tar_archive_reader(&private, waldir, InvalidXLogRecPtr,
InvalidXLogRecPtr, compression);
verify_tar_archive(&private);
- free_tar_archive_reader(&private);
+ astreamer_free(private.archive_streamer);
+
+ if (lseek(private.archive_fd, 0, SEEK_SET) != 0)
+ pg_log_error("could not seek in file \"%s\": %m",
+ private.archive_name);
/* Set up for reading tar file */
init_tar_archive_reader(&private, waldir, private.startptr,
Of course, this is not really what we want to do: it avoids reopening
the file, but because we can't back up the archive streamer once it's
been created, we have to lseek back to the beginning of the file. But
notice how silly this looks: with this patch, we free the archive
reader and immediately create a new archive reader that is exactly the
same in every way except that we call astreamer_waldump_new(startptr,
endptr, private) instead of astreamer_waldump_new(InvalidXLogRecPtr,
InvalidXLogRecPtr, private). We could arrange to update the original
archive streamer with new values of startSegNo and endSegNo after
verify_tar_archive(), but that's still not quite good enough, because
we might have already made some decisions on what to do with the data
that we read that it's too late to reverse. But, what that means is
that the astreamer_waldump machinery is not smart enough to read one
block of data without making irreversible decisions from which we
can't recover without recreating the entire object. I think we can,
and should, try to do better.
It's also worth noting that the unfortunate layering doesn't just
require us to read the first block of the file: it also complicates
the code in various places. The fact that astreamer_wal_read() needs a
special case for XLogRecPtrIsInvalid(recptr) is a direct result of
this problem, and the READ_ANY_WAL() macro and both the places that
test it are also direct results of this problem. In other words, I'm
arguing that astreamer_wal_read() is incorrectly defined, and that
error creates ugliness in the code both above and below
astreamer_wal_read().
While I'm on the topic of astreamer_wal_read(), here are a few other
problems I noticed:
* The return value is not documented, and it seems to always be count,
in which case it might as well return void. The caller already has the
value they passed for count.
* It seems like it would be more appropriate to assert that endPtr >=
len and just set startPtr = endPtr - len. I don't see how len > endPtr
can ever happen, and I bet bad things will happen if it does.
* "pg_waldump never ask the same" -> "pg_waldump never asks for the same"
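The assertion suggested in the second point might be sketched as
follows. WalPtr and read_start_for() are hypothetical stand-ins used
only for this example; they are not names from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t WalPtr;

/*
 * Return the start position of a read of 'len' bytes that ends at 'end_ptr'.
 * Asking for more bytes than precede end_ptr would be a caller bug, so
 * assert instead of silently clamping the result.
 */
WalPtr
read_start_for(WalPtr end_ptr, uint64_t len)
{
    assert(end_ptr >= len);
    return end_ptr - len;
}
```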
Also, this is absolutely not OK with me:
    /* Fetch more data */
    if (astreamer_archive_read(privateInfo) == 0)
    {
        char fname[MAXFNAMELEN];
        XLogSegNo segno;

        XLByteToSeg(targetPagePtr, segno, WalSegSz);
        XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);

        pg_fatal("could not find file \"%s\" in \"%s\" archive",
                 fname, privateInfo->archive_name);
    }
astreamer_archive_read() will return 0 if we reach the end of the
tarfile, so this is saying that if we reach the end of the tar file
without finding the range of bytes for which we're looking, the
explanation must be that the relevant WAL file is missing from the
archive. But that is way too much action at a distance. I was able to
easily construct a counterexample by copying the first 81920 bytes of
a valid WAL file and then doing this:
[robert.haas pgsql-meson]$ tar tf pg_wal.tar
000000010000000000000005
[robert.haas pgsql-meson]$ pg_waldump -s 0/050008D8 -e 0/05FFED98 pg_wal.tar >/dev/null
pg_waldump: error: could not find file "000000010000000000000005" in "pg_wal.tar" archive
Without the redirection to /dev/null, what happened was that
pg_waldump printed out a bunch of records from
000000010000000000000005 and then said that 000000010000000000000005
could not be found, which is obviously silly. But the fact that I
found a specific counterexample here isn't even really the point. The
point is that there's a big gap between what we actually know at this
point (which is that we've read the whole input file) and what the
message is claiming (which is that the reason must be that the file is
missing from the archive). Even if the counterexample above didn't
exist and that really were the only way for that to happen as of
today, that's very fragile. Maybe some future code change will make it
so that there's a second reason that could happen. How would somebody
realize that they had created a second condition by means of which
this code could be reached? If they did realize it, how would they get
the correct error to be reported?
I'm not quite sure how this should be fixed, but I strongly suspect
that the error report here needs to move closer to the code that is
doing the file reordering. Aside from the possibility of the file
being missing and the possibility of the file being too short, a third
possibility is that targetPagePtr retreats between one call and the
next. That really shouldn't happen, but there are no asserts here
verifying that it doesn't.
I also don't like the fact that one call to astreamer_archive_read()
checks the return value (but only whether it's zero, the specific
return value apparently doesn't matter, so why doesn't it return
bool?) and the other doesn't. That kind of coding pattern is very
rarely correct. The code says:
    /* Continue reading from the open WAL segment, if any */
    if (state->seg.ws_file >= 0)
    {
        /*
         * To prevent a race condition where the archive streamer is still
         * exporting a file that we are trying to read, we invoke the streamer
         * to ensure enough data is available.
         */
        if (private->curSegNo == state->seg.ws_segno)
            astreamer_archive_read(private);

        return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
                               readBuff);
    }
But it's unclear why this should be good enough to ensure that enough
data is available. astreamer_archive_read() might read zero bytes and
return 0, so this doesn't really guarantee anything at all. On the
other hand, even if astreamer_archive_read() returns a non-zero
value, it's only going to read READ_CHUNK_SIZE bytes from the
underlying file, so if more than that needs to be read in order for us
to have enough data, we won't. I think it's very hard to imagine a
situation in which you can call astreamer_archive_read() without using
some loop. That's what astreamer_wal_read() does: it calls
astreamer_archive_read() until it either returns 0 -- in which case we
know we've failed -- or until we have enough data. Here we just hope
that calling it once is enough, and that checking for errors is
unimportant. I also don't understand the reference to a race
condition, because there's only one process with one thread here, I
believe, so what would be racing against?
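The looping pattern described above can be sketched roughly as below.
This is a self-contained model, not the patch's code: DemoReader,
demo_fetch_chunk() and demo_ensure_available() are hypothetical names,
and the archive is simulated with simple byte counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reader state; none of these names come from the patch. */
typedef struct DemoReader
{
    size_t      available;      /* bytes of the wanted segment buffered so far */
    size_t      remaining;      /* bytes still unread in the underlying archive */
    size_t      chunk_size;     /* upper bound on bytes fetched per call */
} DemoReader;

/* Fetch at most one chunk; returns the number of bytes obtained (0 at EOF). */
size_t
demo_fetch_chunk(DemoReader *r)
{
    size_t      n;

    assert(r != NULL);
    n = r->remaining < r->chunk_size ? r->remaining : r->chunk_size;
    r->remaining -= n;
    r->available += n;
    return n;
}

/*
 * Keep fetching until at least 'needed' bytes are available.  Returns false
 * if the archive ends first, so the caller can report the failure instead
 * of hoping that a single fetch was enough.
 */
bool
demo_ensure_available(DemoReader *r, size_t needed)
{
    while (r->available < needed)
    {
        if (demo_fetch_chunk(r) == 0)
            return false;
    }
    return true;
}
```

The point of the loop is that both outcomes are explicit: either
enough data has arrived, or the input is exhausted and the caller
learns about it.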
Another thing I noticed is that astreamer_archive_read() makes
reference to decrypting, but there's no cryptography involved in any
of this.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-10-16 11:48 Amul Sul <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-10-16 11:48 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, Oct 10, 2025 at 11:32 PM Robert Haas <[email protected]> wrote:
>
> On Mon, Sep 29, 2025 at 12:17 PM Amul Sul <[email protected]> wrote:
> > While reusing the buffered data
> > from the first iteration is technically possible, that only works if
> > the desired start LSN is at the absolute beginning of the archive, or
> > later in the sequence, which cannot be reliably guaranteed.
>
> I spent a bunch of time studying this code today and I think that the
> problem you're talking about here is evidence of a design problem with
> astreamer_wal_read() and some of the other code in
> astreamer_waldump.c. Your code calls astreamer_wal_read() when it
> wants to peek at the first xlog block to determine the WAL segment
> size, and it also calls astreamer_wal_read() when it wants to read WAL
> sequentially beginning at the start LSN and continuing until it
> reaches the end LSN. However, these two cases have very different
> requirements. verify_tar_archive(), which is misleadingly named and
> really exists to determine the WAL segment size, just wants to read
> the first xlog block that physically appears in the archive. Every
> xlog block will have the same WAL segment size, so it does not matter
> which one we read. On the other hand, TarWALDumpReadPage wants to read
> WAL in sequential order. In other words, one call to
> astreamer_wal_read() really wants to read a block without any block
> reordering, and the other call wants to read a block with block
> reordering.
>
> To me, it looks like the problem here is that the block reordering
> functionality should live on top of the astreamer, not inside of it.
> Imagine that astreamer just spits out the bytes in the order in which
> they physically appear in the archive, and then there's another
> component that consumes and reorders those bytes. So, you read data
> and push it into the astreamer until the number of bytes in the output
> buffer is at least XLOG_BLCKSZ, and then from there you extract the
> WAL segment size. Then, you call XLogReaderAllocate() and enter the
> main loop. The reordering logic lives inside of TarWALDumpReadPage().
> Each time it gets data from the astreamer's buffer, it either returns
> it to the caller if it's in order or buffers it using temporary files
> if not.
>
I initially considered implementing the reordering logic outside of
astreamer when we first discussed this project, but the implementation
could get complicated -- or at least feel hacky. Let me explain why:
astreamer reads the archive in fixed-size chunks (here it is 128KB).
Sometimes, a single read can contain data from two WAL files --
specifically, the tail end of one file and the start of the next --
because of how they’re physically stored in the archive. astreamer
knows where one file ends and another begins through tags like
ASTREAMER_MEMBER_HEADER, ASTREAMER_MEMBER_CONTENTS, and
ASTREAMER_MEMBER_TRAILER. However, it can’t pause mid-chunk and hold
back data from the next file for the caller once the previous one
ends; it pushes the entire chunk it has read to the target buffer.
So, if we put the reordering logic outside the streamer, we’d
sometimes be receiving buffers containing mixed data from two WAL
files. The caller would then need to correctly identify WAL file
boundaries within those buffers. This would require passing extra
metadata -- like segment numbers for the WAL files in the buffer, plus
start and end offsets of those segments within the buffer. While not
impossible, it feels a bit hacky and I'm unsure if that’s the best
approach.
> I found it's actually quite easy to write a patch that avoids
> reopening the file. Here it is, on top of your v4:
>
> diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
> index 2c42df46d43..c4346a5e211 100644
> --- a/src/bin/pg_waldump/pg_waldump.c
> +++ b/src/bin/pg_waldump/pg_waldump.c
> @@ -368,17 +368,8 @@ init_tar_archive_reader(XLogDumpPrivate *private,
> const char *waldir,
> XLogRecPtr startptr, XLogRecPtr endptr,
> pg_compress_algorithm compression)
> {
> - int fd;
> astreamer *streamer;
>
> - /* Open tar archive and store its file descriptor */
> - fd = open_file_in_directory(waldir, private->archive_name);
> -
> - if (fd < 0)
> - pg_fatal("could not open file \"%s\"", private->archive_name);
> -
> - private->archive_fd = fd;
> -
> /*
> * Create an appropriate chain of archive streamers for reading the given
> * tar archive.
> @@ -1416,12 +1407,22 @@ main(int argc, char **argv)
> /* we have everything we need, start reading */
> if (is_tar)
> {
> + /* Open tar archive and store its file descriptor */
> + private.archive_fd =
> + open_file_in_directory(waldir, private.archive_name);
> + if (private.archive_fd < 0)
> + pg_fatal("could not open file \"%s\"", private.archive_name);
> +
> /* Verify that the archive contains valid WAL files */
> waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
> init_tar_archive_reader(&private, waldir, InvalidXLogRecPtr,
> InvalidXLogRecPtr, compression);
> verify_tar_archive(&private);
> - free_tar_archive_reader(&private);
> + astreamer_free(private.archive_streamer);
> +
> + if (lseek(private.archive_fd, 0, SEEK_SET) != 0)
> + pg_log_error("could not seek in file \"%s\": %m",
> + private.archive_name);
>
> /* Set up for reading tar file */
> init_tar_archive_reader(&private, waldir, private.startptr,
>
> Of course, this is not really what we want to do: it avoids reopening
> the file, but because we can't back up the archive streamer once it's
> been created, we have to lseek back to the beginning of the file. But
> notice how silly this looks: with this patch, we free the archive
> reader and immediately create a new archive reader that is exactly the
> same in every way except that we call astreamer_waldump_new(startptr,
> endptr, private) instead of astreamer_waldump_new(InvalidXLogRecPtr,
> InvalidXLogRecPtr, private). We could arrange to update the original
> archive streamer with new values of startSegNo and endSegNo after
> verify_tar_archive(), but that's still not quite good enough, because
> we might have already made some decisions on what to do with the data
> that we read that it's too late to reverse. But, what that means is
> that the astreamer_waldump machinery is not smart enough to read one
> block of data without making irreversible decisions from which we
> can't recover without recreating the entire object. I think we can,
> and should, try to do better.
>
Agreed.
> It's also worth noting that the unfortunate layering doesn't just
> require us to read the first block of the file: it also complicates
> the code in various places. The fact that astreamer_wal_read() needs a
> special case for XLogRecPtrIsInvalid(recptr) is a direct result of
> this problem, and the READ_ANY_WAL() macro and both the places that
> test it are also direct results of this problem. In other words, I'm
> arguing that astreamer_wal_read() is incorrectly defined, and that
> error creates ugliness in the code both above and below
> astreamer_wal_read().
>
> While I'm on the topic of astreamer_wal_read(), here are a few other
> problems I noticed:
>
> * The return value is not documented, and it seems to always be count,
> in which case it might as well return void. The caller already has the
> value they passed for count.
The caller will be xlogreader, and I believe we shouldn't change that.
For the same reason, WALDumpReadPage() also returns the same.
> * It seems like it would be more appropriate to assert that endPtr >=
> len and just set startPtr = endPtr - len. I don't see how len > endPtr
> can ever happen, and I bet bad things will happen if it does.
> * "pg_waldump never ask the same" -> "pg_waldump never asks for the same"
>
Ok.
> Also, this is absolutely not OK with me:
>
> /* Fetch more data */
> if (astreamer_archive_read(privateInfo) == 0)
> {
>     char fname[MAXFNAMELEN];
>     XLogSegNo segno;
>
>     XLByteToSeg(targetPagePtr, segno, WalSegSz);
>     XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
>
>     pg_fatal("could not find file \"%s\" in \"%s\" archive",
>              fname, privateInfo->archive_name);
> }
>
> astreamer_archive_read() will return 0 if we reach the end of the
> tarfile, so this is saying that if we reach the end of the tar file
> without finding the range of bytes for which we're looking, the
> explanation must be that the relevant WAL file is missing from the
> archive. But that is way too much action at a distance. I was able to
> easily construct a counterexample by copying the first 81920 bytes of
> a valid WAL file and then doing this:
>
> [robert.haas pgsql-meson]$ tar tf pg_wal.tar
> 000000010000000000000005
> [robert.haas pgsql-meson]$ pg_waldump -s 0/050008D8 -e 0/05FFED98
> pg_wal.tar >/dev/null
> pg_waldump: error: could not find file "000000010000000000000005" in
> "pg_wal.tar" archive
>
> Without the redirection to /dev/null, what happened was that
> pg_waldump printed out a bunch of records from
> 000000010000000000000005 and then said that 000000010000000000000005
> could not be found, which is obviously silly. But the fact that I
> found a specific counterexample here isn't even really the point. The
> point is that there's a big gap between what we actually know at this
> point (which is that we've read the whole input file) and what the
> message is claiming (which is that the reason must be that the file is
> missing from the archive). Even if the counterexample above didn't
> exist and that really were the only way for that to happen as of
> today, that's very fragile. Maybe some future code change will make it
> so that there's a second reason that could happen. How would somebody
> realize that they had created a second condition by means of which
> this code could be reached? If they did realize it, how would they get
> the correct error to be reported?
>
Agreed, I'll think about this.
>
> /* Continue reading from the open WAL segment, if any */
> if (state->seg.ws_file >= 0)
> {
> /*
> * To prevent a race condition where the archive streamer is still
> * exporting a file that we are trying to read, we invoke the streamer
> * to ensure enough data is available.
> */
> if (private->curSegNo == state->seg.ws_segno)
> astreamer_archive_read(private);
>
> return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
> readBuff);
> }
>
> But it's unclear why this should be good enough to ensure that enough
> data is available. astreamer_archive_read() might read zero bytes and
> return 0, so this doesn't really guarantee anything at all. On the
> other hand, even if astreamer_archive_read() returns a non-zero
> value, it's only going to read READ_CHUNK_SIZE bytes from the
> underlying file, so if more than that needs to be read in order for us
> to have enough data, we won't. I think it's very hard to imagine a
> situation in which you can call astreamer_archive_read() without using
> some loop.
The loop isn't needed because the caller always requests 8KB of data,
while READ_CHUNK_SIZE is 128KB. It’s assumed that the astreamer has
already created the file with some initial data. For example, if only
a few bytes have been written so far, when we reach
TarWALDumpReadPage(), it detects that we’re reading the same file
that the astreamer is still writing to and hasn’t finished. It then
requests another 128KB of data by calling astreamer_archive_read(),
even though we only need 8KB at a time. This process repeats each time
the next 8KB chunk is requested: astreamer_archive_read() appends
another 128KB, and this continues until the file has been fully read
and written.
> That's what astreamer_wal_read() does: it calls
> astreamer_archive_read() until it either returns 0 -- in which case we
> know we've failed -- or until we have enough data. Here we just hope
> that calling it once is enough, and that checking for errors is
> unimportant. I also don't understand the reference to a race
> condition, because there's only one process with one thread here, I
> believe, so what would be racing against?
>
In the case where the astreamer is exporting a file to disk but hasn’t
finished writing it, and we call TarWALDumpReadPage() to request
block(s) from that WAL file, we can read only up to the existing
blocks in the file. Since the file is incomplete, reading may fail
later. To handle this, astreamer_archive_read() is invoked to append
more data -- usually more than the requested amount, as explained
earlier. That is the race condition I am trying to handle.
Now, regarding the concern of astreamer_archive_read() returning zero
without reading or appending any data: this can happen only if the WAL
file is shorter than expected -- that is, incomplete. In that case,
WALDumpReadPage() will raise the appropriate error, so I think we
don't have to check at that point.
> Another thing I noticed is that astreamer_archive_read() makes
> reference to decrypting, but there's no cryptography involved in any
> of this.
>
I think that was a typo -- I meant decompression.
Regards,
Amul
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-10-20 14:34 Robert Haas <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Robert Haas @ 2025-10-20 14:34 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> astreamer reads the archive in fixed-size chunks (here it is 128KB).
> Sometimes, a single read can contain data from two WAL files --
> specifically, the tail end of one file and the start of the next --
> because of how they’re physically stored in the archive. astreamer
> knows where one file ends and another begins through tags like
> ASTREAMER_MEMBER_HEADER, ASTREAMER_MEMBER_CONTENTS, and
> ASTREAMER_MEMBER_TRAILER. However, it can’t pause mid-chunk to hold
> data from the next file once the previous one ends and for the caller;
> it pushes the entire chunk it has read to the target buffer.
Right, this makes sense.
> So, if we put the reordering logic outside the streamer, we’d
> sometimes be receiving buffers containing mixed data from two WAL
> files. The caller would then need to correctly identify WAL file
> boundaries within those buffers. This would require passing extra
> metadata -- like segment numbers for the WAL files in the buffer, plus
> start and end offsets of those segments within the buffer. While not
> impossible, it feels a bit hacky and I'm unsure if that’s the best
> approach.
I agree that we need that kind of metadata, but I don't see why our
need for it depends on where we do the reordering. That is, if we do
the reordering above the astreamer layer, we need to keep track of the
origin of each chunk of WAL bytes, and if we do the reordering within
the astreamer layer, we still need to keep track of the origin of the
WAL bytes. Doing the ordering properly requires that tracking, but it
doesn't say anything about where that tracking has to be performed.
I think it might be better if we didn't write to the astreamer's
buffer at all. For example, suppose we create a struct that looks
approximately like this:
struct ChunkOfDecodedWAL
{
    XLogSegNo segno;        // could also be XLogRecPtr start_lsn or
                            // char *walfilename or whatever
    StringInfoData buffer;
    char *spillfilename;    // or whatever we use to identify the
                            // temporary files
    bool already_removed;
    // potentially other metadata
};
Then, create a hash table and key it on the segno or whatever. Have the
astreamer write to the hash table: when it gets a chunk of WAL, it
looks up or creates the relevant hash table entry and appends the data
to the buffer. At any convenient point in the code, you can decide to
write the data from the buffer to a spill file, after which you
resetStringInfo() on the buffer and populate the spill file name. When
you've used up the data, you remove the spill file and set the
already_removed flag.
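A minimal, self-contained model of that idea is sketched below. It is
only an illustration under stated assumptions: it uses a tiny
fixed-size open-addressing table and malloc'd buffers in place of
PostgreSQL's hash table and StringInfo facilities, leaves out the spill
file itself, and every Demo* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_TABLE_SIZE 64      /* illustrative fixed capacity, power of two */

/* Hypothetical per-segment entry, loosely modeled on the struct above. */
typedef struct DemoSegEntry
{
    bool        used;
    uint64_t    segno;          /* stand-in for XLogSegNo */
    char       *data;           /* in-memory bytes not yet consumed */
    size_t      len;            /* bytes currently held in 'data' */
    size_t      total;          /* bytes ever seen for this segment */
} DemoSegEntry;

typedef struct DemoSegTable
{
    DemoSegEntry slots[DEMO_TABLE_SIZE];
} DemoSegTable;

/* Find the entry for segno, creating it if absent; NULL if the table is full. */
DemoSegEntry *
demo_lookup_or_create(DemoSegTable *t, uint64_t segno)
{
    assert(t != NULL);
    for (size_t probe = 0; probe < DEMO_TABLE_SIZE; probe++)
    {
        DemoSegEntry *e = &t->slots[(segno + probe) & (DEMO_TABLE_SIZE - 1)];

        if (!e->used)
        {
            e->used = true;
            e->segno = segno;
            return e;
        }
        if (e->segno == segno)
            return e;
    }
    return NULL;
}

/* Append a chunk of WAL bytes to the entry for its segment. */
bool
demo_append(DemoSegTable *t, uint64_t segno, const char *buf, size_t n)
{
    DemoSegEntry *e = demo_lookup_or_create(t, segno);
    char       *grown;

    if (e == NULL)
        return false;
    if (n == 0)
        return true;
    grown = realloc(e->data, e->len + n);
    if (grown == NULL)
        return false;
    memcpy(grown + e->len, buf, n);
    e->data = grown;
    e->len += n;
    e->total += n;
    return true;
}

/* Drop the buffered bytes once consumed, keeping the metadata around. */
void
demo_release(DemoSegEntry *e)
{
    free(e->data);
    e->data = NULL;
    e->len = 0;
}
```

Keeping the entry after demo_release() is what preserves the "we saw
this segment and it had N bytes" knowledge for later error reporting.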
I think this could also help with the error reporting stuff. When you
get to the end of the file, you'll know all the files you saw and how
much data you read from each of them. So you could possibly do
something like
ERROR: LSN %08X/%08X not found in archive "%s"
DETAIL: WAL segment %s is not present in the archive
-or-
DETAIL: WAL segment %s was expected to be %u bytes, but was only %u bytes
-or-
DETAIL: whatever else can go wrong
The point is that every file you've ever seen has a hash table entry,
and in that hash table entry you can store everything about that file
that you need to know, whether that's the file data, the disk file
that contains the file data, the fact that we already threw the data
away, or any other fact that you can imagine wanting to know.
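
As a rough illustration of how that per-file bookkeeping could drive the DETAIL line, consider the sketch below. It is only a sketch under stated assumptions: SegmentSeen and missing_wal_detail() are hypothetical names, the expected size is assumed to be known to the caller, and plain snprintf() stands in for the real logging routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-segment bookkeeping; not the struct used in the patch. */
typedef struct SegmentSeen
{
    const char *fname;          /* WAL file name as found in the archive */
    unsigned    expected_size;  /* size the segment should have had */
    unsigned    total_read;     /* bytes actually read from the archive */
} SegmentSeen;

/*
 * Write a DETAIL line for a failed read.  "seen" is NULL when the segment
 * never appeared in the archive at all.
 */
static void
missing_wal_detail(char *out, size_t outlen, const char *wanted_fname,
                   const SegmentSeen *seen)
{
    assert(out != NULL && outlen > 0);

    if (seen == NULL)
        snprintf(out, outlen,
                 "WAL segment %s is not present in the archive",
                 wanted_fname);
    else if (seen->total_read < seen->expected_size)
        snprintf(out, outlen,
                 "WAL segment %s was expected to be %u bytes, but was only %u bytes",
                 seen->fname, seen->expected_size, seen->total_read);
    else
        snprintf(out, outlen,
                 "WAL segment %s was read completely", seen->fname);
}
```

The lookup result alone distinguishes "never saw this file" from "saw it, but it was short", which is the distinction the messages above are after.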
Said differently, the astreamer buffer is not really a great place to
write data. It exists because when we're just forwarding data from one
astreamer to the next, we will often need to buffer a small amount of
data to avoid terrible performance. However, it's only there to be
used when we don't have something better. I don't think any astreamer
that is intended to be the last one in the chain currently writes to
the buffer -- they write to the output file, or whatever, because
using an in-memory buffer as your final output destination is not a
real good plan.
> > While I'm on the topic of astreamer_wal_read(), here are a few other
> > problems I noticed:
> >
> > * The return value is not documented, and it seems to always be count,
> > in which case it might as well return void. The caller already has the
> > value they passed for count.
>
> The caller will be xlogreader, and I believe we shouldn't change that.
> For the same reason, WALDumpReadPage() also returns the same.
OK, but then you can make that clear via a brief comment.
> The loop isn't needed because the caller always requests 8KB of data,
> while READ_CHUNK_SIZE is 128KB. It’s assumed that the astreamer has
> already created the file with some initial data. For example, if only
> a few bytes have been written so far, when we reach
> TarWALDumpReadPage(), it detects that we’re reading the same file
> that the astreamer is still writing to and hasn’t finished. It then asks
> astreamer_archive_read() to append 128KB of data, even though we
> only need 8KB at a time. This process repeats each time the next 8KB chunk is
> requested: astreamer_archive_read() appends another 128KB, and continues until
> the file has been fully read and written.
Sure, but you don't know how much data is going to come out the other
end of the astreamer pipeline. Since the data is (possibly)
compressed, you expect at least as many bytes to emerge from the
output end as you add to the input end, but that is not guaranteed.
Sometimes compressors end up making the data slightly larger instead
of smaller. It's unlikely that the effect would be so dramatic that
adding 128kB to one end of the pipeline would make less than 8kB
emerge from the other end, but it's not a good idea to rely on
assumptions like that. Not that this is a real
thing, but imagine that the compressed file had something in the
middle of it that behaved like a comment in C code, i.e. it didn't
generate any output.
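
One way to avoid depending on any input/output ratio is to keep feeding the pipeline in a loop until enough decoded bytes are available, and to treat running out of input as a failure. The following is a minimal simulation of that loop, not the patch's code: Pipeline, pipeline_feed(), pipeline_fill() and sparse_decode() are hypothetical names, and the toy decoder deliberately produces no output for its first two input chunks to mimic the "comment" case described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FEED_CHUNK (128 * 1024) /* input bytes pushed per call */

typedef struct Pipeline
{
    size_t      input_left;     /* compressed bytes not yet fed */
    size_t      output_avail;   /* decoded bytes produced so far */
    size_t      calls;          /* number of feed calls made */
    /* hypothetical decoder: returns decoded bytes produced for this chunk */
    size_t      (*decode) (size_t in_bytes, size_t call_no);
} Pipeline;

/* Push one chunk of input; returns the input bytes consumed, 0 at EOF. */
static size_t
pipeline_feed(Pipeline *p)
{
    size_t      n = p->input_left < FEED_CHUNK ? p->input_left : FEED_CHUNK;

    assert(p->decode != NULL);
    if (n == 0)
        return 0;
    p->input_left -= n;
    p->output_avail += p->decode(n, p->calls++);
    return n;
}

/*
 * Feed until at least "wanted" decoded bytes exist.  Returns false if the
 * input ends first, so the caller can report the error right here.
 */
static bool
pipeline_fill(Pipeline *p, size_t wanted)
{
    while (p->output_avail < wanted)
    {
        if (pipeline_feed(p) == 0)
            return false;
    }
    return true;
}

/* Toy decoder: the first two input chunks produce no output at all. */
static size_t
sparse_decode(size_t in_bytes, size_t call_no)
{
    (void) in_bytes;
    return call_no < 2 ? 0 : 4096;
}
```

With this shape, a single feed call that yields nothing is harmless, and a zero return from the feed step is checked at the point where it happens rather than being left for a later layer to notice.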
> In the case where the astreamer is exporting a file to disk but hasn’t
> finished writing it, and we call TarWALDumpReadPage() to request
> block(s) from that WAL file, we can read only up to the existing
> blocks in the file. Since the file is incomplete, reading may fail
> later. To handle this, astreamer_archive_read() is invoked to append
> more data -- usually more than the requested amount, as explained
> earlier. That is the race condition I am trying to handle.
That's not what a race condition is:
https://en.wikipedia.org/wiki/Race_condition
> Now, regarding the concern of astreamer_archive_read() returning zero
> without reading or appending any data: this can happen only if the WAL
> is shorter than expected -- i.e., incomplete. In that case,
> WALDumpReadPage() will raise the appropriate error, we don't have to
> check at that point, I think.
I'm not going to accept that kind of justification -- it is too
fragile to assume that you don't need to check for an error because it
"can't happen". Sometimes that is reasonable, but there is quite a lot
of action-at-a-distance here, so it does not feel safe.
> > Another thing I noticed is that astreamer_archive_read() makes
> > reference to decrypting, but there's no cryptography involved in any
> > of this.
>
> I think that was a typo -- I meant decompression.
I figured as much, but it still needs fixing.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-06 09:03 Amul Sul <[email protected]>
parent: Robert Haas <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-11-06 09:03 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
>
> On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
>
> > So, if we put the reordering logic outside the streamer, we’d
> > sometimes be receiving buffers containing mixed data from two WAL
> > files. The caller would then need to correctly identify WAL file
> > boundaries within those buffers. This would require passing extra
> > metadata -- like segment numbers for the WAL files in the buffer, plus
> > start and end offsets of those segments within the buffer. While not
> > impossible, it feels a bit hacky and I'm unsure if that’s the best
> > approach.
>
> I agree that we need that kind of metadata, but I don't see why our
> need for it depends on where we do the reordering. That is, if we do
> the reordering above the astreamer layer, we need to keep track of the
> origin of each chunk of WAL bytes, and if we do the reordering within
> the astreamer layer, we still need to keep track of the origin of the
> WAL bytes. Doing the ordering properly requires that tracking, but it
> doesn't say anything about where that tracking has to be performed.
>
> I think it might be better if we didn't write to the astreamer's
> buffer at all. For example, suppose we create a struct that looks
> approximately like this:
>
> struct ChunkOfDecodedWAL
> {
> XLogSegNo segno; // could also be XLogRecPtr start_lsn or char *walfilename or whatever
> StringInfoData buffer;
> char *spillfilename; // or whatever we use to identify the temporary files
> bool already_removed;
> // potentially other metadata
> };
>
> Then, create a hash table and key it on the segno or whatever. Have the
> astreamer write to the hash table: when it gets a chunk of WAL, it
> looks up or creates the relevant hash table entry and appends the data
> to the buffer. At any convenient point in the code, you can decide to
> write the data from the buffer to a spill file, after which you
> resetStringInfo() on the buffer and populate the spill file name. When
> you've used up the data, you remove the spill file and set the
> already_removed flag.
>
> I think this could also help with the error reporting stuff. When you
> get to the end of the file, you'll know all the files you saw and how
> much data you read from each of them. So you could possibly do
> something like
>
> ERROR: LSN %08X/%08X not found in archive "%s"
> DETAIL: WAL segment %s is not present in the archive
> -or-
> DETAIL: WAL segment %s was expected to be %u bytes, but was only %u bytes
> -or-
> DETAIL: whatever else can go wrong
>
> The point is that every file you've ever seen has a hash table entry,
> and in that hash table entry you can store everything about that file
> that you need to know, whether that's the file data, the disk file
> that contains the file data, the fact that we already threw the data
> away, or any other fact that you can imagine wanting to know.
>
> Said differently, the astreamer buffer is not really a great place to
> write data. It exists because when we're just forwarding data from one
> astreamer to the next, we will often need to buffer a small amount of
> data to avoid terrible performance. However, it's only there to be
> used when we don't have something better. I don't think any astreamer
> that is intended to be the last one in the chain currently writes to
> the buffer -- they write to the output file, or whatever, because
> using an in-memory buffer as your final output destination is not a
> real good plan.
>
Makes sense. I implemented this approach in the attached version, but
with a different structure name and a slightly different error
message. The error output uses the WAL file name instead of the
LSN. This is because the LSN at that point may differ from the
user-provided one (it might have been adjusted to the start of a WAL
page by xlogreader). This follows the same style used in the routine
that reads the WAL file. The LSN values (user provided) are only used
in error messages generated at the very beginning, specifically in the
main() function of pg_waldump.
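
For readers unfamiliar with the mapping, the sketch below shows how an LSN relates to the start of its WAL page and to the name of the segment file that contains it. It is a standalone approximation for illustration only: wal_page_start() and wal_file_name() are hypothetical helpers rather than the XLogFileName() macro, and the 8192-byte page size is assumed to be the default XLOG_BLCKSZ.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_XLOG_BLCKSZ 8192 /* assumed default WAL page size */

/* Round an LSN down to the start of the WAL page containing it. */
static uint64_t
wal_page_start(uint64_t lsn)
{
    return lsn - (lsn % SKETCH_XLOG_BLCKSZ);
}

/*
 * Format the 24-character WAL file name for the segment containing lsn:
 * timeline, then the high and low parts of the segment number, each as
 * eight upper-case hex digits.
 */
static void
wal_file_name(char *out, size_t outlen, uint32_t tli, uint64_t lsn,
              uint32_t wal_segsz)
{
    uint64_t    segno;
    uint64_t    segs_per_id;

    assert(wal_segsz > 0);
    segno = lsn / wal_segsz;
    segs_per_id = UINT64_C(0x100000000) / wal_segsz;

    snprintf(out, outlen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             tli,
             (uint32_t) (segno / segs_per_id),
             (uint32_t) (segno % segs_per_id));
}
```

Since every LSN within a segment maps to the same file name, the file name stays stable even if the reader has moved the position back to a page boundary.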
I have also restructured the code by moving most of the tar file
reading logic out of pg_waldump.c into astreamer_waldump.c, which has
now been renamed to archive_waldump.c.
Kindly have a look at the attached version. Thank you!
Regards,
Amul
Attachments:
[application/x-patch] v5-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.3K, 2-v5-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 9bfed15797bcecf15e828d2b48f64caead36e9bb Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v5 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v5-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v5-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 830dcb9c9f98de3bfc6d0b19d56865ed1e175860 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v5 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (private->endptr != InvalidXLogRecPtr)
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (private->endptr != InvalidXLogRecPtr)
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v5-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v5-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From fdf23c243bc21cefd062d2b4960460722805bbee Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v5 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v5-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (36.7K, 5-v5-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 787fc3c94431dedcc0d37d3d6d9329b62e4d00c5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v5 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 577 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 222 ++++++++---
src/bin/pg_waldump/pg_waldump.h | 36 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 863 insertions(+), 78 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..05ac5763a57 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ archive_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..e619e29d5d4
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,577 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+ uint32 status; /* hash status */
+ XLogSegNo segno; /* hash key: WAL segment number */
+ TimeLineID timeline; /* timeline of this wal file */
+
+ StringInfoData buf;
+ bool tmpseg_exists; /* spill file exists? */
+
+ int total_read; /* total read of this WAL segment, including
+ * buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALEntry
+#define SH_KEY_TYPE XLogSegNo
+#define SH_KEY segno
+#define SH_HASH_KEY(tb, key) murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+ XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ XLogSegNo *curSegNo,
+ TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Now, check the compression type of the tar */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALEntry *entry = NULL;
+ XLogLongPageHeader longhdr;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /* Hash table storing WAL entries read from the archive */
+ ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in \"%s\" archive",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_wal;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+ XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ ArchivedWALEntry *entry;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ entry = get_archive_wal_entry(segno, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf.data;
+ int len = entry->buf.len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+ WalSegSz, endPtr);
+ startPtr = endPtr - len;
+
+ Assert((endPtr - startPtr) == len);
+
+ /*
+ * pg_waldump never asks for the same WAL bytes more than once, so if we're
+ * now being asked for data beyond the end of what we've already read,
+ * that means none of the data we currently have in the buffer will
+ * ever be consulted again. So, we can discard the existing buffer
+ * contents and start over.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Discard the buffered data */
+ resetStringInfo(&entry->buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the buffer, so skip
+ * bytes until reaching the desired offset of the target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /* Ensure reading correct WAL record */
+ Assert(recptr >= startPtr && recptr < endPtr);
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /*
+ * Fetch more data; raise an error if it's not the current segment
+ * being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_wal != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, privateInfo->timeline, entry->segno,
+ WalSegSz);
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+ }
+
+ /*
+ * Should either have successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either successfully read the WAL page or
+ * raise an error, but the caller expects this value to be returned. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(count);
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * keeps reading the archive until the entry for the requested segment shows
+ * up, and reports an error if the archive ends first.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALEntry *entry = NULL;
+ char fname[MAXFNAMELEN];
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry != NULL)
+ return entry;
+
+ /* The needed WAL has not been read from the archive yet; keep reading */
+ while (1)
+ {
+ entry = privateInfo->cur_wal;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf.len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Either we are here for the first time, or the archive streamer is
+ * reading a non-WAL file or an irrelevant WAL file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (entry->segno == segno)
+ return entry;
+
+ /*
+ * Ignore the entry if its timeline is different or its segment is
+ * outside the requested range.
+ */
+ if (privateInfo->timeline != entry->timeline ||
+ privateInfo->startSegNo > entry->segno ||
+ privateInfo->endSegNo < entry->segno)
+ {
+ privateInfo->cur_wal = NULL;
+ continue;
+ }
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ segno, entry->segno);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+ ArchivedWALEntry *entry;
+ bool found;
+
+ pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member,
+ &segno, &timeline))
+ break;
+
+ entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+ member->pathname);
+ break;
+ }
+
+ initStringInfo(&entry->buf);
+ entry->timeline = timeline;
+ entry->total_read = 0;
+
+ privateInfo->cur_wal = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_wal)
+ {
+ appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+ privateInfo->cur_wal->total_read += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_wal = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /*
+ * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+ * PaxHeaders when creating an archive. These are special entries that
+ * store extended metadata for the file entry immediately following them,
+ * and they share the exact same name as that file, so skip them.
+ */
+ if (strstr(member->pathname, "PaxHeaders."))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ *curSegNo = segNo;
+ *curTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..da00746587c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'archive_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..8a838f16ba2 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -177,7 +177,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -436,6 +436,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +811,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +844,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_archive = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +979,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1143,27 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ /*
+ * A NULL WAL directory indicates that the archive file is located
+ * in the current working directory of the pg_waldump execution
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ private.archive_name = fname;
+ is_archive = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
pg_log_error("could not open directory \"%s\": %m", waldir);
goto bad_argument;
@@ -1123,46 +1181,36 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_archive)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
- if (waldir == NULL && directory != NULL)
+ if (walpath == NULL && directory != NULL)
{
- waldir = directory;
+ walpath = directory;
- if (!verify_directory(waldir))
+ if (!verify_directory(walpath))
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (XLogRecPtrIsInvalid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
+ private.archive_name = fname;
+ is_archive = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
+ waldir = identify_target_directory(walpath, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
@@ -1170,32 +1218,63 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (XLogRecPtrIsInvalid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (XLogRecPtrIsInvalid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (XLogRecPtrIsInvalid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_archive)
+ waldir = identify_target_directory(walpath, NULL);
/* we don't know what to print */
if (XLogRecPtrIsInvalid(private.startptr))
@@ -1207,12 +1286,30 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_archive)
+ {
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1418,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_archive)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..54758c3548a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
extern int WalSegSz;
+/* Forward declaration */
+struct ArchivedWALEntry;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALEntry *cur_wal;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if $scenario->{'is_archive'} && (!defined $tar || $tar eq '');
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index bb4e1b37005..de2ad42bcab 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3453,6 +3455,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
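For readers following the read path in the patch above: read_archive_wal_page() treats the per-segment StringInfo as a window onto the streamed segment, whose end is total_read and whose start is total_read minus the buffered length. The following is only a minimal standalone sketch of that idea, not the patch's code. StreamWindow, window_append and window_copy are hypothetical names, the buffer is a small fixed array, and offsets are plain byte offsets rather than XLogRecPtr values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-segment buffer kept by the patch. */
typedef struct StreamWindow
{
	char		data[64];		/* bytes currently buffered */
	size_t		len;			/* number of valid bytes in data */
	size_t		total_read;		/* stream offset just past the last buffered byte */
} StreamWindow;

/* Append newly streamed bytes, as the archive streamer callback would. */
static void
window_append(StreamWindow *w, const char *src, size_t n)
{
	assert(w->len + n <= sizeof(w->data));
	memcpy(w->data + w->len, src, n);
	w->len += n;
	w->total_read += n;
}

/*
 * Copy up to "count" bytes starting at stream offset "off" into dst.
 * Returns the number of bytes copied; 0 means the caller must stream more.
 */
static size_t
window_copy(StreamWindow *w, size_t off, char *dst, size_t count)
{
	size_t		end = w->total_read;
	size_t		start = end - w->len;
	size_t		skip;
	size_t		avail;
	size_t		n;

	if (off >= end)
	{
		/* Request lies past the window: buffered bytes are dead, drop them. */
		w->len = 0;
		return 0;
	}

	/* This sketch assumes a forward-only reader, as the patch does. */
	assert(off >= start);

	skip = off - start;			/* bytes in the buffer before the target */
	avail = w->len - skip;
	n = avail < count ? avail : count;
	memcpy(dst, w->data + skip, n);
	return n;
}
```

A caller would loop, calling window_copy() until the requested count is satisfied and streaming more input whenever it returns 0, which is roughly what the while loop in read_archive_wal_page() does.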
[application/x-patch] v5-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (11.2K, 6-v5-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From 866225e20f1389b94e25c40314b58332d7e0a6c5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v5 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL files
in the archive were not in sequential order. With this patch, decoding
continues: any WAL file that arrives out of order is written to a
temporary location and read from there later. Once a temporary file has
been read, it is removed.
---
src/bin/pg_waldump/archive_waldump.c | 207 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 41 +++++-
src/bin/pg_waldump/pg_waldump.h | 4 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
4 files changed, 243 insertions(+), 12 deletions(-)
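To illustrate the spill-and-consume behaviour described in the commit message, here is a rough standalone sketch, assuming plain stdio rather than the frontend helpers the patch uses. tmp_seg_path, spill_segment and consume_segment are hypothetical names; only the "waldump.tmp" prefix and the two 8-digit hex fields follow the naming in get_tmp_walseg_path().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build "<dir>/waldump.tmp.<hi><lo>", following the naming used in the patch. */
static void
tmp_seg_path(char *out, size_t outlen, const char *dir,
			 uint64_t segno, uint32_t seg_size)
{
	/* Number of segments per 4GB "xlog id", as XLogSegmentsPerXLogId computes. */
	uint64_t	per_id = UINT64_C(0x100000000) / seg_size;

	snprintf(out, outlen, "%s/waldump.tmp.%08" PRIX32 "%08" PRIX32,
			 dir, (uint32_t) (segno / per_id), (uint32_t) (segno % per_id));
}

/* Write an out-of-order segment's bytes to its temporary file; 0 on success. */
static int
spill_segment(const char *path, const void *data, size_t len)
{
	FILE	   *fp = fopen(path, "wb");
	int			ok;

	if (fp == NULL)
		return -1;
	ok = (len == 0 || fwrite(data, len, 1, fp) == 1);
	if (fclose(fp) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read the spilled bytes back, then remove the file since it is read once. */
static long
consume_segment(const char *path, void *dst, size_t cap)
{
	FILE	   *fp = fopen(path, "rb");
	size_t		n;

	if (fp == NULL)
		return -1;
	n = fread(dst, 1, cap, fp);
	fclose(fp);
	remove(path);
	return (long) n;
}
```

The patch itself appends to the temporary file chunk by chunk as the streamer delivers data and reads it back through WALDumpReadPage(); the sketch collapses both into single calls to keep the lifecycle (create, write, read once, remove) visible.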
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index e619e29d5d4..4a280b58ec2 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,11 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+#define TEMP_FILE_PREFIX "waldump.tmp"
+
+/* Temporary exported WAL file directory */
+static char *TmpWalSegDir = NULL;
+
/* Structure for storing the WAL segment data from the archive */
typedef struct ArchivedWALEntry
{
@@ -65,6 +71,11 @@ typedef struct astreamer_waldump
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +131,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
}
/*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -194,6 +206,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
*/
XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpseg_dir(waldir);
+ atexit(cleanup_tmpseg_dir_atexit);
}
/*
@@ -362,13 +381,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/*
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL file from the archive that is not currently needed, that
+ * WAL data is written to a temporary file.
*/
static ArchivedWALEntry *
get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
{
ArchivedWALEntry *entry = NULL;
char fname[MAXFNAMELEN];
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -411,11 +433,32 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
continue;
}
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- segno, entry->segno);
- exit(1);
+ /*
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but that will be needed later. Write it to a temporary
+ * location so that it can be retrieved when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -423,6 +466,150 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
pg_fatal("could not find file \"%s\" in archive", fname);
}
+/*
+ * Set up a directory in which to temporarily store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold the temporary WAL
+ * files.
+ */
+ TmpWalSegDir = getenv("TMPDIR") ?
+ pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir);
+ canonicalize_path(TmpWalSegDir);
+}
+
+/*
+ * Removes the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+ ArchivedWAL_iterator it;
+ ArchivedWALEntry *entry;
+
+ ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+ while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+ {
+ if (entry->tmpseg_exists)
+ {
+ remove_tmp_walseg(entry->segno, false);
+ entry->tmpseg_exists = false;
+ }
+ }
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing it.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ snprintf(fpath, MAXPGPATH, "%s/%s.%08X%08X",
+ TmpWalSegDir,
+ TEMP_FILE_PREFIX,
+ (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+ (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+ return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry == NULL)
+ return false;
+
+ return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+ FILE *file;
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(segno);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("temporarily exporting file \"%s\"", fpath);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_walseg_path(segno));
+ }
+}
+
+/*
+ * Remove temporary file
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+ char *fpath = get_tmp_walseg_path(segno);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ pfree(fpath);
+
+ /* Update entry if requested */
+ if (update_entry)
+ {
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+ Assert(entry != NULL);
+ entry->tmpseg_exists = false;
+ }
+}
+
/*
* Create an astreamer that can read WAL from tar file.
*/
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8a838f16ba2..8acb7809645 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -466,11 +466,50 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
{
XLogDumpPrivate *private = state->private_data;
int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ remove_tmp_walseg(nextSegNo, true);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ if (tmp_walseg_exists(nextSegNo))
+ {
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(nextSegNo);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54758c3548a..5c1fb1e080a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
XLogRecPtr targetPagePtr,
Size count, char *readBuff);
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
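The temporary-file naming used by get_tmp_walseg_path() in the patch above can be illustrated in isolation. The following is only a minimal standalone sketch, not the patch's code: format_tmp_walseg_path(), segments_per_xlogid(), and the directory and prefix strings in the usage note are hypothetical stand-ins for TmpWalSegDir, TEMP_FILE_PREFIX, and XLogSegmentsPerXLogId(), whose actual values are not shown in this part of the thread. It assumes the usual definition of segments per xlog ID as 0x100000000 divided by the segment size.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Number of WAL segments that fit under one 32-bit "xlog ID".
 * Assumption: mirrors XLogSegmentsPerXLogId(), i.e. 4 GB / segment size.
 */
static uint64_t
segments_per_xlogid(uint64_t wal_segsz)
{
	return UINT64_C(0x100000000) / wal_segsz;
}

/*
 * Hypothetical helper: build "<dir>/<prefix>.<HHHHHHHH><LLLLLLLL>", where the
 * two 8-digit hex fields are the high and low parts of the segment number.
 * Returns the value of snprintf(), so callers can detect truncation.
 */
static int
format_tmp_walseg_path(char *dst, size_t dstlen,
					   const char *dir, const char *prefix,
					   uint64_t segno, uint64_t wal_segsz)
{
	uint64_t	per_id = segments_per_xlogid(wal_segsz);

	return snprintf(dst, dstlen, "%s/%s.%08" PRIX32 "%08" PRIX32,
					dir, prefix,
					(uint32_t) (segno / per_id),
					(uint32_t) (segno % per_id));
}
```

With 16 MB segments there are 256 segments per xlog ID, so segment number 257 is split into a high part of 1 and a low part of 1. Unlike a real WAL file name, the suffix carries no timeline ID, which appears to be acceptable here because the reader handles a single timeline per run.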
[application/x-patch] v5-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v5-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From f750f5fece87a9f642225065a540ad4a2d209496 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v5 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v5-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v5-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From 33299daf17137bded756aedbe122232cc4ecc244 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v5 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v5-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.9K, 9-v5-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From 2498f315388fc8a1a840a2b883bca107b0113c0e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v5 8/8] pg_verifybackup: enable WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
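The default WAL location chosen by the patch above follows a simple precedence order, which can be sketched on its own. This is a hedged illustration only: choose_default_wal_path() is a hypothetical helper that does not exist in pg_verifybackup, and it uses a caller-supplied buffer instead of psprintf() so that it compiles without PostgreSQL headers. A NULL result corresponds to the "WAL archive not found" error path.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the precedence in main():
 *   1. plain format: "<backup_dir>/pg_wal"
 *   2. tar format with a separate WAL archive: that archive
 *   3. tar format without one: the base archive (WAL stored inside it)
 *   4. otherwise: NULL, meaning the caller must report an error
 */
static const char *
choose_default_wal_path(char format, const char *backup_dir,
						const char *wal_archive, const char *base_archive,
						char *buf, size_t buflen)
{
	if (format == 'p')
	{
		snprintf(buf, buflen, "%s/pg_wal", backup_dir);
		return buf;
	}
	if (wal_archive != NULL)
		return wal_archive;

	/* May be NULL when no base archive was seen either. */
	return base_archive;
}
```

An explicit -w/--wal-path value takes priority over all of these, since the patch only computes a default when wal_path is still NULL.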
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-17 04:50 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Amul Sul @ 2025-11-17 04:50 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <[email protected]> wrote:
>
> On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
> >
> > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> > [....]
> Kindly have a look at the attached version. Thank you!
>
Attached is the rebased version against the latest master head (e76defbcf09).
Regards,
Amul
Attachments:
[application/octet-stream] v6-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.3K, 2-v6-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From f56a3ce0343d9f539f638b88445aefb256afbfb5 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v6 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index c6d6ba79e44..5846ee24f46 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/octet-stream] v6-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v6-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 835cce8a5ed331215cf6c77075ed0ddea06ee859 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v6 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5846ee24f46..0dc28ea360c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the number of bytes to read, or -1 once endptr is reached. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
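For readers following the refactoring in this patch, the read-size calculation can be sketched on its own. This is only an illustration under stated assumptions, not the patch's code: SKETCH_BLCKSZ, SketchRecPtr, SketchPrivate and sketch_required_read_len are hypothetical stand-ins for XLOG_BLCKSZ, XLogRecPtr, XLogDumpPrivate and required_read_len, and an endptr of 0 is assumed to mean "no end LSN given".

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192		/* hypothetical stand-in for XLOG_BLCKSZ */

typedef uint64_t SketchRecPtr;	/* hypothetical stand-in for XLogRecPtr */

typedef struct SketchPrivate
{
	SketchRecPtr endptr;		/* assumption: 0 means "no end LSN given" */
	bool		endptr_reached;
} SketchPrivate;

/*
 * Return how many bytes to read for the page starting at page_ptr, or -1
 * (and set endptr_reached) when even req_len bytes would run past endptr.
 */
static int
sketch_required_read_len(SketchPrivate *priv, SketchRecPtr page_ptr,
						 int req_len)
{
	if (priv->endptr == 0)
		return SKETCH_BLCKSZ;	/* unbounded: always a full page */

	if (page_ptr + SKETCH_BLCKSZ <= priv->endptr)
		return SKETCH_BLCKSZ;	/* the whole page lies before the end */

	if (page_ptr + (SketchRecPtr) req_len <= priv->endptr)
		return (int) (priv->endptr - page_ptr); /* partial last page */

	priv->endptr_reached = true;
	return -1;
}
```

Because the helper sets endptr_reached itself, a page-read callback only has to test that flag after the call, which is how both the file-based and the tar-based callbacks in this series use it.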
[application/octet-stream] v6-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v6-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From b4d347153a0b1eed353a0549331a52c232aa04ac Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v6 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/octet-stream] v6-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (36.5K, 5-v6-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 187e47acc12b4983a13c9c4aad7fbc66f92db0f6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v6 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 577 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 217 +++++++---
src/bin/pg_waldump/pg_waldump.h | 36 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 860 insertions(+), 76 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
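As a rough illustration of how a --path value might be recognized as a tar archive, the suffix test can be written table-driven. This is a sketch under assumptions, not the patch's implementation: SketchCompression, sketch_has_suffix and sketch_is_archive_file are hypothetical names, and the enum merely stands in for pg_compress_algorithm. The suffix list and the "name must be longer than the suffix" rule follow is_archive_file() in this patch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm. */
typedef enum SketchCompression
{
	SKETCH_NONE,
	SKETCH_GZIP,
	SKETCH_LZ4,
	SKETCH_ZSTD
} SketchCompression;

/* True if name ends with suffix and is strictly longer than it. */
static bool
sketch_has_suffix(const char *name, const char *suffix)
{
	size_t		nlen = strlen(name);
	size_t		slen = strlen(suffix);

	return nlen > slen && strcmp(name + nlen - slen, suffix) == 0;
}

/*
 * Report whether fname looks like a tar archive; on success, store the
 * compression method implied by its suffix in *compression.
 */
static bool
sketch_is_archive_file(const char *fname, SketchCompression *compression)
{
	static const struct
	{
		const char *suffix;
		SketchCompression algo;
	}			table[] = {
		{".tar", SKETCH_NONE},
		{".tgz", SKETCH_GZIP},
		{".tar.gz", SKETCH_GZIP},
		{".tar.lz4", SKETCH_LZ4},
		{".tar.zst", SKETCH_ZSTD},
	};

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
	{
		if (sketch_has_suffix(fname, table[i].suffix))
		{
			*compression = table[i].algo;
			return true;
		}
	}

	return false;
}
```

Note that the decision is based on the file name alone; the archive contents are not inspected at this stage.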
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
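To make the sequential-order requirement documented above concrete, here is a minimal sketch of the decision taken for each WAL segment met while scanning the archive. It mirrors the checks in get_archive_wal_entry() from this patch, but the names SketchAction and sketch_classify_segment are hypothetical, and plain integers stand in for XLogSegNo and TimeLineID.

```c
#include <stdint.h>

typedef enum SketchAction
{
	SKETCH_FOUND,				/* this is the segment being asked for */
	SKETCH_SKIP,				/* irrelevant segment, keep scanning */
	SKETCH_OUT_OF_ORDER			/* relevant segment seen too early: error */
} SketchAction;

/*
 * Classify the segment currently being streamed from the archive against
 * the segment the reader wants and the requested start/end segment range.
 */
static SketchAction
sketch_classify_segment(uint64_t wanted_seg, uint64_t seen_seg,
						uint32_t wanted_tli, uint32_t seen_tli,
						uint64_t start_seg, uint64_t end_seg)
{
	if (seen_seg == wanted_seg)
		return SKETCH_FOUND;

	/* Other timelines and segments outside the range are ignored. */
	if (seen_tli != wanted_tli ||
		seen_seg < start_seg || seen_seg > end_seg)
		return SKETCH_SKIP;

	/* In range, but not the next one expected. */
	return SKETCH_OUT_OF_ORDER;
}
```

Only a segment that is in range, on the right timeline, and still not the one requested triggers the error; unrelated archive members never do.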
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..05ac5763a57 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ archive_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..2830c89a7be
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,577 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+ uint32 status; /* hash status */
+ XLogSegNo segno; /* hash key: WAL segment number */
+ TimeLineID timeline; /* timeline of this wal file */
+
+ StringInfoData buf;
+ bool tmpseg_exists; /* spill file exists? */
+
+ int total_read; /* total read of this WAL segment, including
+ * buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALEntry
+#define SH_KEY_TYPE XLogSegNo
+#define SH_KEY segno
+#define SH_HASH_KEY(tb, key) murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+ XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ XLogSegNo *curSegNo,
+ TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALEntry *entry = NULL;
+ XLogLongPageHeader longhdr;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /* Hash table storing WAL entries read from the archive */
+ ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in \"%s\" archive",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_wal;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+ XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ ArchivedWALEntry *entry;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ entry = get_archive_wal_entry(segno, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf.data;
+ int len = entry->buf.len;
+
+ /* WAL location range that the buffer contains */
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+ WalSegSz, endPtr);
+ startPtr = endPtr - len;
+
+ Assert((endPtr - startPtr) == len);
+
+ /*
+ * pg_waldump never asks for the same WAL bytes more than once, so if we're
+ * now being asked for data beyond the end of what we've already read,
+ * that means none of the data we currently have in the buffer will
+ * ever be consulted again. So, we can discard the existing buffer
+ * contents and start over.
+ */
+ if (recptr >= endPtr)
+ {
+ len = 0;
+
+ /* Discard the buffered data */
+ resetStringInfo(&entry->buf);
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the buffer, so skip
+ * bytes until reaching the desired offset of the target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /* Ensure the requested location falls within the buffer */
+ Assert(recptr >= startPtr && recptr < endPtr);
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /*
+ * Fetch more data; raise an error if it's not the current segment
+ * being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_wal != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, privateInfo->timeline, entry->segno,
+ WalSegSz);
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. A boolean would
+ * suffice, since we either successfully read the WAL page or raise an
+ * error, but the caller expects this value to be returned. The routine
+ * that reads WAL pages from the physical WAL file follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Returns the archived WAL entry for the given segment from the hash table
+ * if it exists. Otherwise, reads further into the archive until the entry
+ * appears, and reports an error if the archive ends without it.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALEntry *entry = NULL;
+ char fname[MAXFNAMELEN];
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry != NULL)
+ return entry;
+
+ /* The needed WAL has not been read from the archive yet; read on */
+ while (1)
+ {
+ entry = privateInfo->cur_wal;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf.len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Either we are here for the first time, or the archive streamer is
+ * reading a non-WAL file or an irrelevant WAL file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (entry->segno == segno)
+ return entry;
+
+ /*
+ * Ignore if the timeline is different or the current segment lies
+ * outside the requested range.
+ */
+ if (privateInfo->timeline != entry->timeline ||
+ privateInfo->startSegNo > entry->segno ||
+ privateInfo->endSegNo < entry->segno)
+ {
+ privateInfo->cur_wal = NULL;
+ continue;
+ }
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ segno, entry->segno);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+ ArchivedWALEntry *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member,
+ &segno, &timeline))
+ break;
+
+ entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+ member->pathname);
+ break;
+ }
+
+ initStringInfo(&entry->buf);
+ entry->timeline = timeline;
+ entry->total_read = 0;
+
+ privateInfo->cur_wal = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_wal)
+ {
+ appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+ privateInfo->cur_wal->total_read += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_wal = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /*
+ * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+ * PaxHeaders when creating an archive. These are special entries that
+ * store extended metadata for the file entry immediately following them,
+ * and they share the exact same name as that file, so skip them.
+ */
+ if (strstr(member->pathname, "PaxHeaders."))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ *curSegNo = segNo;
+ *curTimeline = timeline;
+
+ return true;
+}
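The buffer arithmetic in read_archive_wal_page() above may be easier to follow in isolation. The following is a minimal sketch, assuming the buffer always holds the most recent len bytes of a segment and that end_off (total_read in the patch) is the offset just past the last buffered byte. SketchWindow and sketch_copy_from_window are hypothetical names; unlike the patch, which asserts that the request never precedes the buffered range, this sketch simply returns 0 for any offset outside it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct SketchWindow
{
	const char *data;			/* buffered bytes */
	size_t		len;			/* number of buffered bytes */
	uint64_t	end_off;		/* offset just past the last buffered byte */
} SketchWindow;

/*
 * Copy up to want bytes starting at absolute offset off out of the window.
 * Returns the number of bytes copied, or 0 if off lies outside the window.
 */
static size_t
sketch_copy_from_window(const SketchWindow *w, uint64_t off,
						char *dst, size_t want)
{
	uint64_t	start_off = w->end_off - w->len;
	size_t		skip;
	size_t		avail;
	size_t		n;

	if (off < start_off || off >= w->end_off)
		return 0;

	/* Skip the bytes that precede the requested offset. */
	skip = (size_t) (off - start_off);
	avail = w->len - skip;
	n = avail < want ? avail : want;

	memcpy(dst, w->data + skip, n);
	return n;
}
```

A caller would loop as the patch does: copy what the window offers, advance the offset by the returned count, and fetch more archive data whenever the result is 0.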
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..da00746587c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'archive_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 0dc28ea360c..7425d386d0c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -177,7 +177,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -436,6 +436,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +811,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +844,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_archive = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +979,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1143,27 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ /*
+ * A NULL WAL directory indicates that the archive file is located
+ * in pg_waldump's current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ private.archive_name = fname;
+ is_archive = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1123,6 +1181,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_archive)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1133,69 +1202,78 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ waldir = walpath ? pg_strdup(walpath) : pg_strdup(".");
+ private.archive_name = fname;
+ is_archive = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_archive)
+ waldir = identify_target_directory(walpath, NULL);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1207,12 +1285,30 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_archive)
+ {
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1417,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_archive)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..54758c3548a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
extern int WalSegSz;
+/* Forward declaration */
+struct ArchivedWALEntry;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALEntry *cur_wal;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 23bce72ae64..0c8d6bfa3e1 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3461,6 +3463,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
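The path handling added to main() in the patch above can be illustrated in isolation. The following is only a sketch, not code from the patch: copy_prefix(), has_suffix(), looks_like_tar_archive() and split_archive_path() are hypothetical names, and the suffix check is an assumed stand-in for is_archive_file(), whose body is not shown in this part of the thread. It shows an archive path being split into a directory and a file name, with the directory defaulting to "." when the path has no directory part, as the patch does for waldir.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: malloc'd copy of the first 'len' bytes of 's'. */
static char *
copy_prefix(const char *s, size_t len)
{
	char	   *result = malloc(len + 1);

	assert(result != NULL);
	memcpy(result, s, len);
	result[len] = '\0';
	return result;
}

/* Hypothetical helper: true if 'name' ends with 'suffix'. */
static bool
has_suffix(const char *name, const char *suffix)
{
	size_t		nlen = strlen(name);
	size_t		slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Assumed stand-in for is_archive_file(): accepts only .tar and .tar.gz. */
static bool
looks_like_tar_archive(const char *path)
{
	return has_suffix(path, ".tar") || has_suffix(path, ".tar.gz");
}

/*
 * Hypothetical helper: split 'path' into a directory and a file name.
 * A path without a directory part yields ".", mirroring the waldir default.
 */
static void
split_archive_path(const char *path, char **dir, char **fname)
{
	const char *sep = strrchr(path, '/');

	if (sep == NULL)
	{
		*dir = copy_prefix(".", 1);
		*fname = copy_prefix(path, strlen(path));
	}
	else
	{
		/* Keep a lone leading slash so "/x.tar" yields "/" rather than "". */
		size_t		dlen = (sep == path) ? 1 : (size_t) (sep - path);

		*dir = copy_prefix(path, dlen);
		*fname = copy_prefix(sep + 1, strlen(sep + 1));
	}
}
```

In pg_waldump itself, split_path() leaves the directory NULL when there is no separator and main() substitutes "." afterwards; the sketch folds both steps into one function for brevity.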
[application/octet-stream] v6-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (11.2K, 6-v6-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From 7422ff4700b7493eef3c9097e86f99da5aaeeac3 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v6 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL
files in the archive were not in sequential order. With this patch,
decoding continues: any WAL file that is out of order is written to a
temporary location, from which it is read later. Once a temporary
file has been read, it is removed.
---
src/bin/pg_waldump/archive_waldump.c | 207 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 41 +++++-
src/bin/pg_waldump/pg_waldump.h | 4 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
4 files changed, 243 insertions(+), 12 deletions(-)
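The behaviour described in the commit message can be modelled with a small simulation. This is a minimal sketch under simplifying assumptions, not the patch's implementation: SpillSim and fetch_segment() are hypothetical names, segments are represented by small integers, whole members are "spilled" at once (the patch streams data in chunks), and a spilled segment is dropped as soon as it is fetched (the patch removes the temporary file when the reader moves on to another segment).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 16

/* Hypothetical model of an archive being streamed front to back. */
typedef struct SpillSim
{
	const unsigned *members;	/* segment numbers in archive order */
	size_t		nmembers;		/* number of archive members */
	size_t		next;			/* next archive member to stream */
	bool		spilled[MAX_SEGS];	/* segment currently in a temp file? */
	int			total_spills;	/* how many segments were ever spilled */
} SpillSim;

/*
 * Serve one requested segment.  Returns true if it was found, either in a
 * temporary file or by streaming further into the archive.
 */
static bool
fetch_segment(SpillSim *sim, unsigned segno)
{
	assert(segno < MAX_SEGS);

	/* Already spilled earlier: read it back and drop the temporary copy. */
	if (sim->spilled[segno])
	{
		sim->spilled[segno] = false;
		return true;
	}

	/* Otherwise keep streaming, spilling every member we did not ask for. */
	while (sim->next < sim->nmembers)
	{
		unsigned	cur = sim->members[sim->next++];

		if (cur == segno)
			return true;

		assert(cur < MAX_SEGS);
		sim->spilled[cur] = true;
		sim->total_spills++;
	}

	/* Archive exhausted: the requested segment is not present. */
	return false;
}
```

With an archive order of 3, 1, 2 and requests for 1, 2, 3, only segment 3 is spilled: it is encountered while looking for segment 1 and read back from its temporary copy at the end.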
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 2830c89a7be..7c8f17ba135 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,11 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+#define TEMP_FILE_PREFIX "waldump.tmp"
+
+/* Temporary exported WAL file directory */
+static char *TmpWalSegDir = NULL;
+
/* Structure for storing the WAL segment data from the archive */
typedef struct ArchivedWALEntry
{
@@ -65,6 +71,11 @@ typedef struct astreamer_waldump
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +131,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
}
/*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -194,6 +206,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
*/
XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpseg_dir(waldir);
+ atexit(cleanup_tmpseg_dir_atexit);
}
/*
@@ -362,13 +381,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/*
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL file from the archive that is not currently needed, that
+ * WAL data is written to a temporary file.
*/
static ArchivedWALEntry *
get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
{
ArchivedWALEntry *entry = NULL;
char fname[MAXFNAMELEN];
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -411,11 +433,32 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
continue;
}
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- segno, entry->segno);
- exit(1);
+ /*
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but it will be required later. Write it to a temporary
+ * location so that it can be retrieved when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -423,6 +466,150 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
pg_fatal("could not find file \"%s\" in archive", fname);
}
+/*
+ * Set up the directory used to temporarily store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to store extracted WAL files
+ * temporarily.
+ */
+ TmpWalSegDir = getenv("TMPDIR") ?
+ pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir);
+ canonicalize_path(TmpWalSegDir);
+}
+
+/*
+ * Remove the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+ ArchivedWAL_iterator it;
+ ArchivedWALEntry *entry;
+
+ ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+ while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+ {
+ if (entry->tmpseg_exists)
+ {
+ remove_tmp_walseg(entry->segno, false);
+ entry->tmpseg_exists = false;
+ }
+ }
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ snprintf(fpath, MAXPGPATH, "%s/%s.%08X%08X",
+ TmpWalSegDir,
+ TEMP_FILE_PREFIX,
+ (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+ (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+ return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry == NULL)
+ return false;
+
+ return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+ FILE *file;
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(segno);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("temporarily exporting file \"%s\"", fpath);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_walseg_path(segno));
+ }
+}
+
+/*
+ * Remove temporary file
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+ char *fpath = get_tmp_walseg_path(segno);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ pfree(fpath);
+
+ /* Update entry if requested */
+ if (update_entry)
+ {
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+ Assert(entry != NULL);
+ entry->tmpseg_exists = false;
+ }
+}
+
/*
* Create an astreamer that can read WAL from tar file.
*/
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 7425d386d0c..a472ef59575 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -466,11 +466,50 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
{
XLogDumpPrivate *private = state->private_data;
int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ remove_tmp_walseg(nextSegNo, true);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ if (tmp_walseg_exists(nextSegNo))
+ {
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(nextSegNo);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54758c3548a..5c1fb1e080a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
XLogRecPtr targetPagePtr,
Size count, char *readBuff);
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
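The naming of the temporary files in the patch above can be worked through with a standalone example. This is a sketch rather than the patch's code: segments_per_xlogid(), choose_tmp_dir() and format_tmp_walseg_path() are hypothetical names. The format string, the TEMP_FILE_PREFIX value and the TMPDIR fallback are taken from the patch, and segments_per_xlogid() follows the definition of XLogSegmentsPerXLogId() (0x100000000 divided by the segment size).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TEMP_FILE_PREFIX "waldump.tmp"

/* Number of WAL segments per 4 GB "xlog id" for a given segment size. */
static uint64_t
segments_per_xlogid(uint64_t wal_segsz)
{
	assert(wal_segsz > 0);
	return UINT64_C(0x100000000) / wal_segsz;
}

/* Hypothetical helper: prefer TMPDIR, fall back to the WAL directory. */
static const char *
choose_tmp_dir(const char *waldir)
{
	const char *tmpdir = getenv("TMPDIR");

	return tmpdir != NULL ? tmpdir : waldir;
}

/*
 * Hypothetical helper: build "<dir>/waldump.tmp.<hi><lo>" into 'buf', where
 * <hi> and <lo> are the two 8-digit hex halves of the segment position.
 * Returns the snprintf() result.
 */
static int
format_tmp_walseg_path(char *buf, size_t buflen, const char *dir,
					   uint64_t segno, uint64_t wal_segsz)
{
	uint64_t	per_id = segments_per_xlogid(wal_segsz);

	assert(per_id > 0);
	return snprintf(buf, buflen, "%s/%s.%08X%08X",
					dir, TEMP_FILE_PREFIX,
					(unsigned int) (segno / per_id),
					(unsigned int) (segno % per_id));
}
```

For example, with 16 MB segments there are 256 segments per xlog id, so segment number 0x105 in directory /tmp gives /tmp/waldump.tmp.0000000100000005. Unlike a regular WAL file name, this name carries no timeline ID.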
[application/octet-stream] v6-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v6-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From d89200d65e3a9cc2e8737ebea850e63c41a6e0e6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v6 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8d5befa947f..a502e795b2e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/octet-stream] v6-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v6-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From c9de01ace1794801aa189aebff923a68e93af893 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v6 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you'll be able to specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index a502e795b2e..9fcd6be004e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/octet-stream] v6-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.9K, 9-v6-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From b823bc3086f80e869057ccef0c6e8a3f7c664a9a Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v6 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump can decode WAL from tar archives, use that
functionality to lift the previous restriction on WAL parsing for
tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 9fcd6be004e..6915fc7f28e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-19 08:20 Jakub Wartak <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Jakub Wartak @ 2025-11-19 08:20 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, Nov 17, 2025 at 5:51 AM Amul Sul <[email protected]> wrote:
>
> On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <[email protected]> wrote:
> >
> > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> > > [....]
> > Kindly have a look at the attached version. Thank you !
> >
>
> Attached is the rebased version against the latest master head (e76defbcf09).
Hi Amul, thanks for working on this. I haven't looked at the
source code deeply (I trust Robert's eyes much more than mine on this
one); I just skimmed it a little:
1. As stated earlier, get_tmp_walseg_path() is still vulnerable (it
uses a predictable path in $TMPDIR that an attacker could exploit).
2. On the usability front:
a. If you run `pg_waldump --path pg_wal.tar -s 0/31000000`, it dumps
a lot of WAL records and then prints a final error:
pg_waldump: error: could not find file "000000010000000000000034" in archive
However, with `pg_waldump --path pg_wal.tar -s 0/31000000
--stats=record` (without '-e') it simply bails out without
printing any stats, with this error:
pg_waldump: error: could not find file "000000010000000000000034" in archive
IMHO, it could print stats if it was able to read at least one WAL record.
3. The most critical issue for me was the initial lack of error
pass-through from pg_waldump (when used with WAL in a tar archive) to
pg_verifybackup. Now it works fine, so thanks for this:
a. pg_waldump detects the requested WAL files that are missing and
returns a proper exit code (good)
$ /usr/pgsql19/bin/pg_waldump --path pg_wal.tar -s 0/31005F70 -e 0/343D2650 -q
pg_waldump: error: could not find file "000000010000000000000034" in archive
$ echo $?
1
$
b. pg_verifybackup now also complains properly about WAL missing from the tar archive
$ tar --delete -f pg_wal.tar 000000010000000000000032 # simulate loss of file
$ tar -tf pg_wal.tar
000000010000000000000031
archive_status/000000010000000000000031.done
archive_status/000000010000000000000032.done
000000010000000000000033
$ grep Start-LSN backup_manifest
{ "Timeline": 1, "Start-LSN": "0/31005F70", "End-LSN": "0/333D2650" }
$ /usr/pgsql19/bin/pg_verifybackup -P /tmp/basebackup/
791372/791372 kB (100%) verified
pg_waldump: error: could not find file "000000010000000000000032" in archive
pg_verifybackup: error: WAL parsing failed for timeline 1
$ echo $?
1
$
-J.
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-21 11:44 Amul Sul <[email protected]>
parent: Jakub Wartak <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-11-21 11:44 UTC (permalink / raw)
To: Jakub Wartak <[email protected]>; +Cc: Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Nov 19, 2025 at 1:50 PM Jakub Wartak
<[email protected]> wrote:
>
> On Mon, Nov 17, 2025 at 5:51 AM Amul Sul <[email protected]> wrote:
> >
> > On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <[email protected]> wrote:
> > >
> > > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
> > > >
> > > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> > > > [....]
> > > Kindly have a look at the attached version. Thank you !
> > >
> >
> > Attached is the rebased version against the latest master head (e76defbcf09).
>
> Hi Amul, thanks for working on this. I haven't really looked at the
> source code deeply (I trust Robert eyes much more than mine on this
> one), just skimmed a little bit:
>
> 1. As stated earlier, get_tmp_walseg_path() is still vulnerable (it
> uses predictable path that could be used by attacker in $TMPDIR)
>
Yeah, I haven't done anything about this yet, since I am unsure what
should be done and what risks are involved. I plan to ask for
Robert's opinion on this.
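One common way to avoid a predictable name, sketched here purely as an illustration, is to let mkstemp() pick the file name, so that the file is created exclusively and with owner-only permissions. The helper name create_tmp_walseg() and its parameters are hypothetical and do not appear in the patch; how get_tmp_walseg_path() should actually change is still open.

```c
#define _POSIX_C_SOURCE 200809L	/* for mkstemp() under strict C modes */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): create a temporary file for
 * a WAL segment using an unpredictable name.
 *
 * mkstemp() replaces the trailing "XXXXXX" in place and opens the file with
 * O_CREAT | O_EXCL and mode 0600, so a file or symlink planted in advance
 * under a guessed name cannot be picked up.
 *
 * Returns an open file descriptor, or -1 on failure.  On success, "path"
 * holds the name that was actually created.
 */
static int
create_tmp_walseg(const char *tmpdir, const char *segname,
				  char *path, size_t pathlen)
{
	int			n;

	assert(tmpdir != NULL && segname != NULL && path != NULL);

	n = snprintf(path, pathlen, "%s/%s.XXXXXX", tmpdir, segname);
	if (n < 0 || (size_t) n >= pathlen)
		return -1;				/* the template would have been truncated */

	return mkstemp(path);
}
```

The caller is expected to close the descriptor and unlink the path when the segment is no longer needed.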
> 2. On the usability front:
>
> a. If you do `pg_waldump --path pg_wal.tar -s 0/31000000` it will dump
> a lot of WAL records and then print final:
> pg_waldump: error: could not find file "000000010000000000000034" in archive
>
> However, with `pg_waldump --path pg_wal.tar -s 0/31000000
> --stats=record` (not passing '-e') it will simply bailout without
> printing stats and with error:
> pg_waldump: error: could not find file "000000010000000000000034" in archive
>
> IMHO, it could print stats if it was capable of getting at least 1 WAL record.
>
The current pg_waldump behaves similarly when the --path option is
used with a WAL directory and a starting LSN, e.g.:
$ pg_waldump -s 0/04FE36E0 --path=/tmp/backup/tmp/ --stats=record
pg_waldump: first record is after 0/04FE36E0, at 0/04FE3F90, skipping over 2224 bytes
pg_waldump: error: could not find file "000000010000000000000009": No such file or directory
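For reference, the behaviour suggested above (report statistics whenever at least one record was decoded, and still exit with an error) can be sketched independently of pg_waldump's real code. Everything in this sketch is hypothetical: DecodeStats, next_record_fn, decode_all() and the SampleSource stand-in are made-up names, and this is not how pg_waldump is structured internally.

```c
#include <stdio.h>

/* Hypothetical counters; pg_waldump keeps far more detailed statistics. */
typedef struct DecodeStats
{
	unsigned long records;
	unsigned long bytes;
} DecodeStats;

/*
 * Hypothetical record source: returns the record length (> 0), 0 at a clean
 * end of input, or -1 on error (e.g. the next segment is missing).
 */
typedef int (*next_record_fn) (void *state);

/*
 * Consume records until the source stops.  Statistics are printed whenever
 * at least one record was read, even if the run ended with an error; the
 * return value still reports that error (1) so callers can fail.
 */
static int
decode_all(next_record_fn next, void *state, DecodeStats *stats, FILE *out)
{
	int			len;

	stats->records = 0;
	stats->bytes = 0;

	while ((len = next(state)) > 0)
	{
		stats->records++;
		stats->bytes += (unsigned long) len;
	}

	if (stats->records > 0)
		fprintf(out, "records: %lu, bytes: %lu\n",
				stats->records, stats->bytes);

	return (len < 0) ? 1 : 0;
}

/* Stand-in source: hands out 64-byte records, then optionally fails. */
typedef struct SampleSource
{
	int			remaining;
	int			fail_at_end;
} SampleSource;

static int
sample_next(void *state)
{
	SampleSource *src = state;

	if (src->remaining > 0)
	{
		src->remaining--;
		return 64;
	}
	return src->fail_at_end ? -1 : 0;
}
```

With this shape the exit status is unchanged, so scripts that rely on it keep working; only the amount of output before the error differs.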
> 3. The most critical issue for me was the initial lack of error
> pass-through from pg_waldump (when used with WALs in tar) to the
> pg_verifybackup. Now it works fine, so thanks for this:
>
Thanks, that was exactly the intention -- to complete pg_verifybackup
for tar-formatted backup verification.
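The pass-through matters because the parent tool can only act on what the child reports. Below is a minimal sketch of that pattern, assuming the parent starts the child through system(); run_and_check() is a hypothetical name and this is not pg_verifybackup's code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: run a command through the shell and reduce the
 * outcome to 0 (success) or 1 (failure).  A non-zero exit code, death by
 * signal, or a failure to start the child all count as failure.
 */
static int
run_and_check(const char *cmd)
{
	int			status;

	/* Flush our own buffered output so it is not interleaved oddly. */
	fflush(NULL);

	status = system(cmd);
	if (status == -1)
		return 1;				/* the child could not be started */

	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;

	return 1;					/* non-zero exit or abnormal termination */
}
```

For example, run_and_check("false") yields 1, which the caller can turn into its own error message and exit code, as in the pg_verifybackup output quoted above.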
Regards,
Amul
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-21 12:16 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Amul Sul @ 2025-11-21 12:16 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Mon, Nov 17, 2025 at 10:20 AM Amul Sul <[email protected]> wrote:
>
> On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <[email protected]> wrote:
> >
> > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
> > >
> > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> > > [....]
> > Kindly have a look at the attached version. Thank you !
> >
>
Attached is the updated version. I have fixed an assertion failure
that can occasionally occur with a partial WAL page read.
Regards,
Amul
Attachments:
[application/octet-stream] v7-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.3K, 2-v7-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 280432691a9c98b1006d85769ee8e5a5869e55c8 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v7 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index c6d6ba79e44..5846ee24f46 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/octet-stream] v7-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v7-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 11840029bc52e927858da7be1f91ece9b680d486 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v7 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5846ee24f46..0dc28ea360c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/octet-stream] v7-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v7-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 62a143b267d60d3c98b5063c3874c4c22135d1c0 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v7 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/octet-stream] v7-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (36.6K, 5-v7-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 8a2a15082f43f2cb4c30913f89f749b606ffee13 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v7 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
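
For readers following along, the ordering requirement above can be modeled
roughly as below. This is only an illustrative sketch and not code from the
patch: the helper name segments_in_sequential_order and its array-based
interface are hypothetical, and it assumes the member segment numbers have
already been extracted in archive order. Segments outside the requested
range are skipped, which matches how the patch ignores irrelevant WAL files.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): returns true if the WAL
 * segments that fall inside [start_seg, end_seg] appear in the archive in
 * consecutive ascending order, beginning with start_seg.
 *
 * segs[] holds the segment numbers of the archive members in the order in
 * which they are stored in the tar file.
 */
static bool
segments_in_sequential_order(const uint64_t *segs, size_t nsegs,
							 uint64_t start_seg, uint64_t end_seg)
{
	uint64_t	expected = start_seg;

	for (size_t i = 0; i < nsegs; i++)
	{
		/* Segments outside the requested range are simply ignored. */
		if (segs[i] < start_seg || segs[i] > end_seg)
			continue;

		/* An in-range segment must be exactly the one expected next. */
		if (segs[i] != expected)
			return false;

		expected++;
	}

	return true;
}
```

Note that this only checks ordering; a missing segment is a separate error
in the patch ("could not find file ... in archive").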
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 584 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 215 +++++++---
src/bin/pg_waldump/pg_waldump.h | 36 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 865 insertions(+), 76 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files,
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..05ac5763a57 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ archive_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..61d6782f9b7
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,584 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+ uint32 status; /* hash status */
+ XLogSegNo segno; /* hash key: WAL segment number */
+ TimeLineID timeline; /* timeline of this WAL file */
+
+ StringInfoData buf;
+ bool tmpseg_exists; /* spill file exists? */
+
+ int total_read; /* total bytes read from this WAL segment,
+ * including buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALEntry
+#define SH_KEY_TYPE XLogSegNo
+#define SH_KEY segno
+#define SH_HASH_KEY(tb, key) murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+ XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ XLogSegNo *curSegNo,
+ TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALEntry *entry = NULL;
+ XLogLongPageHeader longhdr;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /* Hash table storing WAL entries read from the archive */
+ ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in \"%s\" archive",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_wal;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+ XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ ArchivedWALEntry *entry;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ entry = get_archive_wal_entry(segno, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf.data;
+ int len = entry->buf.len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+ WalSegSz, endPtr);
+ startPtr = endPtr - len;
+
+ /*
+ * pg_waldump may request to re-read the currently active page, but
+ * never a page older than the current one. Therefore, any fully
+ * consumed WAL data preceding the current page can be safely
+ * discarded.
+ */
+ if (recptr >= endPtr)
+ {
+ /* Discard the buffered data */
+ resetStringInfo(&entry->buf);
+ len = 0;
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring it remains available for re-reading if
+ * requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(&entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the buffer, so skip
+ * bytes until reaching the desired offset of the target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /* Ensure the reading page is in the buffer */
+ Assert(recptr >= startPtr && recptr < endPtr);
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /*
+ * Fetch more data; raise an error if it's not the current segment
+ * being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_wal != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, privateInfo->timeline, entry->segno,
+ WalSegSz);
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either successfully read the WAL page or
+ * raise an error, but the caller expects this value to be returned. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * reads further into the archive until the streamer reaches the requested
+ * segment, and reports an error if the segment cannot be found.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALEntry *entry = NULL;
+ char fname[MAXFNAMELEN];
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry != NULL)
+ return entry;
+
+ /* The needed WAL segment has not been read yet; keep reading the archive */
+ while (1)
+ {
+ entry = privateInfo->cur_wal;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf.len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Either we are here for the first time, or the archive streamer is
+ * reading a non-WAL file or an irrelevant WAL file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (entry->segno == segno)
+ return entry;
+
+ /*
+ * Ignore the entry if its timeline differs or its segment lies outside
+ * the requested range.
+ */
+ if (privateInfo->timeline != entry->timeline ||
+ privateInfo->startSegNo > entry->segno ||
+ privateInfo->endSegNo < entry->segno)
+ {
+ privateInfo->cur_wal = NULL;
+ continue;
+ }
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ segno, entry->segno);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+ ArchivedWALEntry *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member,
+ &segno, &timeline))
+ break;
+
+ entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+ member->pathname);
+ break;
+ }
+
+ initStringInfo(&entry->buf);
+ entry->timeline = timeline;
+ entry->total_read = 0;
+
+ privateInfo->cur_wal = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_wal)
+ {
+ appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+ privateInfo->cur_wal->total_read += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_wal = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /*
+ * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+ * PaxHeaders when creating an archive. These special entries store
+ * extended metadata for the file entry immediately following them and
+ * share the exact same name as that file, so they must be skipped.
+ */
+ if (strstr(member->pathname, "PaxHeaders."))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ *curSegNo = segNo;
+ *curTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..da00746587c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'archive_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 0dc28ea360c..02ad141e44a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -177,7 +177,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -436,6 +436,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +811,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +844,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_archive = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +979,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1143,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_archive = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
pg_log_error("could not open directory \"%s\": %m", waldir);
goto bad_argument;
@@ -1123,6 +1174,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_archive)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1133,69 +1195,77 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ is_archive = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_archive)
+ waldir = identify_target_directory(walpath, NULL);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1207,12 +1277,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_archive)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located
+ * in the current working directory from which pg_waldump was run.
+ */
+ waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1415,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_archive)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..54758c3548a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
extern int WalSegSz;
+/* Forward declaration */
+struct ArchivedWALEntry;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALEntry *cur_wal;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index c751c25a04d..c38a1c3808b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3465,6 +3467,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
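
The startSegNo/endSegNo fields added to XLogDumpPrivate in the patch above cache the segment range so that each archive member can be filtered cheaply. A minimal standalone sketch of that idea follows. It assumes the same arithmetic as XLByteToSeg (segment number = LSN / segment size); the Sketch* types and sketch_* functions are hypothetical names for illustration, not part of the patch, and treating the end segment as inclusive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XLogRecPtr and XLogSegNo. */
typedef uint64_t SketchLsn;
typedef uint64_t SketchSegNo;

/* Cached segment range, analogous to startSegNo/endSegNo in the patch. */
typedef struct SketchSegFilter
{
	SketchSegNo start_segno;
	SketchSegNo end_segno;
} SketchSegFilter;

/* Same arithmetic as XLByteToSeg: integer division by the segment size. */
static SketchSegNo
sketch_lsn_to_segno(SketchLsn lsn, uint64_t seg_size)
{
	assert(seg_size > 0);
	return lsn / seg_size;
}

/* Compute the range once, instead of once per archive member. */
static SketchSegFilter
sketch_make_filter(SketchLsn startptr, SketchLsn endptr, uint64_t seg_size)
{
	SketchSegFilter f;

	assert(startptr <= endptr);
	f.start_segno = sketch_lsn_to_segno(startptr, seg_size);
	f.end_segno = sketch_lsn_to_segno(endptr, seg_size);
	return f;
}

/* Assumption: a member is relevant if its segment lies in [start, end]. */
static bool
sketch_segment_is_relevant(const SketchSegFilter *f, SketchSegNo segno)
{
	return segno >= f->start_segno && segno <= f->end_segno;
}
```

With a 16MB segment size, a start LSN of 0/1000028 and an end LSN of 0/3000000 map to segments 1 and 3, so an archive member for segment 0 or 4 could be skipped without buffering its contents.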
[application/octet-stream] v7-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (11.2K, 6-v7-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From b09abd6a9fc5493a71285a36417bcc18ac017985 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v7 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL files
in the archive were not in sequential order. With this patch, decoding
continues: any WAL file that is out of order is written to a
temporary location, from which it is read later. Once a temporary
file has been read, it is removed.
---
src/bin/pg_waldump/archive_waldump.c | 207 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 41 +++++-
src/bin/pg_waldump/pg_waldump.h | 4 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
4 files changed, 243 insertions(+), 12 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 61d6782f9b7..6f87c1ab4a4 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,11 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+#define TEMP_FILE_PREFIX "waldump.tmp"
+
+/* Temporary exported WAL file directory */
+static char *TmpWalSegDir = NULL;
+
/* Structure for storing the WAL segment data from the archive */
typedef struct ArchivedWALEntry
{
@@ -65,6 +71,11 @@ typedef struct astreamer_waldump
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +131,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
}
/*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -194,6 +206,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
*/
XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove any remaining temporary files upon completion.
+ */
+ setup_tmpseg_dir(waldir);
+ atexit(cleanup_tmpseg_dir_atexit);
}
/*
@@ -369,13 +388,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/*
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL segment from the archive that is not currently needed,
+ * that WAL data is written to a temporary file.
*/
static ArchivedWALEntry *
get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
{
ArchivedWALEntry *entry = NULL;
char fname[MAXFNAMELEN];
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -418,11 +440,32 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
continue;
}
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- segno, entry->segno);
- exit(1);
+ /*
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but it will be needed later. Its data should be written
+ * to a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -430,6 +473,150 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
pg_fatal("could not find file \"%s\" in archive", fname);
}
+/*
+ * Set up a temporary directory in which to store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold extracted WAL files
+ * temporarily.
+ */
+ TmpWalSegDir = getenv("TMPDIR") ?
+ pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir);
+ canonicalize_path(TmpWalSegDir);
+}
+
+/*
+ * Removes the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+ ArchivedWAL_iterator it;
+ ArchivedWALEntry *entry;
+
+ ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+ while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+ {
+ if (entry->tmpseg_exists)
+ {
+ remove_tmp_walseg(entry->segno, false);
+ entry->tmpseg_exists = false;
+ }
+ }
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ snprintf(fpath, MAXPGPATH, "%s/%s.%08X%08X",
+ TmpWalSegDir,
+ TEMP_FILE_PREFIX,
+ (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+ (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+ return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry == NULL)
+ return false;
+
+ return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+ FILE *file;
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(segno);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("temporarily exporting file \"%s\"", fpath);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_walseg_path(segno));
+ }
+}
+
+/*
+ * Remove the temporary file for the given WAL segment.
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+ char *fpath = get_tmp_walseg_path(segno);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ pfree(fpath);
+
+ /* Update entry if requested */
+ if (update_entry)
+ {
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+ Assert(entry != NULL);
+ entry->tmpseg_exists = false;
+ }
+}
+
/*
* Create an astreamer that can read WAL from tar file.
*/
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 02ad141e44a..4c5974a6ae1 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -466,11 +466,50 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
{
XLogDumpPrivate *private = state->private_data;
int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ remove_tmp_walseg(nextSegNo, true);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ if (tmp_walseg_exists(nextSegNo))
+ {
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(nextSegNo);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54758c3548a..5c1fb1e080a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
XLogRecPtr targetPagePtr,
Size count, char *readBuff);
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
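
The naming scheme used by get_tmp_walseg_path() in the patch above can be tried in isolation with the following sketch. It reuses the "%s/%s.%08X%08X" format and the "waldump.tmp" prefix from the patch, and assumes that XLogSegmentsPerXLogId is 0x100000000 divided by the segment size; the function name sketch_format_tmp_walseg_path and its parameters are hypothetical, and the caller supplies the buffer instead of using palloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<tmpdir>/waldump.tmp.<hi><lo>" for a WAL segment number.
 * Returns what snprintf returns: the length of the full path, which is
 * >= dstlen if the output was truncated.
 */
static int
sketch_format_tmp_walseg_path(char *dst, size_t dstlen, const char *tmpdir,
							  uint64_t segno, uint64_t seg_size)
{
	uint64_t	segs_per_id;

	assert(dst != NULL && tmpdir != NULL && strlen(tmpdir) > 0);
	assert(seg_size > 0 && seg_size <= UINT64_C(0x100000000));

	/* Number of segments covered by one 4GB "xlog id". */
	segs_per_id = UINT64_C(0x100000000) / seg_size;

	return snprintf(dst, dstlen, "%s/%s.%08X%08X",
					tmpdir, "waldump.tmp",
					(unsigned int) (segno / segs_per_id),
					(unsigned int) (segno % segs_per_id));
}
```

For example, with a 16MB segment size there are 256 segments per xlog id, so segment number 257 under /tmp yields /tmp/waldump.tmp.0000000100000001. Note that the name carries no timeline component, unlike a regular WAL file name.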
[application/octet-stream] v7-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v7-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From 7e2523b0891c9911f64c35f736db24e3310e2090 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v7 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8d5befa947f..a502e795b2e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
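
The defaulting rule that the patch above moves later in main() can be sketched as a small helper. This is an illustration only: sketch_resolve_wal_path is a hypothetical function that does not exist in pg_verifybackup, and it uses malloc where the real code uses psprintf/pstrdup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated WAL path: the user's choice if one was given,
 * otherwise "<backup_dir>/pg_wal". The caller must free() the result.
 * Returns NULL on allocation failure.
 */
static char *
sketch_resolve_wal_path(const char *user_path, const char *backup_dir)
{
	size_t		len;
	char	   *result;

	assert(backup_dir != NULL);

	if (user_path != NULL)
	{
		result = malloc(strlen(user_path) + 1);
		if (result != NULL)
			strcpy(result, user_path);
		return result;
	}

	/* By default, look for the WAL in the backup directory, too. */
	len = strlen(backup_dir) + strlen("/pg_wal") + 1;
	result = malloc(len);
	if (result != NULL)
		snprintf(result, len, "%s/pg_wal", backup_dir);
	return result;
}
```

Calling such a helper only at the point where WAL is parsed, as the patch does with the inline code, leaves room for the default to depend on information that is known only after the manifest and backup format have been examined.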
[application/octet-stream] v7-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v7-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From 705fb42e96663b692c5b51650646cf0361b8083e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v7 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means a tar archive path can be specified instead of a
traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL files. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index a502e795b2e..9fcd6be004e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/octet-stream] v7-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.9K, 9-v7-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From b0eae9adecb56bf634a9193910b702712f6b72dc Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v7 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding WAL from tar archives, use that
functionality to remove the previous restriction on WAL parsing for
tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 9fcd6be004e..6915fc7f28e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("wal archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-25 06:37 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2025-11-25 06:37 UTC (permalink / raw)
To: Jakub Wartak <[email protected]>; +Cc: Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Nov 21, 2025 at 5:14 PM Amul Sul <[email protected]> wrote:
>
> On Wed, Nov 19, 2025 at 1:50 PM Jakub Wartak
> <[email protected]> wrote:
> >
> > On Mon, Nov 17, 2025 at 5:51 AM Amul Sul <[email protected]> wrote:
> > >
> > > On Thu, Nov 6, 2025 at 2:33 PM Amul Sul <[email protected]> wrote:
> > > >
> > > > On Mon, Oct 20, 2025 at 8:05 PM Robert Haas <[email protected]> wrote:
> > > > >
> > > > > On Thu, Oct 16, 2025 at 7:49 AM Amul Sul <[email protected]> wrote:
> > > > > [....]
> > > > Kindly have a look at the attached version. Thank you !
> > > >
> > >
> > > Attached is the rebased version against the latest master head (e76defbcf09).
> >
> > Hi Amul, thanks for working on this. I haven't really looked at the
> > source code deeply (I trust Robert's eyes much more than mine on this
> > one), just skimmed a little bit:
> >
> > 1. As stated earlier, get_tmp_walseg_path() is still vulnerable (it
> > uses a predictable path that could be exploited by an attacker in $TMPDIR)
> >
>
> Yeah, I haven't done anything regarding this since I am unsure of what
> should be done and what the risks involved are. I am thinking of
> taking Robert's opinion on this.
>
Per offline discussion with Robert and Jakub, I have updated the patch
to use mkdtemp() as suggested, which is already available in the tree
for similar purposes. Thanks!
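For readers unfamiliar with the approach, here is a minimal sketch of
creating a private temporary directory with mkdtemp(), which avoids the
predictable-path problem raised above. This is not the code from the
patch: the function names make_private_tmpdir and remove_private_tmpdir
and the "waldump_tmp_" prefix are hypothetical, chosen only for
illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* expose mkdtemp() under strict modes */
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a private temporary directory below "base" (for example the
 * value of $TMPDIR).  On success the generated path is left in "out"
 * and 0 is returned; on failure -1 is returned.
 *
 * The "waldump_tmp_" prefix is an assumption made for this sketch.
 */
static int
make_private_tmpdir(const char *base, char *out, size_t outlen)
{
	int			n;

	assert(base != NULL && out != NULL);

	n = snprintf(out, outlen, "%s/waldump_tmp_XXXXXX", base);
	if (n < 0 || (size_t) n >= outlen)
		return -1;				/* template would have been truncated */

	/*
	 * mkdtemp() replaces the trailing XXXXXX with a unique suffix and
	 * creates the directory with mode 0700, so another user cannot
	 * pre-create or guess the path.
	 */
	if (mkdtemp(out) == NULL)
		return -1;

	return 0;
}

/* Remove the (empty) directory again; returns 0 on success. */
static int
remove_private_tmpdir(const char *path)
{
	return rmdir(path);
}
```

A caller would typically pass the value of $TMPDIR, falling back to
"/tmp", create any spill files inside the returned directory, and
remove both the files and the directory before exiting.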
Regards,
Amul
Attachments:
[application/x-patch] v8-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.3K, 2-v8-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 1713593f78bb7799ef278424b0efa56acef8bfba Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v8 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 11 ++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index c6d6ba79e44..5846ee24f46 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -39,19 +40,11 @@
static const char *progname;
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v8-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.3K, 3-v8-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From c5ce68512a54b7f586b458374f689df406f39ad6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v8 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
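As a rough, standalone illustration of the read-size logic described
above, the sketch below mirrors the three cases: a full page fits
before the end position, only the requested bytes fit, or the end
position has been reached. It uses plain C types instead of the
PostgreSQL ones; SKETCH_BLCKSZ, ReadLimit, and sketch_required_read_len
are stand-in names assumed for this example, and an end position of 0
is taken to mean "no end position given".

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192		/* stand-in for XLOG_BLCKSZ */

typedef struct ReadLimit
{
	uint64_t	endptr;			/* 0 means no end position was given */
	bool		endptr_reached; /* set once the end position is hit */
} ReadLimit;

/*
 * Return the number of bytes to read for the page starting at
 * page_start, or -1 if even req_len bytes would cross the end position.
 */
static int
sketch_required_read_len(ReadLimit *limit, uint64_t page_start, int req_len)
{
	/* No end position: always read a whole page. */
	if (limit->endptr == 0)
		return SKETCH_BLCKSZ;

	/* The whole page lies before the end position. */
	if (page_start + SKETCH_BLCKSZ <= limit->endptr)
		return SKETCH_BLCKSZ;

	/* Only part of the page is wanted; read up to the end position. */
	if (page_start + (uint64_t) req_len <= limit->endptr)
		return (int) (limit->endptr - page_start);

	/* The request would cross the end position. */
	limit->endptr_reached = true;
	return -1;
}
```

With an end position of 20000, a request for the page at offset 16384
with req_len 100 yields 3616 bytes (20000 - 16384), while the same page
with req_len 5000 yields -1 and sets endptr_reached.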
---
src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5846ee24f46..0dc28ea360c 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v8-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.5K, 4-v8-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From e0a2296c0b79bf3e2c62c712829c9aaa0b516334 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v8 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v8-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (36.8K, 5-v8-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 7686e875c9da4cedafb884dd4cd153d35c96d540 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v8 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
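To make the ordering requirement concrete, here is a small standalone
sketch that maps 24-character WAL file names to segment numbers and
checks that a list of names is in order. It is not taken from the
patch. It assumes that "sequential" means each segment number is
exactly one greater than the previous one, and it ignores timeline
changes; the patch may apply a different rule. The function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a WAL file name of the form TTTTTTTTXXXXXXXXYYYYYYYY (24
 * uppercase hex digits) into a timeline ID and a segment number, given
 * the WAL segment size in bytes.  Returns false for any other name.
 */
static bool
sketch_parse_wal_name(const char *fname, uint64_t segsz,
					  uint32_t *tli, uint64_t *segno)
{
	unsigned int t;
	unsigned int log;
	unsigned int seg;

	if (strlen(fname) != 24 ||
		strspn(fname, "0123456789ABCDEF") != 24)
		return false;

	if (sscanf(fname, "%08X%08X%08X", &t, &log, &seg) != 3)
		return false;

	*tli = t;
	/* Each "log" unit covers 4 GB, i.e. 0x100000000 / segsz segments. */
	*segno = (uint64_t) log * (UINT64_C(0x100000000) / segsz) + seg;
	return true;
}

/*
 * Return true if every name parses as a WAL file name and each segment
 * number is exactly one greater than the previous one.
 */
static bool
sketch_segments_sequential(const char **names, int n, uint64_t segsz)
{
	uint64_t	prev = 0;

	for (int i = 0; i < n; i++)
	{
		uint32_t	tli;
		uint64_t	segno;

		if (!sketch_parse_wal_name(names[i], segsz, &tli, &segno))
			return false;
		if (i > 0 && segno != prev + 1)
			return false;
		prev = segno;
	}
	return true;
}
```

With 16 MB segments there are 256 segments per 4 GB unit, so
0000000100000001000000FF is segment 511 and 000000010000000200000000 is
segment 512; those two names in that order pass the check, while the
reverse order does not.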
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 589 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 215 +++++++---
src/bin/pg_waldump/pg_waldump.h | 36 +-
src/bin/pg_waldump/t/001_basic.pl | 84 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 870 insertions(+), 76 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..05ac5763a57 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ archive_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..f991633e58c
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,589 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+ uint32 status; /* hash status */
+ XLogSegNo segno; /* hash key: WAL segment number */
+ TimeLineID timeline; /* timeline of this wal file */
+
+ StringInfoData buf;
+ bool tmpseg_exists; /* spill file exists? */
+
+ int total_read; /* total read of this WAL segment, including
+ * buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALEntry
+#define SH_KEY_TYPE XLogSegNo
+#define SH_KEY segno
+#define SH_HASH_KEY(tb, key) murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+ XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ XLogSegNo *curSegNo,
+ TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Now, check the compression type of the tar */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALEntry *entry = NULL;
+ XLogLongPageHeader longhdr;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /* Hash table storing WAL entries read from the archive */
+ ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in \"%s\" archive",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_wal;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+
+ if (XLogRecPtrIsInvalid(privateInfo->endptr))
+ privateInfo->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ ArchivedWALEntry *entry;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ entry = get_archive_wal_entry(segno, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf.data;
+ int len = entry->buf.len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+ WalSegSz, endPtr);
+ startPtr = endPtr - len;
+
+ /*
+ * pg_waldump may request to re-read the currently active page, but
+ * never a page older than the current one. Therefore, any fully
+ * consumed WAL data preceding the current page can be safely
+ * discarded.
+ */
+ if (recptr >= endPtr)
+ {
+ /* Discard the buffered data */
+ resetStringInfo(&entry->buf);
+ len = 0;
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring it remains available for re-reading if
+ * requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(&entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the buffer, so skip
+ * bytes until reaching the desired offset of the target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /* Ensure the reading page is in the buffer */
+ Assert(recptr >= startPtr && recptr < endPtr);
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /*
+ * Fetch more data; raise an error if it's not the current segment
+ * being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_wal != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, privateInfo->timeline, entry->segno,
+ WalSegSz);
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either successfully read the WAL page or
+ * raise an error, but the caller expects this value to be returned. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file and retrieve the entry if
+ * it is not already in hash table.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALEntry *entry = NULL;
+ char fname[MAXFNAMELEN];
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry != NULL)
+ return entry;
+
+ /* The needed WAL segment has not been read from the archive yet; keep reading */
+ while (1)
+ {
+ entry = privateInfo->cur_wal;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf.len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Either we are here for the first time, or the archive streamer is
+ * reading a non-WAL file or an irrelevant WAL file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (entry->segno == segno)
+ return entry;
+
+ /*
+ * Ignore the entry if its timeline differs or its segment falls outside
+ * the requested range.
+ */
+ if (privateInfo->timeline != entry->timeline ||
+ privateInfo->startSegNo > entry->segno ||
+ privateInfo->endSegNo < entry->segno)
+ {
+ privateInfo->cur_wal = NULL;
+ continue;
+ }
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ segno, entry->segno);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+ ArchivedWALEntry *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member,
+ &segno, &timeline))
+ break;
+
+ entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+ member->pathname);
+ break;
+ }
+
+ initStringInfo(&entry->buf);
+ entry->timeline = timeline;
+ entry->total_read = 0;
+
+ privateInfo->cur_wal = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_wal)
+ {
+ appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+ privateInfo->cur_wal->total_read += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_wal = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The member name may include a leading directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /*
+ * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+ * PaxHeaders when creating an archive. These are special entries that
+ * store extended metadata for the file entry immediately following them,
+ * and they share the exact same name as that file, so skip them.
+ */
+ if (strstr(member->pathname, "PaxHeaders."))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ *curSegNo = segNo;
+ *curTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..da00746587c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
pg_waldump_sources = files(
'compat.c',
'pg_waldump.c',
+ 'archive_waldump.c',
'rmgrdesc.c',
)
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 0dc28ea360c..02ad141e44a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -177,7 +177,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -436,6 +436,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+
+ if (private->endptr_reached)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -773,8 +811,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -806,7 +844,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_archive = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +979,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1102,10 +1143,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_archive = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1123,6 +1174,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_archive)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1133,69 +1195,77 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ is_archive = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_archive)
+ waldir = identify_target_directory(walpath, NULL);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1207,12 +1277,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_archive)
+ {
+ /*
+ * A NULL WAL directory means the archive file is located in
+ * pg_waldump's current working directory.
+ */
+ waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1321,6 +1415,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_archive)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 9e62b64ead5..54758c3548a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
extern int WalSegSz;
+/* Forward declaration */
+struct ArchivedWALEntry;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALEntry *cur_wal;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
} XLogDumpPrivate;
-#endif /* end of PG_WALDUMP_H */
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+
+#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..443126a9ce6 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenario = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenario)
@@ -267,6 +310,19 @@ for my $scenario (@scenario)
SKIP:
{
+ skip "tar command is not available", 3
+ if (!defined $tar || $tar eq '') && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenario)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57a8f0366a5..981cdb69175 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3466,6 +3468,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v8-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (11.5K, 6-v8-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From 1a41393da777838efb3b095b8ab1cafd1fe92623 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v8 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files were
not in the required sequence. With this patch, decoding continues: any
WAL file that is out of order is written to a temporary location, from
which it is read later. Once a temporary file has been read, it is
removed.
---
src/bin/pg_waldump/archive_waldump.c | 220 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 41 ++++-
src/bin/pg_waldump/pg_waldump.h | 4 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
4 files changed, 256 insertions(+), 12 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index f991633e58c..d38855e9a10 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Directory holding temporarily exported WAL files */
+static char *TmpWalSegDir = NULL;
+
/* Structure for storing the WAL segment data from the archive */
typedef struct ArchivedWALEntry
{
@@ -65,6 +69,11 @@ typedef struct astreamer_waldump
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +129,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
}
/*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -199,6 +209,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->endSegNo = UINT64_MAX;
else
XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpseg_dir(waldir);
+ atexit(cleanup_tmpseg_dir_atexit);
}
/*
@@ -374,13 +391,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/*
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL segment from the archive that is not currently needed,
+ * that WAL data is written to a temporary file.
*/
static ArchivedWALEntry *
get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
{
ArchivedWALEntry *entry = NULL;
char fname[MAXFNAMELEN];
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -423,11 +443,32 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
continue;
}
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- segno, entry->segno);
- exit(1);
+ /*
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but it falls within the requested range and will be needed
+ * later. Write it to a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -435,6 +476,165 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
pg_fatal("could not find file \"%s\" in archive", fname);
}
+/*
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold the extracted WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+ ArchivedWAL_iterator it;
+ ArchivedWALEntry *entry;
+
+ /* Remove temporary segments */
+ ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+ while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+ {
+ if (entry->tmpseg_exists)
+ {
+ remove_tmp_walseg(entry->segno, false);
+ entry->tmpseg_exists = false;
+ }
+ }
+
+ /* Remove temporary directory */
+ if (rmdir(TmpWalSegDir) == 0)
+ pg_log_debug("removed directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing the result.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ Assert(TmpWalSegDir);
+
+ snprintf(fpath, MAXPGPATH, "%s/%08X%08X",
+ TmpWalSegDir,
+ (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+ (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+ return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry == NULL)
+ return false;
+
+ return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+ FILE *file;
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(segno);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("temporarily exporting file \"%s\"", fpath);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_walseg_path(segno));
+ }
+}
+
+/*
+ * Remove temporary file
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+ char *fpath = get_tmp_walseg_path(segno);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ pfree(fpath);
+
+ /* Update entry if requested */
+ if (update_entry)
+ {
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+ Assert(entry != NULL);
+ entry->tmpseg_exists = false;
+ }
+}
+
/*
* Create an astreamer that can read WAL from tar file.
*/
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 02ad141e44a..4c5974a6ae1 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -466,11 +466,50 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
{
XLogDumpPrivate *private = state->private_data;
int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
if (private->endptr_reached)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ remove_tmp_walseg(nextSegNo, true);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ if (tmp_walseg_exists(nextSegNo))
+ {
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(nextSegNo);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54758c3548a..5c1fb1e080a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
XLogRecPtr targetPagePtr,
Size count, char *readBuff);
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 443126a9ce6..d5fa1f6d28d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
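The naming scheme that get_tmp_walseg_path() uses in the patch above can be illustrated in isolation. The following is a minimal standalone sketch, not code from the patch: segments_per_xlogid() is a hypothetical stand-in for XLogSegmentsPerXLogId(), and format_tmp_walseg_path() and is_tmp_walseg_name() are illustrative names that do not exist in PostgreSQL. It assumes only the "%08X%08X" format shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for XLogSegmentsPerXLogId(): segments per 4 GB of WAL. */
static uint64_t
segments_per_xlogid(uint64_t wal_seg_size)
{
    assert(wal_seg_size > 0);
    return UINT64_C(0x100000000) / wal_seg_size;
}

/*
 * Build "<tmpdir>/<16 hex digits>" for a segment number, mirroring the
 * "%s/%08X%08X" format used in the patch. Returns snprintf()'s result.
 */
static int
format_tmp_walseg_path(char *dst, size_t dstlen, const char *tmpdir,
                       uint64_t segno, uint64_t wal_seg_size)
{
    uint64_t per_id = segments_per_xlogid(wal_seg_size);

    return snprintf(dst, dstlen, "%s/%08X%08X", tmpdir,
                    (unsigned int) (segno / per_id),
                    (unsigned int) (segno % per_id));
}

/* Illustrative check: a temporary segment name is exactly 16 uppercase hex digits. */
static bool
is_tmp_walseg_name(const char *name)
{
    return strlen(name) == 16 &&
        strspn(name, "0123456789ABCDEF") == 16;
}
```

With 16 MB segments there are 256 segments per xlogid, so segment number 257 maps to the name 0000000100000001. Regular WAL file names carry an additional 8-hex-digit timeline prefix, which the temporary names in the patch omit.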
[application/x-patch] v8-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v8-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
download | inline diff:
From 5cc37e03ae3b6cf3d01b681bafdc0cc3cf136c27 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v8 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8d5befa947f..a502e795b2e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v8-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (15.6K, 8-v8-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
download | inline diff:
From cfb5b951450b65f32b7a2357630c184af36bdba7 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v8 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL. This change
also aligns it with the existing manifest-path switch naming.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/po/de.po | 4 ++--
src/bin/pg_verifybackup/po/el.po | 4 ++--
src/bin/pg_verifybackup/po/es.po | 4 ++--
src/bin/pg_verifybackup/po/fr.po | 4 ++--
src/bin/pg_verifybackup/po/it.po | 4 ++--
src/bin/pg_verifybackup/po/ja.po | 4 ++--
src/bin/pg_verifybackup/po/ka.po | 4 ++--
src/bin/pg_verifybackup/po/ko.po | 4 ++--
src/bin/pg_verifybackup/po/ru.po | 4 ++--
src/bin/pg_verifybackup/po/sv.po | 4 ++--
src/bin/pg_verifybackup/po/uk.po | 4 ++--
src/bin/pg_verifybackup/po/zh_CN.po | 4 ++--
src/bin/pg_verifybackup/po/zh_TW.po | 4 ++--
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
16 files changed, 40 insertions(+), 40 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index a502e795b2e..9fcd6be004e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n"
#: pg_verifybackup.c:911
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n"
#: pg_verifybackup.c:912
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n"
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ
#: pg_verifybackup.c:1379
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
#: pg_verifybackup.c:1380
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
msgstr ""
-" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n"
+" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n"
#: pg_verifybackup.c:919
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n"
#: pg_verifybackup.c:920
#, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n"
#: pg_verifybackup.c:992
#, c-format
-msgid " -w, --wal-directory=PATH use specified path for WAL files\n"
-msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n"
+msgid " -w, --wal-path=PATH use specified path for WAL files\n"
+msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n"
#: pg_verifybackup.c:993
#, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v8-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.9K, 9-v8-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
download | inline diff:
From d0028b96f02c6f15231f4788b88abd9ff46b17f1 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v8 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 9fcd6be004e..6915fc7f28e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("wal archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-25 08:50 Chao Li <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Chao Li @ 2025-11-25 08:50 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: Jakub Wartak <[email protected]>; Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi Amul,
I reviewed the patch and have some comments:
> On Nov 25, 2025, at 14:37, Amul Sul <[email protected]> wrote:
>
>
> Regards,
> Amul
> <v8-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch><v8-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch><v8-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch><v8-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch><v8-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch><v8-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch><v8-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch><v8-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch>
1 - 0001 - pg_waldump.h
```
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
```
This header file is brand new, so copyright year should be only 2025.
2 - 0001 - pg_waldump.c
```
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
```
0001 claims to be a refactoring, but initializing WalSegSz with DEFAULT_XLOG_SEG_SIZE changes behavior, so this change is no longer a pure refactor.
I would suggest leaving WalSegSz uninitialized (as a file-scope variable it is zero-initialized), so there is no behavior change and 0001 stays a self-contained pure refactor.
Another nit: now that “static” is removed, “WalSegSz” sits in the middle of two static variables, which doesn’t look good. If I were making the change, I would move WalSegSz after all the static variables.
3 - 0002
```
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ if (private->endptr_reached)
+ return -1;
```
This change introduces a logic hole. The old code set private->endptr_reached = true and returned -1 in the same branch. In the new code, computing count and setting private->endptr_reached are both wrapped into required_read_len(). However, required_read_len() doesn’t check whether private->endptr_reached is already true. So if private->endptr_reached is already true on entry and required_read_len() returns a positive count, the if (private->endptr_reached) test is still satisfied and the function returns -1 anyway.
So, to be safe, we should check “if (count < 0) return -1” instead.
4 - 0002
```
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
```
The function comment is too simple. It doesn’t cover the case where -1 is returned.
5 - 0003
```
+my @scenario = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
```
“my @scenario” should be “my @scenarios”, so that the for line becomes “for my $scenario (@scenarios)”, which is a little clearer.
6 - 0003
```
+ SKIP:
+ {
```
Why is a SKIP label defined here? A SKIP block usually contains a skip statement, for example in bin/pg_ctl/t/001_start_stop.pl:
```
SKIP:
{
skip "unix-style permissions not supported on Windows", 2
if ($windows_os);
ok(-f $logFileName);
ok(check_mode_recursive("$tempdir/data", 0700, 0600));
}
```
7 - 0004 - Makefile
```
$(WIN32RES) \
compat.o \
pg_waldump.o \
+ archive_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
```
Obviously the list was in alphabetical order, so archive_waldump.o should be placed before compat.o.
8 - 0004
```
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
```
Looking at the page_read spec:
```
/*
* Data input callback
*
* This callback shall read at least reqLen valid bytes of the xlog page
* starting at targetPagePtr, and store them in readBuf. The callback
* shall return the number of bytes read (never more than XLOG_BLCKSZ), or
* -1 on failure. The callback shall sleep, if necessary, to wait for the
* requested bytes to become available. The callback will not be invoked
* again for the same page unless more than the returned number of bytes
* are needed.
*
* targetRecPtr is the position of the WAL record we're reading. Usually
* it is equal to targetPagePtr + reqLen, but sometimes xlogreader needs
* to read and verify the page or segment header, before it reads the
* actual WAL record it's interested in. In that case, targetRecPtr can
* be used to determine which timeline to read the page from.
*
* The callback shall set ->seg.ws_tli to the TLI of the file the page was
* read from.
*/
XLogPageReadCB page_read;
```
It says that page_read must read at least reqLen bytes, and otherwise should wait for more bytes.
However, TarWALDumpReadPage just calculates how many bytes it can read and reads only that many, which seems to break the protocol. Is that a problem?
9 - 0004
```
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
```
This function allocates memory for streamer but only returns &streamer->base, so memory of streamer is leaked.
Also, in the function comment, “from tar file” => “from a tar file”.
10 - 0004
```
+ * End-of-stream processing for a astreamer_waldump stream.
```
Nit typo: a => an
11 - 0004
```
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
```
Why not use pg_fatal()?
12 - 0005
```
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
```
When entry->tmpseg_exists is already true, write_fp is not assigned here, so there should be a check that write_fp is not NULL before calling perform_tmp_write().
Also, if write_fp != NULL, should we close the file regardless of whether entry != privateInfo->cur_wal? Otherwise write_fp may be left open.
13 - 0005
```
+ * Use the directory specified by the TEMDIR environment variable. If it’s
```
Typo: TEMDIR => TMPDIR
14 - 0005
```
+ * Set up a temporary directory to temporarily store WAL segments.
```
temporary and temporarily are redundant.
No comment for 0007.
15 - 0007
I wonder why we need to manually update po files? This is the first time I have seen a patch that includes po file changes.
16 - 0008
```
+ {
+ pg_log_error("wal archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path."
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
```
“wal” should be “WAL”.
In the hint message, there should be a white space between the two sentences.
Again, why not pg_fatal().
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-26 06:02 Amul Sul <[email protected]>
parent: Chao Li <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Amul Sul @ 2025-11-26 06:02 UTC (permalink / raw)
To: Chao Li <[email protected]>; +Cc: Jakub Wartak <[email protected]>; Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, Nov 25, 2025 at 2:21 PM Chao Li <[email protected]> wrote:
>
> Hi Amul,
>
> I reviewed the patch and have some comments:
>
Thanks for the review. Replying inline below.
> 1 - 0001 - pg_waldump.h
> ```
> + * pg_waldump.h - decode and display WAL
> + *
> + * Copyright (c) 2013-2025, PostgreSQL Global Development Group
> ```
>
> This header file is brand new, so copyright year should be only 2025.
>
Fixed in the attached version.
> 2 - 0001 - pg_waldump.c
> ```
> -static int WalSegSz;
> +int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
> ```
>
> 0001 claims to be a refactoring, but initializing WalSegSz with DEFAULT_XLOG_SEG_SIZE changes behavior, so this change is no longer a pure refactor.
>
> I would suggest leaving WalSegSz uninitialized (as a file-scope variable it is zero-initialized), so there is no behavior change and 0001 stays a self-contained pure refactor.
>
Agreed.
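To illustrate the zero-initialization point, here is a minimal standalone sketch (not code from the patch; demo_seg_size and both helpers are hypothetical stand-ins). A file-scope int with no initializer has static storage duration and starts out as 0, so dropping the initializer keeps the pre-patch starting value:

```c
#include <assert.h>

/* Hypothetical stand-in for WalSegSz: file scope, no initializer. */
int			demo_seg_size;

/* Returns 1 while the size has not been determined yet (still zero). */
int
demo_seg_size_is_unset(void)
{
	return demo_seg_size == 0;
}

/* Records a size once it is known; the value must be positive. */
void
demo_set_seg_size(int size)
{
	assert(size > 0);
	demo_seg_size = size;
}
```

A caller can therefore treat 0 as "not yet read from a WAL file" without any explicit initializer.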
> Another nit: now that “static” is removed, “WalSegSz” sits in the middle of two static variables, which doesn’t look good. If I were making the change, I would move WalSegSz after all the static variables.
>
I placed it before the static declaration.
> 3 - 0002
> ```
> @@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
> XLogRecPtr targetPtr, char *readBuff)
> {
> XLogDumpPrivate *private = state->private_data;
> - int count = XLOG_BLCKSZ;
> + int count = required_read_len(private, targetPagePtr, reqLen);
> WALReadError errinfo;
>
> - if (XLogRecPtrIsValid(private->endptr))
> - {
> - if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
> - count = XLOG_BLCKSZ;
> - else if (targetPagePtr + reqLen <= private->endptr)
> - count = private->endptr - targetPagePtr;
> - else
> - {
> - private->endptr_reached = true;
> - return -1;
> - }
> - }
> + if (private->endptr_reached)
> + return -1;
> ```
>
> This change introduces a logic hole. The old code set private->endptr_reached = true and returned -1 in the same branch. In the new code, computing count and setting private->endptr_reached are both wrapped into required_read_len(). However, required_read_len() doesn’t check whether private->endptr_reached is already true. So if private->endptr_reached is already true on entry and required_read_len() returns a positive count, the if (private->endptr_reached) test is still satisfied and the function returns -1 anyway.
>
> So, to be safe, we should check “if (count < 0) return -1” instead.
>
I don't really see the logic hole, since the behaviour is the same as
before, but I like the idea of checking endptr_reached. The flag is
quite unlikely to be set already at that point, but it seems like good
practice to check it before setting it. I did it that way in the
attached version.
> 4 - 0002
> ```
> +/* Returns the size in bytes of the data to be read. */
> +static inline int
> +required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
> + int reqLen)
> +{
> ```
>
> The function comment is too simple. It doesn’t cover the case where -1 is returned.
>
Okay.
> 5 - 0003
> ```
> +my @scenario = (
> + {
> + 'path' => $node->data_dir
> + });
>
> -@lines = test_pg_waldump('--limit' => 6);
> -is(@lines, 6, 'limit option observed');
> +for my $scenario (@scenario)
> +{
> ```
>
> “my @scenario” should be “my @scenarios”, so that the for line becomes “for my $scenario (@scenarios)”, which is a little clearer.
>
Done.
> 6 - 0003
> ```
> + SKIP:
> + {
> ```
>
> Why is a SKIP label defined here? A SKIP block usually contains a skip statement, for example in bin/pg_ctl/t/001_start_stop.pl:
> ```
> SKIP:
> {
> skip "unix-style permissions not supported on Windows", 2
> if ($windows_os);
>
> ok(-f $logFileName);
> ok(check_mode_recursive("$tempdir/data", 0700, 0600));
> }
> ```
>
Yes, I know, but the SKIP block is needed in the next patch; adding
the label here avoids a large diff there from introducing SKIP and the
associated indentation. This patch is not expected to be committed
independently, and I have added a note to that effect in the commit
message.
> 7 - 0004 - Makefile
> ```
> $(WIN32RES) \
> compat.o \
> pg_waldump.o \
> + archive_waldump.o \
> rmgrdesc.o \
> xlogreader.o \
> xlogstats.o
> ```
>
> Obviously the list was in alphabetical order, so archive_waldump.o should be placed before compat.o.
>
Done.
> 8 - 0004
> ```
> +/*
> + * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
> + * files from tar archives.
> + */
> +static int
> +TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
> + XLogRecPtr targetPtr, char *readBuff)
> +{
> + XLogDumpPrivate *private = state->private_data;
> + int count = required_read_len(private, targetPagePtr, reqLen);
> ```
>
> Looking at the page_read spec:
> ```
> /*
> * Data input callback
> *
> * This callback shall read at least reqLen valid bytes of the xlog page
> * starting at targetPagePtr, and store them in readBuf. The callback
> * shall return the number of bytes read (never more than XLOG_BLCKSZ), or
> * -1 on failure. The callback shall sleep, if necessary, to wait for the
> * requested bytes to become available. The callback will not be invoked
> * again for the same page unless more than the returned number of bytes
> * are needed.
> *
> * targetRecPtr is the position of the WAL record we're reading. Usually
> * it is equal to targetPagePtr + reqLen, but sometimes xlogreader needs
> * to read and verify the page or segment header, before it reads the
> * actual WAL record it's interested in. In that case, targetRecPtr can
> * be used to determine which timeline to read the page from.
> *
> * The callback shall set ->seg.ws_tli to the TLI of the file the page was
> * read from.
> */
> XLogPageReadCB page_read;
> ```
>
> It says that page_read must read at least reqLen bytes, and otherwise should wait for more bytes.
>
> However,
> TarWALDumpReadPage just calculates how many bytes it can read and reads only that many, which seems to break the protocol. Is that a problem?
>
The behaviour is the same as the routine used to read the bare WAL
file. I don't think there will be any problem for pg_waldump.
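As a rough standalone model of why the computed length still satisfies the "at least reqLen" part of the contract (a sketch only; DEMO_BLCKSZ, demo_lsn and demo_read_len are hypothetical stand-ins for XLOG_BLCKSZ, XLogRecPtr and required_read_len(), and an end pointer of 0 is assumed to mean "no end point given"): whenever the result is not -1, it is either a full block or end_ptr - page_ptr, and the latter is only chosen when page_ptr + req_len <= end_ptr.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BLCKSZ 8192		/* stand-in for XLOG_BLCKSZ */

typedef uint64_t demo_lsn;		/* stand-in for XLogRecPtr */

/*
 * Model of the read-length calculation.  Returns the number of bytes to
 * read from the page starting at page_ptr, or -1 if req_len bytes are not
 * available before end_ptr.  end_ptr == 0 means no end point was given.
 */
int
demo_read_len(demo_lsn page_ptr, int req_len, demo_lsn end_ptr)
{
	assert(req_len >= 0 && req_len <= DEMO_BLCKSZ);

	if (end_ptr == 0)
		return DEMO_BLCKSZ;		/* no limit: read a whole page */
	if (page_ptr + DEMO_BLCKSZ <= end_ptr)
		return DEMO_BLCKSZ;		/* the whole page lies before the end */
	if (page_ptr + req_len <= end_ptr)
		return (int) (end_ptr - page_ptr);	/* partial page, >= req_len */
	return -1;					/* requested bytes cross the end point */
}
```

In this model a short read only ever happens on the last page before the end point, and it never returns fewer bytes than the caller asked for.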
> 9 - 0004
> ```
> +/*
> + * Create an astreamer that can read WAL from tar file.
> + */
> +static astreamer *
> +astreamer_waldump_new(XLogDumpPrivate *privateInfo)
> +{
> + astreamer_waldump *streamer;
> +
> + streamer = palloc0(sizeof(astreamer_waldump));
> + *((const astreamer_ops **) &streamer->base.bbs_ops) =
> + &astreamer_waldump_ops;
> +
> + streamer->privateInfo = privateInfo;
> +
> + return &streamer->base;
> +}
> ```
>
> This function allocates memory for streamer but only returns &streamer->base, so memory of streamer is leaked.
>
May I know why you think there would be a memory leak? I believe the
address of the structure is the same as the address of its first
member, base. I am returning base because the goal is to return a
generic astreamer type, which is the standard approach used in other
archive streamer code.
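For readers unfamiliar with the pattern, a minimal standalone sketch follows (demo_base, demo_derived and the helper functions are hypothetical, and plain calloc/free stand in for palloc0/pfree). Because base is the first member, a pointer to it converts back to a pointer to the enclosing structure, so the whole allocation can still be reached and released through the generic pointer:

```c
#include <assert.h>
#include <stdlib.h>

/* Generic "base" type that callers see. */
typedef struct demo_base
{
	int			kind;
} demo_base;

/* Specialized type; the base must be the first member. */
typedef struct demo_derived
{
	demo_base	base;
	int			payload;
} demo_derived;

/* Allocates a demo_derived but hands back only the generic base pointer. */
demo_base *
demo_derived_new(int payload)
{
	demo_derived *d = calloc(1, sizeof(demo_derived));

	if (d == NULL)
		return NULL;
	d->base.kind = 1;
	d->payload = payload;
	return &d->base;
}

/* Recovers the full object from the base pointer. */
int
demo_derived_payload(demo_base *b)
{
	demo_derived *d = (demo_derived *) b;

	assert(b->kind == 1);
	return d->payload;
}

/* Frees the whole allocation through the base pointer. */
void
demo_free(demo_base *b)
{
	free(b);
}
```

Whether the object is actually freed still depends on the caller invoking the matching cleanup routine, which this sketch does not model.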
> Also, in the function comment, “from tar file” => “from a tar file”.
>
> 10 - 0004
> ```
> + * End-of-stream processing for a astreamer_waldump stream.
> ```
>
> Nit typo: a => an
>
Done.
> 11 - 0004
> ```
> + if (!IsValidWalSegSize(WalSegSz))
> + {
> + pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
> + "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
> + WalSegSz),
> + privateInfo->archive_name, WalSegSz);
> + pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
> + exit(1);
> + }
> ```
>
> Why not use pg_fatal()?
>
This is how we do it when we also need to emit error details.
> 12 - 0005
> ```
> + /* Create a temporary file if one does not already exist */
> + if (!entry->tmpseg_exists)
> + {
> + write_fp = prepare_tmp_write(entry->segno);
> + entry->tmpseg_exists = true;
> + }
> +
> + /* Flush data from the buffer to the file */
> + perform_tmp_write(entry->segno, &entry->buf, write_fp);
> + resetStringInfo(&entry->buf);
> +
> + /*
> + * The change in the current segment entry indicates that the reading
> + * of this file has ended.
> + */
> + if (entry != privateInfo->cur_wal && write_fp != NULL)
> + {
> + fclose(write_fp);
> + write_fp = NULL;
> + }
> ```
>
> When entry->tmpseg_exists is already true, write_fp is not assigned here, so there should be a check that write_fp is not NULL before calling perform_tmp_write().
>
perform_tmp_write() has an assertion for that.
> Also, if write_fp != NULL, should we close the file regardless of whether entry != privateInfo->cur_wal? Otherwise write_fp may be left open.
>
We read the WAL from the tar file in chunks, and those same chunks are
written to the temporary file within the loop. If we close the
temporary file now, we will have to open it again later for the next
chunk write. Could you elaborate on a scenario where you believe this
file might be left open unintentionally?
> 13 - 0005
> ```
> + * Use the directory specified by the TEMDIR environment variable. If it’s
> ```
>
> Typo: TEMDIR => TMPDIR
>
Done.
> 14 - 0005
> ```
> + * Set up a temporary directory to temporarily store WAL segments.
> ```
>
> temporary and temporarily are redundant.
>
I believe that is grammatically correct and clear.
> No comment for 0007.
>
> 15 - 0007
>
> I wonder why we need to manually update po files? This is the first time I have seen a patch that includes po file changes.
>
Okay, I included that initially to ensure the PO file update wasn't
overlooked during commit. I have removed it to minimize the diff and
added the note in the patch commit message.
> 16 - 0008
> ```
> + {
> + pg_log_error("wal archive not found");
> + pg_log_error_hint("Specify the correct path using the option -w/--wal-path."
> + "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
> + exit(1);
> + }
> ```
>
> “wal” should be “WAL”.
>
> In the hint message, there should be a white space between the two sentences.
>
Done.
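As a small illustration of the missing-space issue (a standalone sketch; the function names are hypothetical and the strings are copied from the quoted hunk): adjacent string literals in C are concatenated with nothing in between, so the first sentence runs straight into the second unless one of the literals carries the space.

```c
#include <assert.h>
#include <string.h>

/* Hint as written in the reviewed hunk: the two literals are joined as-is. */
const char *
demo_hint_without_space(void)
{
	return "Specify the correct path using the option -w/--wal-path."
		"Or you must use -n/--no-parse-wal when verifying a tar-format backup.";
}

/* Same hint with a trailing space added to the first literal. */
const char *
demo_hint_with_space(void)
{
	return "Specify the correct path using the option -w/--wal-path. "
		"Or you must use -n/--no-parse-wal when verifying a tar-format backup.";
}

/* Returns 1 if one sentence runs straight into the next one (".Or"). */
int
demo_has_run_on(const char *msg)
{
	assert(msg != NULL);
	return strstr(msg, ".Or") != NULL;
}
```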
Thanks again for your review comments; they are quite helpful. Kindly
take a look at the attached version.
Regards,
Amul
Attachments:
[application/x-patch] v9-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch (2.4K, 2-v9-0001-Refactor-pg_waldump-Move-some-declarations-to-new.patch)
From 38322b7e5062f7c45f47e09655a1c964e9eeb68b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v9 1/8] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 12 +++---------
src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
2 files changed, 30 insertions(+), 9 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index c6d6ba79e44..6680280dbbc 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -37,21 +38,14 @@
* give a thought about doing the same in pg_walinspect contrib module as well.
*/
+int WalSegSz;
+
static const char *progname;
-static int WalSegSz;
static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..926d529f9d6
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v9-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch (2.5K, 3-v9-0002-Refactor-pg_waldump-Separate-logic-used-to-calcul.patch)
From 01d144fe8af4d4232b4b043c2ef11b07d785b4bd Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v9 2/8] Refactor: pg_waldump: Separate logic used to calculate
the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 46 +++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6680280dbbc..2c11e0e5ca1 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -327,6 +327,35 @@ identify_target_directory(char *directory, char *fname)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (unlikely(private->endptr_reached))
+ return -1;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -384,21 +413,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v9-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.6K, 4-v9-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 60b4c33b9b95c3d24a8200dae5cbc75bb09daef8 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 16:12:11 +0530
Subject: [PATCH v9 3/8] Refactor: pg_waldump: Restructure TAP tests.
Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..c8fdc7cb4f3 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v9-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch (37.4K, 5-v9-0004-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From bd2de61f136dd9111f69874af2782c4b2f6ab314 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 5 Nov 2025 15:40:36 +0530
Subject: [PATCH v9 4/8] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 596 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 218 +++++++---
src/bin/pg_waldump/pg_waldump.h | 34 ++
src/bin/pg_waldump/t/001_basic.pl | 84 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 878 insertions(+), 76 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..63141dc2ee2
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,596 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Structure for storing the WAL segment data from the archive */
+typedef struct ArchivedWALEntry
+{
+ uint32 status; /* hash status */
+ XLogSegNo segno; /* hash key: WAL segment number */
+ TimeLineID timeline; /* timeline of this WAL file */
+
+ StringInfoData buf;
+ bool tmpseg_exists; /* spill file exists? */
+
+ int total_read; /* total bytes read from this WAL segment,
+ * including buffered and temporarily written data */
+} ArchivedWALEntry;
+
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALEntry
+#define SH_KEY_TYPE XLogSegNo
+#define SH_KEY segno
+#define SH_HASH_KEY(tb, key) murmurhash64((uint64) key)
+#define SH_EQUAL(tb, a, b) (a == b)
+#define SH_GET_HASH(tb, a) a->hash
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static ArchivedWAL_hash *ArchivedWAL_HTAB = NULL;
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
+ XLogDumpPrivate *privateInfo);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ XLogSegNo *curSegNo,
+ TimeLineID *curTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+ pg_compress_algorithm compress_algo;
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ compress_algo = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ compress_algo = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ compress_algo = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ compress_algo = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ *compression = compress_algo;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader to read WAL files from the archive,
+ * creates a hash table to store them, performs quick existence checks for WAL
+ * entries in the archive and retrieves the WAL segment size, and sets up
+ * filtering criteria for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALEntry *entry = NULL;
+ XLogLongPageHeader longhdr;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /* Hash table storing WAL entries read from the archive */
+ ArchivedWAL_HTAB = ArchivedWAL_create(16, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf.len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in \"%s\" archive",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_wal;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf.data;
+
+ WalSegSz = longhdr->xlp_seg_size;
+
+ if (!IsValidWalSegSize(WalSegSz))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ WalSegSz),
+ privateInfo->archive_name, WalSegSz);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->startSegNo, WalSegSz);
+
+ if (XLogRecPtrIsInvalid(privateInfo->endptr))
+ privateInfo->endSegNo = UINT64_MAX;
+ else
+ XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ ArchivedWALEntry *entry;
+
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ entry = get_archive_wal_entry(segno, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf.data;
+ int len = entry->buf.len;
+
+ /* WAL record range that the buffer contains */
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ XLogSegNoOffsetToRecPtr(entry->segno, entry->total_read,
+ WalSegSz, endPtr);
+ startPtr = endPtr - len;
+
+ /*
+ * pg_waldump may request to re-read the currently active page, but
+ * never a page older than the current one. Therefore, any fully
+ * consumed WAL data preceding the current page can be safely
+ * discarded.
+ */
+ if (recptr >= endPtr)
+ {
+ /* Discard the buffered data */
+ resetStringInfo(&entry->buf);
+ len = 0;
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring it remains available for re-reading if
+ * requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(&entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ if (len > 0 && recptr > startPtr)
+ {
+ int skipBytes = 0;
+
+ /*
+ * The required offset is not at the start of the buffer, so skip
+ * bytes until reaching the desired offset of the target page.
+ */
+ skipBytes = recptr - startPtr;
+
+ buf += skipBytes;
+ len -= skipBytes;
+ }
+
+ if (len > 0)
+ {
+ int readBytes = len >= nbytes ? nbytes : len;
+
+ /* Ensure the reading page is in the buffer */
+ Assert(recptr >= startPtr && recptr < endPtr);
+
+ memcpy(p, buf, readBytes);
+
+ /* Update state for read */
+ nbytes -= readBytes;
+ p += readBytes;
+ recptr += readBytes;
+ }
+ else
+ {
+ /*
+ * Fetch more data; raise an error if it's not the current segment
+ * being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_wal != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, privateInfo->timeline, entry->segno,
+ WalSegSz);
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. Since we either
+ * successfully read the WAL page or raise an error, a boolean would
+ * suffice, but the caller expects the number of bytes read. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Returns the archived WAL entry for the given segment from the hash table if
+ * it exists. Otherwise, keeps reading the archive until the entry for that
+ * segment appears, and reports an error if it cannot be found.
+ */
+static ArchivedWALEntry *
+get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALEntry *entry = NULL;
+ char fname[MAXFNAMELEN];
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry != NULL)
+ return entry;
+
+ /* The needed WAL segment has not been read from the archive yet; do so now */
+ while (1)
+ {
+ entry = privateInfo->cur_wal;
+
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * Either this is the first pass, or the archive streamer is
+ * currently reading a non-WAL file or an irrelevant WAL file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (entry->segno == segno)
+ return entry;
+
+ /*
+ * Skip this segment if it is on a different timeline or lies outside
+ * the requested segment range.
+ */
+ if (privateInfo->timeline != entry->timeline ||
+ privateInfo->startSegNo > entry->segno ||
+ privateInfo->endSegNo < entry->segno)
+ {
+ privateInfo->cur_wal = NULL;
+ continue;
+ }
+
+ /*
+ * XXX: If the segment being read is not the requested one, its data
+ * must be buffered, as we currently lack a mechanism to write it to a
+ * temporary file. This is a known limitation that will be fixed in the
+ * next patch, as the buffer could grow up to the full WAL segment
+ * size.
+ */
+ if (segno > entry->segno)
+ continue;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+ segno, entry->segno);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ pg_fatal("could not find file \"%s\" in archive", fname);
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+ ArchivedWALEntry *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member,
+ &segno, &timeline))
+ break;
+
+ entry = ArchivedWAL_insert(ArchivedWAL_HTAB, segno, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL file found in archive: \"%s\"",
+ member->pathname);
+ break;
+ }
+
+ initStringInfo(&entry->buf);
+ entry->timeline = timeline;
+ entry->total_read = 0;
+
+ privateInfo->cur_wal = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_wal)
+ {
+ appendBinaryStringInfo(&privateInfo->cur_wal->buf, data, len);
+ privateInfo->cur_wal->total_read += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_wal = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment number and timeline.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ XLogSegNo *curSegNo, TimeLineID *curTimeline)
+{
+ int pathlen;
+ XLogSegNo segNo;
+ TimeLineID timeline;
+ char *fname;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ pathlen = strlen(member->pathname);
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(fname))
+ return false;
+
+ /*
+ * XXX: On some systems (e.g., OpenBSD), the tar utility includes
+ * PaxHeaders when creating an archive. These are special entries that
+ * store extended metadata for the file entry immediately following them,
+ * and they share the exact same name as that file.
+ */
+ if (strstr(member->pathname, "PaxHeaders."))
+ return false;
+
+ /* Parse position from file */
+ XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+ *curSegNo = segNo;
+ *curTimeline = timeline;
+
+ return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..f31e0d1cd86 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2025, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, lz4, zstd, libpq],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 2c11e0e5ca1..1eedf8e01b4 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -38,7 +38,7 @@
* give a thought about doing the same in pg_walinspect contrib module as well.
*/
-int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
static const char *progname;
@@ -178,7 +178,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -444,6 +444,45 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -781,8 +820,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -814,7 +853,10 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ bool is_archive = false;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -946,7 +988,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1110,10 +1152,20 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ is_archive = true;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1131,6 +1183,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (is_archive)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1141,69 +1204,77 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
+ is_archive = true;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+ private.endptr != (segno + 1) * WalSegSz)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL);
+ else if (!is_archive)
+ waldir = identify_target_directory(walpath, NULL);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1215,12 +1286,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (is_archive)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * pg_waldump's current working directory.
+ */
+ waldir = waldir ? pg_strdup(waldir) : pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ /* Routine to decode WAL files */
+ xlogreader_state =
+ XLogReaderAllocate(WalSegSz, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1329,6 +1424,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (is_archive)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 926d529f9d6..ec7a33d40e0 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,9 +12,13 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
extern int WalSegSz;
+/* Forward declaration */
+struct ArchivedWALEntry;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
@@ -22,6 +26,36 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALEntry *cur_wal;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo startSegNo;
+ XLogSegNo endSegNo;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index c8fdc7cb4f3..b12bbc6f95b 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive whose members are added in sorted file-name order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -267,6 +310,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,42 @@ for my $scenario (@scenarios)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines;
+ @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57a8f0366a5..981cdb69175 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -139,6 +139,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALEntry
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3466,6 +3468,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v9-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch (13.4K, 6-v9-0005-pg_waldump-Remove-the-restriction-on-the-order-of.patch)
From 4815eb6ef2182b8ef5512bed842aea9821502853 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 6 Nov 2025 13:48:33 +0530
Subject: [PATCH v9 5/8] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL files
in the archive were not in sequential order. With this patch, decoding
continues: any WAL file that is out of order is written to a temporary
location, from which it is read later. Once a temporary file has been
read, it is removed.
---
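Reviewer note (not part of the patch): the spill-and-read-back idea described
above can be sketched in isolation as below. This is only an illustration
under stated assumptions, not the code in archive_waldump.c. The helper names
make_spill_dir, spill_segment and consume_segment are hypothetical, plain
malloc/stdio stand in for the frontend palloc and logging routines, and only
the TMPDIR-then-fallback choice and the waldump_tmp-XXXXXX template are taken
from the patch.

```c
/* Request POSIX.1-2008 declarations (mkdtemp) on strict compilers. */
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a private spill directory under $TMPDIR,
 * falling back to fallback_dir.  Returns a malloc'd path, or NULL on error.
 */
static char *
make_spill_dir(const char *fallback_dir)
{
	const char *base = getenv("TMPDIR");
	size_t		len;
	char	   *tmpl;

	assert(fallback_dir != NULL);
	if (base == NULL || base[0] == '\0')
		base = fallback_dir;

	len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
	tmpl = malloc(len);
	if (tmpl == NULL)
		return NULL;
	snprintf(tmpl, len, "%s/waldump_tmp-XXXXXX", base);

	/* mkdtemp() replaces the XXXXXX suffix and creates the directory */
	if (mkdtemp(tmpl) == NULL)
	{
		free(tmpl);
		return NULL;
	}
	return tmpl;
}

/* Hypothetical helper: write one out-of-order segment; 0 on success. */
static int
spill_segment(const char *dir, const char *name, const void *data, size_t len)
{
	char		path[4096];
	FILE	   *fp;
	int			ok;

	snprintf(path, sizeof(path), "%s/%s", dir, name);
	fp = fopen(path, "wb");
	if (fp == NULL)
		return -1;
	ok = (len == 0 || fwrite(data, len, 1, fp) == 1);
	if (fclose(fp) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read a spilled segment back, then delete the file.
 * Returns the number of bytes read, or -1 if the file does not exist.
 */
static long
consume_segment(const char *dir, const char *name, void *buf, size_t bufsz)
{
	char		path[4096];
	FILE	   *fp;
	size_t		n;

	snprintf(path, sizeof(path), "%s/%s", dir, name);
	fp = fopen(path, "rb");
	if (fp == NULL)
		return -1;
	n = fread(buf, 1, bufsz, fp);
	fclose(fp);
	unlink(path);				/* each spilled file is read only once */
	return (long) n;
}
```

A caller would create the directory once, spill every segment that arrives
before it is needed, consume each one when the reader reaches it, and rmdir()
the directory at exit. The patch itself hooks that last step up via atexit().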
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/archive_waldump.c | 233 ++++++++++++++++++++++++---
src/bin/pg_waldump/pg_waldump.c | 41 ++++-
src/bin/pg_waldump/pg_waldump.h | 4 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 265 insertions(+), 24 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..25f96f86168 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, the out-of-order files will be written to a temporary
+ directory whose name starts with <filename>waldump_tmp-</filename>. This
+ directory will be created inside the directory specified by the
+ <envar>TMPDIR</envar> environment variable if it is set; otherwise, it
+ will be created within the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 63141dc2ee2..2038876a516 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Temporary exported WAL file directory */
+static char *TmpWalSegDir = NULL;
+
/* Structure for storing the WAL segment data from the archive */
typedef struct ArchivedWALEntry
{
@@ -65,6 +69,11 @@ typedef struct astreamer_waldump
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static ArchivedWALEntry *get_archive_wal_entry(XLogSegNo segno,
XLogDumpPrivate *privateInfo);
+static void setup_tmpseg_dir(const char *waldir);
+static void cleanup_tmpseg_dir_atexit(void);
+
+static FILE *prepare_tmp_write(XLogSegNo segno);
+static void perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -120,10 +129,11 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
}
/*
- * Initializes the tar archive reader to read WAL files from the archive,
- * creates a hash table to store them, performs quick existence checks for WAL
- * entries in the archive and retrieves the WAL segment size, and sets up
- * filtering criteria for relevant entries.
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -199,6 +209,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->endSegNo = UINT64_MAX;
else
XLByteToSeg(privateInfo->endptr, privateInfo->endSegNo, WalSegSz);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an
+ * exit callback to remove it upon completion.
+ */
+ setup_tmpseg_dir(waldir);
+ atexit(cleanup_tmpseg_dir_atexit);
}
/*
@@ -374,13 +391,16 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/*
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file and retrieve the entry if
- * it is not already in hash table.
+ * it is not already present in the hash table. If the archive streamer happens
+ * to be reading a WAL file from the archive that is not currently needed, that
+ * WAL data is written to a temporary file.
*/
static ArchivedWALEntry *
get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
{
ArchivedWALEntry *entry = NULL;
char fname[MAXFNAMELEN];
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
@@ -394,8 +414,11 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
entry = privateInfo->cur_wal;
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf.len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Either, here for the first time, or the archived streamer is
@@ -421,20 +444,31 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
}
/*
- * XXX: If the segment being read not the requested one, the data must
- * be buffered, as we currently lack the mechanism to write it to a
- * temporary file. This is a known limitation that will be fixed in the
- * next patch, as the buffer could grow up to the full WAL segment
- * size.
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but it may be needed later. Write it to a temporary
+ * location so that it can be retrieved when needed.
*/
- if (segno > entry->segno)
- continue;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
- segno, entry->segno);
- exit(1);
+ /* Create a temporary file if one does not already exist */
+ if (!entry->tmpseg_exists)
+ {
+ write_fp = prepare_tmp_write(entry->segno);
+ entry->tmpseg_exists = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->segno, &entry->buf, write_fp);
+ resetStringInfo(&entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_wal && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -443,7 +477,166 @@ get_archive_wal_entry(XLogSegNo segno, XLogDumpPrivate *privateInfo)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmpseg_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory as the place to extract WAL
+ * files temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Removes the temporarily stored WAL segments, if any, at exit.
+ */
+static void
+cleanup_tmpseg_dir_atexit(void)
+{
+ ArchivedWAL_iterator it;
+ ArchivedWALEntry *entry;
+
+ /* Remove temporary segments */
+ ArchivedWAL_start_iterate(ArchivedWAL_HTAB, &it);
+ while ((entry = ArchivedWAL_iterate(ArchivedWAL_HTAB, &it)) != NULL)
+ {
+ if (entry->tmpseg_exists)
+ {
+ remove_tmp_walseg(entry->segno, false);
+ entry->tmpseg_exists = false;
+ }
+ }
+
+ /* Remove temporary directory */
+ if (rmdir(TmpWalSegDir) == 0)
+ pg_log_debug("removed directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Generate the temporary WAL file path.
+ *
+ * Note that the caller is responsible for pfree'ing it.
+ */
+char *
+get_tmp_walseg_path(XLogSegNo segno)
+{
+ char *fpath = (char *) palloc(MAXPGPATH);
+
+ Assert(TmpWalSegDir);
+
+ snprintf(fpath, MAXPGPATH, "%s/%08X%08X",
+ TmpWalSegDir,
+ (uint32) (segno / XLogSegmentsPerXLogId(WalSegSz)),
+ (uint32) (segno % XLogSegmentsPerXLogId(WalSegSz)));
+
+ return fpath;
+}
+
+/*
+ * Routine to check whether a temporary file exists for the corresponding WAL
+ * segment number.
+ */
+bool
+tmp_walseg_exists(XLogSegNo segno)
+{
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+
+ if (entry == NULL)
+ return false;
+
+ return entry->tmpseg_exists;
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(XLogSegNo segno)
+{
+ FILE *file;
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(segno);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("temporarily exporting file \"%s\"", fpath);
+ pfree(fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(XLogSegNo segno, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m",
+ get_tmp_walseg_path(segno));
+ }
+}
+
+/*
+ * Remove temporary file
+ */
+void
+remove_tmp_walseg(XLogSegNo segno, bool update_entry)
+{
+ char *fpath = get_tmp_walseg_path(segno);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ pfree(fpath);
+
+ /* Update entry if requested */
+ if (update_entry)
+ {
+ ArchivedWALEntry *entry;
+
+ entry = ArchivedWAL_lookup(ArchivedWAL_HTAB, segno);
+ Assert(entry != NULL);
+ entry->tmpseg_exists = false;
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 1eedf8e01b4..9179f3ea4c4 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -474,12 +474,51 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
{
XLogDumpPrivate *private = state->private_data;
int count = required_read_len(private, targetPagePtr, reqLen);
+ XLogSegNo nextSegNo;
/* Bail out if the count to be read is not valid */
if (count < 0)
return -1;
- /* Read the WAL page from the archive streamer */
+ /*
+ * If the target page is in a different segment, first check for the WAL
+ * segment's physical existence in the temporary directory.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+
+ /* Remove this file, as it is no longer needed. */
+ remove_tmp_walseg(nextSegNo, true);
+ }
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ if (tmp_walseg_exists(nextSegNo))
+ {
+ char *fpath;
+
+ fpath = get_tmp_walseg_path(nextSegNo);
+ state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0);
+ pfree(fpath);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index ec7a33d40e0..03e02625ba1 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -58,4 +58,8 @@ extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
XLogRecPtr targetPagePtr,
Size count, char *readBuff);
+extern char *get_tmp_walseg_path(XLogSegNo segno);
+extern bool tmp_walseg_exists(XLogSegNo segno);
+extern void remove_tmp_walseg(XLogSegNo segno, bool update_entry);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index b12bbc6f95b..d752b5dd656 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
[application/x-patch] v9-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch (1.7K, 7-v9-0006-pg_verifybackup-Delay-default-WAL-directory-prepa.patch)
From d4199456b4941dab2dd6dc6326b18045f5297e41 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v9 6/8] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8d5befa947f..a502e795b2e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v9-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch (5.9K, 8-v9-0007-pg_verifybackup-Rename-the-wal-directory-switch-t.patch)
From cf5d6388a7ddbb2abf26d7749ff947e3338dbe99 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v9 7/8] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL files. This
change also aligns it with the naming of the existing manifest-path switch.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index a502e795b2e..9fcd6be004e 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v9-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch (9.9K, 9-v9-0008-pg_verifybackup-enabled-WAL-parsing-for-tar-forma.patch)
From b56d51b9094b3e9adeab3480a1c81d47c5a07bbd Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v9 8/8] pg_verifybackup: enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding WAL from tar archives, we can
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
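Reviewer note (not part of the patch): the default WAL path selection that
this patch adds to main() can be read as the small decision function below.
This is a sketch under stated assumptions rather than the patch's code. The
function name choose_wal_path is hypothetical, malloc stands in for
psprintf/pstrdup, and a NULL return stands in for the "WAL archive not found"
error exit. The order of preference (plain-format pg_wal directory, then the
pg_wal archive, then the base archive) follows the hunk further down.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: pick the default WAL location when -w/--wal-path was
 * not given.  format is 'p' (plain) or 't' (tar).  Returns a malloc'd path,
 * or NULL when no candidate exists (or on out-of-memory).
 */
static char *
choose_wal_path(char format, const char *backup_dir,
				const char *wal_archive, const char *base_archive)
{
	const char *src;
	char	   *result;
	size_t		len;

	assert(backup_dir != NULL);

	/* Plain format: WAL lives in <backup_dir>/pg_wal */
	if (format == 'p')
	{
		len = strlen(backup_dir) + sizeof("/pg_wal");
		result = malloc(len);
		if (result != NULL)
			snprintf(result, len, "%s/pg_wal", backup_dir);
		return result;
	}

	/* Tar format: prefer a separate WAL archive, else the base archive */
	src = (wal_archive != NULL) ? wal_archive : base_archive;
	if (src == NULL)
		return NULL;			/* caller reports "WAL archive not found" */

	len = strlen(src) + 1;
	result = malloc(len);
	if (result != NULL)
		memcpy(result, src, len);
	return result;
}
```

Falling back to the base archive covers tar-format backups whose WAL files
were stored inside the main data directory archive instead of a separate
pg_wal archive.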
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 9fcd6be004e..40ec24c5984 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal to skip WAL parsing.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..09079a94fee 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..5b0e76ee69d 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2025-11-26 07:23 Chao Li <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Chao Li @ 2025-11-26 07:23 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: Jakub Wartak <[email protected]>; Robert Haas <[email protected]>; PostgreSQL Hackers <[email protected]>
> On Nov 26, 2025, at 14:02, Amul Sul <[email protected]> wrote:
>
>> 9 - 0004
>> ```
>> +/*
>> + * Create an astreamer that can read WAL from tar file.
>> + */
>> +static astreamer *
>> +astreamer_waldump_new(XLogDumpPrivate *privateInfo)
>> +{
>> + astreamer_waldump *streamer;
>> +
>> + streamer = palloc0(sizeof(astreamer_waldump));
>> + *((const astreamer_ops **) &streamer->base.bbs_ops) =
>> + &astreamer_waldump_ops;
>> +
>> + streamer->privateInfo = privateInfo;
>> +
>> + return &streamer->base;
>> +}
>> ```
>>
>> This function allocates memory for streamer but only returns &streamer->base, so memory of streamer is leaked.
>>
>
> May I know why you think there would be a memory leak? I believe the
> address of the structure is the same as the address of its first
> member, base. I am returning base because the goal is to return a
> generic astreamer type, which is the standard approach used in other
> archive streamer code.
Ah… Got it.
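For anyone following along, the point about the first member can be checked in isolation. The sketch below is not the astreamer code: base_streamer, derived_streamer, and the three helper functions are hypothetical stand-ins for astreamer and astreamer_waldump. It only shows that a pointer to the first member of a struct has the same address as the struct itself, so returning it does not lose the allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the generic astreamer type. */
typedef struct base_streamer
{
	const char *name;
} base_streamer;

/* Hypothetical stand-in for astreamer_waldump. */
typedef struct derived_streamer
{
	base_streamer base;			/* must stay the first member */
	int			private_value;
} derived_streamer;

/* Allocate the derived struct but hand back the generic type. */
static base_streamer *
derived_streamer_new(int value)
{
	derived_streamer *d = calloc(1, sizeof(derived_streamer));

	if (d == NULL)
		abort();
	d->base.name = "derived";
	d->private_value = value;
	return &d->base;			/* same address as d */
}

/* Recover the derived struct from the generic pointer. */
static int
derived_streamer_value(base_streamer *b)
{
	/* Valid only because base is the first member of derived_streamer. */
	return ((derived_streamer *) b)->private_value;
}

/* Freeing through the base pointer releases the whole allocation. */
static void
derived_streamer_free(base_streamer *b)
{
	free(b);
}
```

Because the base pointer and the allocation share one address, freeing through the base pointer releases everything, which is why returning &streamer->base does not leak.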
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-02-10 09:36 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-02-10 09:36 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
>
> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
> >
> > On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
> > > In the attached version, I am using the WAL segment name as the hash
> > > key, which is much more straightforward. I have rewritten
> > > read_archive_wal_page(), and it looks much cleaner than before. The
> > > logic to discard irrelevant WAL files is still within
> > > get_archive_wal_entry. I added an explanation for setting cur_wal to
> > > NULL, which is now handled in the separate function I mentioned
> > > previously.
> > >
> > > Kindly have a look at the attached version; let me know if you are
> > > still not happy with the current approach for filtering/discarding
> > > irrelevant WAL segments. It isn't much different from the previous
> > > version, but I have tried to keep it in a separate routine for better
> > > code readability, with comments to make it easier to understand. I
> > > also added a comment for ArchivedWALFile.
> >
> > I feel like the division of labor between get_archive_wal_entry() and
> > read_archive_wal_page() is odd. I noticed this in the last version,
> > too, and it still seems to be the case. get_archive_wal_entry() first
> > calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> > If it doesn't, it loops until an entry for the requested file shows up
> > and then returns it. Then control returns to read_archive_wal_page()
> > which loops some more until we have all the data we need for the
> > requested file. But it seems odd to me to have two separate loops
> > here. I think that the first loop is going to call read_archive_file()
> > until we find the beginning of the file that we care about and then
> > the second one is going to call read_archive_file() some more until we
> > have read enough of it to satisfy the request. It feels odd to me to
> > do it that way, as if we told somebody to first wait until 9 o'clock
> > and then wait another 30 minutes, instead of just telling them to wait
> > until 9:30. I realize it's not quite the same thing, because apart
> > from calling read_archive_file(), the two loops do different things,
> > but I still think it looks odd.
> >
> > + /*
> > + * Ignore if the timeline is different or the current segment is not
> > + * the desired one.
> > + */
> > + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> > + if (privateInfo->timeline != curSegTimeline ||
> > + privateInfo->startSegNo > curSegNo ||
> > + privateInfo->endSegNo < curSegNo ||
> > + segno > curSegNo)
> > + {
> > + free_archive_wal_entry(entry->fname, privateInfo);
> > + continue;
> > + }
> >
> > The comment doesn't match the code. If it did, the test would be
> > (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
> > instead the segno test is > rather than !=, and the checks against
> > startSegNo and endSegNo aren't explained at all. I think I understand
> > why the segno test uses > rather than !=, but it's the point of the
> > comment to explain things like that, rather than leaving the reader to
> > guess. And I don't know why we also need to test startSegNo and
> > endSegNo.
> >
> > I also wonder what the point is of doing XLogFromFileName() on the
> > fname provided by the caller and then again on entry->fname. Couldn't
> > you just compare the strings?
> >
> > Again, the division of labor is really odd here. It's the job of
> > astreamer_waldump_content() to skip things that aren't WAL files at
> > all, but it's the job of get_archive_wal_entry() to skip things that
> > are WAL files but not the one we want. I disagree with putting those
> > checks in completely separate parts of the code.
> >
>
> Keeping the timeline and segment start-end range checks inside the
> archive streamer creates a circular dependency that cannot be resolved
> without a 'dirty hack'. We must read the first page of the first
> available WAL file to determine the wal_segment_size before we can
> calculate the target segment range. Moving the checks inside the
> streamer would make it impossible to process that initial file, because
> the filtering parameters would still be unknown at that point, so the
> check would somehow have to be skipped for the first read. And what if
> we later realize that the first WAL file, which was streamed without
> that check, is irrelevant and falls outside the start-end segment
> range?
>
Please have a look at the attached version, specifically patch 0005.
I have moved the WAL file filtering check out of get_archive_wal_entry()
and into astreamer_waldump_content(). The check is skipped during the
initial read in init_archive_reader(), which instead performs it
explicitly once it has determined the WAL segment size and the start/end
segments.
To make the WAL segment size accessible inside
astreamer_waldump_content(), I have moved the segment size variable into
the XLogDumpPrivate structure in a separate patch, 0004.
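To illustrate why the check has to be skipped on the initial read, here is a rough sketch, not the code from the patch. wal_filter, parse_wal_name(), and wal_segment_wanted() are hypothetical names; the name parsing mirrors what XLogFromFileName() does, and the start_segno == 0 convention follows the READ_ANY_WAL macro in 0005. The segment number derived from a file name depends on the segment size, so no range check is possible until that size is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical filter state; the field names are illustrative only. */
typedef struct wal_filter
{
	uint32_t	timeline;
	uint64_t	start_segno;	/* 0 means "accept any WAL file" */
	uint64_t	end_segno;
	int			segsize;		/* bytes per WAL segment, 0 if unknown */
} wal_filter;

/*
 * Parse a 24-hex-digit WAL file name into timeline and segment number.
 * The segment number cannot be computed without the segment size.
 */
static bool
parse_wal_name(const char *fname, int segsize,
			   uint32_t *timeline, uint64_t *segno)
{
	unsigned int tli;
	unsigned int log;
	unsigned int seg;
	uint64_t	segs_per_id;

	if (segsize <= 0 || strlen(fname) != 24 ||
		strspn(fname, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(fname, "%08X%08X%08X", &tli, &log, &seg) != 3)
		return false;

	segs_per_id = UINT64_C(0x100000000) / (uint64_t) segsize;
	*timeline = tli;
	*segno = (uint64_t) log * segs_per_id + seg;
	return true;
}

/* Decide whether an archive member should be kept. */
static bool
wal_segment_wanted(const wal_filter *f, const char *fname)
{
	uint32_t	tli;
	uint64_t	segno;

	/* Initial read: segment size and range are not known yet. */
	if (f->start_segno == 0)
		return true;
	if (!parse_wal_name(fname, f->segsize, &tli, &segno))
		return false;
	return tli == f->timeline &&
		segno >= f->start_segno && segno <= f->end_segno;
}
```

The same file name maps to different segment numbers under different segment sizes (000000010000000100000002 is segment 258 with 16MB segments but 4098 with 1MB segments), which is why the first file has to be accepted unfiltered and checked afterwards.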
Regards,
Amul
Attachments:
[application/x-patch] v12-0001-Refactor-pg_waldump-Move-some-declarations-to-ne.patch (2.2K, 2-v12-0001-Refactor-pg_waldump-Move-some-declarations-to-ne.patch)
From 0731db48bb8d154aa72d2c956dec95a8127ae07d Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v12 1/9] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 9 +--------
src/bin/pg_waldump/pg_waldump.h | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..4b7411a6498 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..b88543856e5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v12-0002-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch (2.4K, 3-v12-0002-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch)
From 09f887f7f6b4c142b527dc6410bc781884646681 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:38:16 +0530
Subject: [PATCH v12 2/9] Refactor: pg_waldump: Separate logic used to
calculate the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 43 +++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4b7411a6498..958a71a01cf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
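The three outcomes of required_read_len() above can be tried outside pg_waldump. This is a minimal sketch under assumptions: SKETCH_BLCKSZ is a hypothetical stand-in for XLOG_BLCKSZ (assumed to be 8192 here), uint64_t stands in for XLogRecPtr, and an endptr of 0 plays the role of an invalid end point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192		/* assumed page size, for illustration */

/* Hypothetical subset of XLogDumpPrivate. */
typedef struct read_limit
{
	uint64_t	endptr;			/* 0 means "no end point given" */
	bool		endptr_reached;
} read_limit;

/*
 * Return the number of bytes to read for the page at page_ptr, or -1 if
 * the end point does not leave room for req_len bytes.
 */
static int
sketch_read_len(read_limit *lim, uint64_t page_ptr, int req_len)
{
	/* No end point: always read a full page. */
	if (lim->endptr == 0)
		return SKETCH_BLCKSZ;

	/* The whole page lies before the end point. */
	if (page_ptr + SKETCH_BLCKSZ <= lim->endptr)
		return SKETCH_BLCKSZ;

	/* Only part of the page lies before the end point. */
	if (page_ptr + (uint64_t) req_len <= lim->endptr)
		return (int) (lim->endptr - page_ptr);

	/* The request would cross the end point. */
	lim->endptr_reached = true;
	return -1;
}
```

With endptr = 20000, a request at page 16384 for 100 bytes returns 3616 (the bytes remaining before the end point), while asking for 4000 bytes there returns -1 and sets endptr_reached.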
[application/x-patch] v12-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch (5.6K, 4-v12-0003-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 05d2c4218d9c3496878f342c9c32ff5148d3d6a4 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 11:06:05 +0530
Subject: [PATCH v12 3/9] Refactor: pg_waldump: Restructure TAP tests.
Restructured tests that do not have a WAL file argument to run within
a loop, facilitating their re-execution for decoding WAL from tar
archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
1 file changed, 67 insertions(+), 56 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..3288fadcf48 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
'--start' => $start_lsn,
'--end' => $end_lsn,
@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump('--path' => $path, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump('--path' => $path,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v12-0004-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch (5.1K, 5-v12-0004-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch)
From c00a6728cde927f4bb092a95694d1fcd2c17205f Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 4 Feb 2026 15:31:51 +0530
Subject: [PATCH v12 4/9] Refactor: pg_waldump: Move WAL segment size to
XLogDumpPrivate.
Relocate the WAL segment size variable to the XLogDumpPrivate
structure and rename it to segsize for consistency. This change is
required to make the segment size accessible to the archive streamer
code, where passing it as a function argument is not feasible.
---
src/bin/pg_waldump/pg_waldump.c | 26 +++++++++++++-------------
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 958a71a01cf..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -811,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -865,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1138,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1159,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1175,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1190,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1200,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1213,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1234,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index b88543856e5..4f1b2ab668b 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -17,6 +17,7 @@
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
+ int segsize;
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
--
2.47.1
[application/x-patch] v12-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch (42.3K, 6-v12-0005-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From d1bb58eb796b8d1d25eb836c21dd874cb46f6359 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 10 Feb 2026 11:42:36 +0530
Subject: [PATCH v12 5/9] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 669 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 242 +++++++---
src/bin/pg_waldump/pg_waldump.h | 45 ++
src/bin/pg_waldump/t/001_basic.pl | 83 +++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 986 insertions(+), 75 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..27a5a5c6d5d
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,669 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+
+ int read_len; /* total bytes of a WAL read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+bool
+is_archive_file(const char *fname, pg_compress_algorithm *compression)
+{
+ int fname_len = strlen(fname);
+
+ /* Determine the compression type from the file name suffix */
+ if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *compression = PG_COMPRESSION_NONE;
+ else if (fname_len > 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *compression = PG_COMPRESSION_GZIP;
+ else if (fname_len > 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *compression = PG_COMPRESSION_GZIP;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *compression = PG_COMPRESSION_LZ4;
+ else if (fname_len > 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *compression = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Create the hash table that stores WAL entries read from the archive;
+ * the initial size is arbitrary.
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL bytes if they are present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push the partial data already copied for the current page
+ * back into the buffer, so that the full page remains available
+ * for re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data; raise an error if it's not the current
+ * segment being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_file != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either successfully read the WAL page or
+ * raise an error, but the caller expects this value to be returned. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Frees the buffer of a WAL entry that is no longer needed and removes the
+ * entry from the hash table. This releases memory and prevents irrelevant WAL
+ * data from accumulating. Additionally, setting cur_file in privateInfo to
+ * NULL when it points at this entry ensures the archive streamer skips
+ * unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (privateInfo->cur_file == NULL)
+ continue;
+
+ entry = privateInfo->cur_file;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
+ fname, entry->fname);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(count * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ free(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ /* The existing entry keeps its own key; release the unused copy */
+ free(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* Canonicalization may have shortened the path; recheck its length */
+ if (pathlen < XLOG_FNAME_LEN)
+ return false;
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..6d04462d039 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,67 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo nextSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer space
+ * occupied by the previous segment's data. Since pg_waldump never goes
+ * back to a segment it has moved past, the previous segment's buffered
+ * data will not be needed again.
+ */
+ nextSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+ }
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +838,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +871,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +931,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1010,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,10 +1174,19 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (is_archive_file(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1128,6 +1204,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1225,76 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && is_archive_file(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1212,12 +1306,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory means the archive file is located in
+ * pg_waldump's current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1363,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1447,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 4f1b2ab668b..02da2c43b08 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,11 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +26,46 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive streamer has read,
+ * including the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * recomputing them for each archive member would be wasteful, so they are
+ * cached here and used to filter out irrelevant WAL segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern bool is_archive_file(const char *fname,
+ pg_compress_algorithm *compression);
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 3288fadcf48..cae543c8990 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, @opts) = @_;
my ($stdout, $stderr);
@@ -243,6 +246,7 @@ sub test_pg_waldump
'pg_waldump',
'--start' => $start_lsn,
'--end' => $end_lsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -254,11 +258,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -267,6 +310,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 3
+ if $scenario->{'is_archive'} && (!defined $tar || $tar eq '');
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -298,38 +354,41 @@ for my $scenario (@scenarios)
qr/error: error in WAL record at/,
'errors are shown with --quiet');
- @lines = test_pg_waldump('--path' => $path);
+ my @lines = test_pg_waldump($path);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
- @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+ @lines = test_pg_waldump($path, '--limit' => 6);
is(@lines, 6, 'limit option observed');
- @lines = test_pg_waldump('--path' => $path, '--fullpage');
+ @lines = test_pg_waldump($path, '--fullpage');
is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
- @lines = test_pg_waldump('--path' => $path, '--stats');
+ @lines = test_pg_waldump($path, '--stats');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--stats=record');
+ @lines = test_pg_waldump($path, '--stats=record');
like($lines[0], qr/WAL statistics/, "statistics on stdout");
is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
- @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+ @lines = test_pg_waldump($path, '--rmgr' => 'Btree');
is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
- @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+ @lines = test_pg_waldump($path, '--fork' => 'init');
is(grep(!/fork init/, @lines), 0, 'only init fork lines');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
0, 'only lines for selected relation');
- @lines = test_pg_waldump('--path' => $path,
+ @lines = test_pg_waldump($path,
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9f5ee8fd482..2cd87de84ee 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -144,6 +144,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3506,6 +3508,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v12-0006-pg_waldump-Remove-the-restriction-on-the-order-o.patch (12.8K, 7-v12-0006-pg_waldump-Remove-the-restriction-on-the-order-o.patch)
From 77e29d3162afec46d9ed9f3d592bdbeee6347b37 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 27 Jan 2026 15:38:34 +0530
Subject: [PATCH v12 6/9] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files were
not in the required sequence. With this patch, decoding continues
instead. Any WAL file that is out of order is written to a
temporary location, from which it is read later. Once a temporary
file has been read, it is removed.
---
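Editor's illustration, not part of the patch: the out-of-order handling
described above can be sketched as a small simulation. Segments that the
archive delivers before they are requested are parked in a set (standing in
for the temporary files) and served from there later. The names SpillSet,
spill_take, fetch_segment and MAX_SPILLED are hypothetical; the real patch
works on astreamer callbacks and StringInfo buffers rather than name arrays.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SPILLED 16			/* illustrative bound, not from the patch */

/* Hypothetical stand-in for the segments parked in temporary files. */
typedef struct SpillSet
{
	const char *names[MAX_SPILLED];
	int			count;
} SpillSet;

/* Remove "name" from the set; returns true if it was present. */
bool
spill_take(SpillSet *set, const char *name)
{
	for (int i = 0; i < set->count; i++)
	{
		if (strcmp(set->names[i], name) == 0)
		{
			/* Fill the hole with the last element; order is irrelevant. */
			set->names[i] = set->names[--set->count];
			return true;
		}
	}
	return false;
}

/*
 * Locate "wanted", first among segments parked earlier, then by advancing
 * through the archive members from *pos. Members passed over on the way are
 * parked for a later request. Returns false if the segment is absent or the
 * illustrative set is full.
 */
bool
fetch_segment(const char *const *archive, int nmembers, int *pos,
			  SpillSet *spilled, const char *wanted)
{
	/* A segment seen earlier is served from its parked copy, exactly once. */
	if (spill_take(spilled, wanted))
		return true;

	while (*pos < nmembers)
	{
		const char *member = archive[(*pos)++];

		if (strcmp(member, wanted) == 0)
			return true;

		/* Not the one asked for: park it for a later request. */
		if (spilled->count == MAX_SPILLED)
			return false;
		spilled->names[spilled->count++] = member;
	}
	return false;
}
```

Taking a parked segment out of the set on first use mirrors the commit
message's statement that a temporary file is removed once it has been read.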
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/archive_waldump.c | 171 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 31 ++++-
src/bin/pg_waldump/pg_waldump.h | 3 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 196 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..27adf77755c 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 27a5a5c6d5d..b1353088c4a 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
/*
* Check if the start segment number is zero; this indicates a request to read
* any WAL file.
@@ -57,6 +61,8 @@ typedef struct ArchivedWALFile
const char *fname; /* hash key: WAL segment name */
StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
int read_len; /* total bytes of a WAL read from archive */
} ArchivedWALFile;
@@ -84,6 +90,11 @@ static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo,
int WalSegSz);
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -137,7 +148,9 @@ is_archive_file(const char *fname, pg_compress_algorithm *compression)
/*
* Initializes the tar archive reader, creates a hash table for WAL entries,
* checks for existing valid WAL segments in the archive file and retrieves the
- * segment size, and sets up filters for relevant entries.
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -230,6 +243,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
}
/*
@@ -396,6 +416,17 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
destroyStringInfo(entry->buf);
entry->buf = NULL;
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
/* Set cur_file to NULL if it matches the entry being ignored */
if (privateInfo->cur_file == entry)
privateInfo->cur_file = NULL;
@@ -407,12 +438,16 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file, which then populates the
* entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a
+ * WAL file from the archive that is not currently needed, that WAL data is
+ * written to a temporary file.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
int WalSegSz)
{
ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
@@ -426,28 +461,59 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
*/
while (1)
{
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Archived streamer is reading a non-WAL file or an irrelevant WAL
* file.
*/
- if (privateInfo->cur_file == NULL)
+ if (entry == NULL)
continue;
- entry = privateInfo->cur_file;
-
/* Found the required entry */
if (strcmp(fname, entry->fname) == 0)
return entry;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
- fname, entry->fname);
- exit(1);
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -485,7 +551,88 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory in which to store spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to extract WAL file
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 6d04462d039..faf300af2be 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -478,25 +478,46 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return -1;
/*
- * If the target page is in a different segment, free the buffer space
- * occupied by the previous segment data. Since pg_waldump never requests
- * the same WAL bytes twice, moving to a new segment implies the previous
- * buffer's data and that segment will not be needed again.
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check for the next required WAL segment's physical existence
+ * in the temporary directory first before invoking the archive streamer.
*/
nextSegNo = state->seg.ws_segno;
if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
{
char fname[MAXFNAMELEN];
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
free_archive_wal_entry(fname, private);
XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
state->seg.ws_tli = private->timeline;
state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
}
- /* Read the WAL page from the archive streamer */
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff,
WalSegSz);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 02da2c43b08..476f74e2846 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -18,6 +18,9 @@
struct ArchivedWALFile;
struct ArchivedWAL_hash;
+/* Temporary directory */
+extern char *TmpWalSegDir;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index cae543c8990..55a21c71208 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -272,7 +273,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
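Editor's illustration, not part of the patch: the temporary-directory rule
from patch 0006 (use TMPDIR if set, otherwise the directory holding the
archive) can be sketched as a pure function that only builds the template
string. build_tmpdir_template is a hypothetical name, and the fixed-size
output buffer is an assumption; the patch itself uses psprintf() and then
passes the result to mkdtemp().

```c
#include <stdio.h>

/*
 * Build the "waldump_tmp-XXXXXX" template under tmpdir_env if it is not
 * NULL, otherwise under archive_dir. Returns 0 on success and -1 if the
 * result does not fit in dst.
 */
int
build_tmpdir_template(char *dst, size_t dstlen,
					  const char *tmpdir_env, const char *archive_dir)
{
	int			n;

	n = snprintf(dst, dstlen, "%s/waldump_tmp-XXXXXX",
				 tmpdir_env != NULL ? tmpdir_env : archive_dir);

	/* snprintf reports the length it wanted; reject truncated output. */
	return (n >= 0 && (size_t) n < dstlen) ? 0 : -1;
}
```

A caller would typically pass getenv("TMPDIR") as tmpdir_env and hand the
filled buffer to mkdtemp(), which replaces the trailing XXXXXX in place.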
[application/x-patch] v12-0007-pg_verifybackup-Delay-default-WAL-directory-prep.patch (1.7K, 8-v12-0007-pg_verifybackup-Delay-default-WAL-directory-prep.patch)
From 8cb70851571e268a6be7763c845d79b9a8b50cc0 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v12 7/9] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index f9f2d457f2f..ab01c4d003a 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v12-0008-pg_verifybackup-Rename-the-wal-directory-switch-.patch (5.9K, 9-v12-0008-pg_verifybackup-Rename-the-wal-directory-switch-.patch)
From c2b0309deeb401d1cfd0ed6140d0a1d37bdd7e27 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v12 8/9] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index ab01c4d003a..3103d36f1b9 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..8ad2234453d 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v12-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-form.patch (9.9K, 10-v12-0009-pg_verifybackup-enabled-WAL-parsing-for-tar-form.patch)
From 8abeadee490f1d7d751d71c3a1490986cbe26f36 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v12 9/9] pg_verifybackup: enable WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
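Editor's illustration, not part of the patch: the order in which this patch
picks a default WAL location can be sketched as one small function. An
explicit -w/--wal-path wins; a plain-format backup uses its pg_wal directory;
a tar-format backup prefers a separate WAL archive and falls back to the base
archive. choose_wal_path and its parameter names are hypothetical, and the
NULL return stands in for the "WAL archive not found" error path.

```c
#include <stddef.h>

/*
 * Return the path pg_waldump should be pointed at, or NULL if nothing
 * suitable is known. "format" is 'p' for plain and 't' for tar, as in
 * verifier_context.
 */
const char *
choose_wal_path(char format,
				const char *user_wal_path,
				const char *plain_wal_dir,
				const char *wal_archive_path,
				const char *base_archive_path)
{
	/* An explicit -w/--wal-path always takes precedence. */
	if (user_wal_path != NULL)
		return user_wal_path;

	/* Plain format: WAL lives in <backup>/pg_wal. */
	if (format == 'p')
		return plain_wal_dir;

	/* Tar format: prefer a dedicated WAL archive if one was seen. */
	if (wal_archive_path != NULL)
		return wal_archive_path;

	/* Otherwise the WAL is most likely inside the base archive. */
	return base_archive_path;
}
```

Returning NULL rather than exiting keeps the sketch testable; in the patch
the corresponding branch logs an error with a hint and calls exit(1).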
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 3103d36f1b9..cc492728ae8 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-02-18 06:58 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
From: Amul Sul @ 2026-02-18 06:58 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
>
> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
> >
> > On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
> > >
> > > On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
> > > > In the attached version, I am using the WAL segment name as the hash
> > > > key, which is much more straightforward. I have rewritten
> > > > read_archive_wal_page(), and it looks much cleaner than before. The
> > > > logic to discard irrelevant WAL files is still within
> > > > get_archive_wal_entry. I added an explanation for setting cur_wal to
> > > > NULL, which is now handled in the separate function I mentioned
> > > > previously.
> > > >
> > > > Kindly have a look at the attached version; let me know if you are
> > > > still not happy with the current approach for filtering/discarding
> > > > irrelevant WAL segments. It isn't much different from the previous
> > > > version, but I have tried to keep it in a separate routine for better
> > > > code readability, with comments to make it easier to understand. I
> > > > also added a comment for ArchivedWALFile.
> > >
> > > I feel like the division of labor between get_archive_wal_entry() and
> > > read_archive_wal_page() is odd. I noticed this in the last version,
> > > too, and it still seems to be the case. get_archive_wal_entry() first
> > > calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> > > If it doesn't, it loops until an entry for the requested file shows up
> > > and then returns it. Then control returns to read_archive_wal_page()
> > > which loops some more until we have all the data we need for the
> > > requested file. But it seems odd to me to have two separate loops
> > > here. I think that the first loop is going to call read_archive_file()
> > > until we find the beginning of the file that we care about and then
> > > the second one is going to call read_archive_file() some more until we
> > > have read enough of it to satisfy the request. It feels odd to me to
> > > do it that way, as if we told somebody to first wait until 9 o'clock
> > > and then wait another 30 minutes, instead of just telling them to wait
> > > until 9:30. I realize it's not quite the same thing, because apart
> > > from calling read_archive_file(), the two loops do different things,
> > > but I still think it looks odd.
> > >
> > > + /*
> > > + * Ignore if the timeline is different or the current segment is not
> > > + * the desired one.
> > > + */
> > > + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> > > + if (privateInfo->timeline != curSegTimeline ||
> > > + privateInfo->startSegNo > curSegNo ||
> > > + privateInfo->endSegNo < curSegNo ||
> > > + segno > curSegNo)
> > > + {
> > > + free_archive_wal_entry(entry->fname, privateInfo);
> > > + continue;
> > > + }
> > >
> > > The comment doesn't match the code. If it did, the test would be
> > > (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
> > > instead the segno test is > rather than !=, and the checks against
> > > startSegNo and endSegNo aren't explained at all. I think I understand
> > > why the segno test uses > rather than !=, but it's the point of the
> > > comment to explain things like that, rather than leaving the reader to
> > > guess. And I don't know why we also need to test startSegNo and
> > > endSegNo.
> > >
> > > I also wonder what the point is of doing XLogFromFileName() on the
> > > fname provided by the caller and then again on entry->fname. Couldn't
> > > you just compare the strings?
> > >
> > > Again, the division of labor is really odd here. It's the job of
> > > astreamer_waldump_content() to skip things that aren't WAL files at
> > > all, but it's the job of get_archive_wal_entry() to skip things that
> > > are WAL files but not the one we want. I disagree with putting those
> > > checks in completely separate parts of the code.
> > >
> >
> > Keeping the timeline and segment start-end range checks inside the
> > archive streamer creates a circular dependency that cannot be resolved
> > without a 'dirty hack'. We must read the first available WAL file page
> > to determine the wal_segment_size before we can calculate the target
> > segment range. Moving the checks inside the streamer would make it
> > impossible to process that initial file, because the necessary filtering
> > parameters would still be unknown, so the checks would somehow have to
> > be skipped for the first read. What if we later realize that the first
> > WAL file, which was streamed by skipping that check, is irrelevant and
> > doesn't fall within the start-end segment range?
> >
>
> Please have a look at the attached version, specifically patch 0005.
> I have moved the WAL file filtering check from get_archive_wal_entry()
> into astreamer_waldump_content(). This check is skipped during
> the initial read in init_archive_reader(), which instead performs it
> explicitly once it has determined the WAL segment size and the start/end
> segments.
>
> To access the WAL segment size inside astreamer_waldump_content(), I
> have moved the WAL segment size variable into the XLogDumpPrivate
> structure in the separate 0004 patch.
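To make the ordering problem discussed above concrete, here is a rough
standalone sketch of a filter that lets everything through until the
segment size is known and applies the timeline/range check afterwards.
It is not the code from the patch: SketchReader, sketch_lsn_to_segno()
and sketch_segment_wanted() are hypothetical names used only for
illustration, and the end segment is treated as inclusive for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; field names are illustrative only. */
typedef struct SketchReader
{
	uint32_t	timeline;		/* timeline we want to decode */
	int			segsize;		/* 0 until the first WAL page has been read */
	uint64_t	start_lsn;		/* requested start location */
	uint64_t	end_lsn;		/* requested end location */
} SketchReader;

/* Segment number containing lsn: plain division by the segment size. */
static uint64_t
sketch_lsn_to_segno(uint64_t lsn, int segsize)
{
	assert(segsize > 0);
	return lsn / (uint64_t) segsize;
}

/*
 * Decide whether a WAL segment found in the archive should be kept.
 *
 * While the segment size is still unknown the range cannot be computed,
 * so every segment is accepted; the caller is expected to re-check the
 * first segment once the size has been determined.
 */
static bool
sketch_segment_wanted(const SketchReader *r, uint32_t tli, uint64_t segno)
{
	assert(r != NULL);

	if (r->segsize == 0)
		return true;

	return tli == r->timeline &&
		segno >= sketch_lsn_to_segno(r->start_lsn, r->segsize) &&
		segno <= sketch_lsn_to_segno(r->end_lsn, r->segsize);
}
```

With a 16MB segment size, a start location of 0/3000028 and an end
location of 0/5000000, this sketch keeps segments 3 through 5 on the
requested timeline and discards everything else.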
Attached is an updated version including the aforesaid changes. It
includes a new refactoring patch (0001) that moves the logic for
identifying tar archives and their compression types from
pg_basebackup and pg_verifybackup into a separate, reusable function,
per a suggestion from Euler [1]. Additionally, I have added a test
for the contrecord decoding to the main patch (now 0006).
[1] http://postgr.es/m/[email protected]
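For anyone skimming the thread without opening 0001, the suffix matching
it centralizes can be sketched standalone roughly as below. This is only
an illustration under assumptions: the enum and the function names
(sketch_compress_algorithm, has_suffix, sketch_parse_tar_suffix) are
stand-ins made up for this example, not the PostgreSQL definitions; the
real helper is parse_tar_compress_algorithm() in the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the real enum. */
typedef enum
{
	SKETCH_COMPRESSION_NONE,
	SKETCH_COMPRESSION_GZIP,
	SKETCH_COMPRESSION_LZ4,
	SKETCH_COMPRESSION_ZSTD
} sketch_compress_algorithm;

/* Return true if fname ends with suffix. */
static bool
has_suffix(const char *fname, const char *suffix)
{
	size_t		flen = strlen(fname);
	size_t		slen = strlen(suffix);

	return flen >= slen && strcmp(fname + flen - slen, suffix) == 0;
}

/*
 * Classify an archive name by its extension.  Returns true and sets
 * *algorithm for a recognized tar suffix; returns false otherwise and
 * leaves *algorithm untouched.
 */
static bool
sketch_parse_tar_suffix(const char *fname,
						sketch_compress_algorithm *algorithm)
{
	assert(fname != NULL && algorithm != NULL);

	if (has_suffix(fname, ".tar"))
		*algorithm = SKETCH_COMPRESSION_NONE;
	else if (has_suffix(fname, ".tgz") || has_suffix(fname, ".tar.gz"))
		*algorithm = SKETCH_COMPRESSION_GZIP;
	else if (has_suffix(fname, ".tar.lz4"))
		*algorithm = SKETCH_COMPRESSION_LZ4;
	else if (has_suffix(fname, ".tar.zst"))
		*algorithm = SKETCH_COMPRESSION_ZSTD;
	else
		return false;

	return true;
}
```

A caller can then treat "is this a tar archive" and "is it compressed"
as one lookup: a true return with SKETCH_COMPRESSION_NONE means a plain
tar, any other algorithm means a compressed tar, and false means the
file is not a tar archive at all.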
Regards,
Amul
Attachments:
[application/x-patch] v13-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch (6.7K, 2-v13-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch)
From 6e4331bcbb98160365fe02502ae0a5cd2aea1726 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v13 01/10] Refactor: Move tar archive parsing into a common
location.
pg_basebackup and pg_verifybackup both require logic to identify tar
files and determine their compression types. Similar functionality
will be needed for pg_waldump when it gets the capability to decode
WAL files from tar archives. Moving this logic to a common location
allows for reuse and prevents code duplication.
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 1e3a8203f77..8911b8b921d 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and, if so, its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index f9f2d457f2f..5ddc4c33feb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..f117e21237f 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..50f21656b88 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
[application/x-patch] v13-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch (2.2K, 3-v13-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch)
From 17fe3b9a8b649f0146fe237f8ed2960246f26907 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v13 02/10] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 9 +--------
src/bin/pg_waldump/pg_waldump.h | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..4b7411a6498 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..b88543856e5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v13-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch (2.4K, 4-v13-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch)
From 0184347f5582b5973b80056d03eb6cacc85725c2 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:38:16 +0530
Subject: [PATCH v13 03/10] Refactor: pg_waldump: Separate logic used to
calculate the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 43 +++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4b7411a6498..958a71a01cf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
[application/x-patch] v13-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch (6.6K, 5-v13-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From b5298076adc315659a6cde3ae8f6883ee025f3d6 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v13 04/10] Refactor: pg_waldump: Restructure TAP tests.
Restructured tests that do not have a WAL file argument to run within
a loop, facilitating their re-execution for decoding WAL from tar
archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 140 +++++++++++++++++-------------
1 file changed, 79 insertions(+), 61 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..f12ba52cbfc 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +205,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +224,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +239,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -288,38 +261,83 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v13-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch (5.1K, 6-v13-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch)
From 6b38a9b7bd81a15ba8647f51d53156b8a9dd405c Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 4 Feb 2026 15:31:51 +0530
Subject: [PATCH v13 05/10] Refactor: pg_waldump: Move WAL segment size to
XLogDumpPrivate.
Relocate the WAL segment size variable to the XLogDumpPrivate
structure and rename it to segsize for consistency. This change is
required to make the segment size accessible to the archive streamer
code, where passing it as a function argument is not feasible.
---
src/bin/pg_waldump/pg_waldump.c | 26 +++++++++++++-------------
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 958a71a01cf..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -811,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -865,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1138,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1159,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1175,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1190,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1200,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1213,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1234,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index b88543856e5..4f1b2ab668b 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -17,6 +17,7 @@
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
+ int segsize;
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
--
2.47.1
[application/x-patch] v13-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch (41.7K, 7-v13-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From fa0872a5c7714c1958780a95ad0091f0d0c5f94d Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 10 Feb 2026 11:42:36 +0530
Subject: [PATCH v13 06/10] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 638 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 255 ++++++++---
src/bin/pg_waldump/pg_waldump.h | 43 ++
src/bin/pg_waldump/t/001_basic.pl | 105 ++++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 997 insertions(+), 66 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..ecc022a81a2
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,638 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+
+ int read_len; /* total bytes of this segment read so far */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push the partial data already copied for the current page
+ * back into the buffer, so that the full page remains available
+ * for re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data; raise an error if it's not the current
+ * segment being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_file != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * We should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either read the WAL page successfully or
+ * raise an error, but the caller expects the byte count. The routine
+ * that reads WAL pages from physical WAL files follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Frees the buffer of a WAL entry that is no longer needed and removes the
+ * entry from the hash table. This releases memory and prevents the
+ * accumulation of irrelevant WAL data. Additionally, setting cur_file within
+ * privateInfo to NULL when it points at this entry ensures the archive
+ * streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (privateInfo->cur_file == NULL)
+ continue;
+
+ entry = privateInfo->cur_file;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
+ fname, entry->fname);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(count * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0(sizeof(astreamer_waldump));
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ free(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The member name may include a directory; check only the file name */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..90fc13f3609 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,80 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer space
+ * occupied by the previous segment data. Since pg_waldump never requests
+ * the same WAL bytes twice, moving to a new segment implies the previous
+ * buffer's data and that segment will not be needed again.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete any
+ * entries that might be requested again once the decoding loop starts.
+ * For more details, see the comments in read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+ }
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +851,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +884,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +944,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1023,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,10 +1187,19 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1128,6 +1217,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1238,76 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1212,12 +1319,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory of the pg_waldump execution
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1376,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1460,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 4f1b2ab668b..90fe96840e2 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,11 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +26,44 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f12ba52cbfc..9ab7457e9e2 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -162,6 +165,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate a record that continues into the next WAL segment (contrecord)
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -259,11 +298,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -272,6 +350,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -305,9 +396,14 @@ for my $scenario (@scenarios)
test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
is(@lines, 6, 'limit option observed');
@@ -337,6 +433,9 @@ for my $scenario (@scenarios)
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 241945734ec..18ab2e848b6 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -145,6 +145,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3511,6 +3513,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
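The contrecord test above fills the current segment until fewer than two blocks remain, then emits a large message so that the record continues into the next segment. The arithmetic behind that check can be sketched in C as follows. This is only an illustration: the function names are hypothetical, LSNs are treated as plain 64-bit byte positions, and page headers are ignored, so it is not how xlogreader itself accounts for space.

```c
#include <stdint.h>

/* Bytes left before the end of the WAL segment that contains lsn. */
static uint64_t
bytes_left_in_segment(uint64_t lsn, uint64_t wal_segsize)
{
	return wal_segsize - (lsn % wal_segsize);
}

/* Number of the WAL segment that contains lsn. */
static uint64_t
segment_number(uint64_t lsn, uint64_t wal_segsize)
{
	return lsn / wal_segsize;
}

/*
 * Nonzero when a record of record_len bytes starting at lsn cannot fit in
 * the rest of the segment (simplified: page headers are not counted).
 */
static int
record_spans_segments(uint64_t lsn, uint64_t record_len, uint64_t wal_segsize)
{
	return record_len > bytes_left_in_segment(lsn, wal_segsize);
}
```

With a 16MB segment and one 8kB block left, the 617280-byte payload produced by repeat('xyzxz', 123456) does not fit, which is the situation the test sets up before calling pg_logical_emit_message().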
[application/x-patch] v13-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch (13.1K, 8-v13-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch)
download | inline diff:
From c36ba0507caaf04b4377cc83c4e63be187943367 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 27 Jan 2026 15:38:34 +0530
Subject: [PATCH v13 07/10] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if the WAL files
in the archive were not in sequential order. With this patch, decoding
continues: any WAL file that is out of order is written to a temporary
location, from which it is read later. Once a temporary file has been
read, it is removed.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/archive_waldump.c | 171 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 32 ++++-
src/bin/pg_waldump/pg_waldump.h | 3 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 197 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..27adf77755c 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index ecc022a81a2..84fae87492e 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
/*
* Check if the start segment number is zero; this indicates a request to read
* any WAL file.
@@ -57,6 +61,8 @@ typedef struct ArchivedWALFile
const char *fname; /* hash key: WAL segment name */
StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
int read_len; /* total bytes of a WAL read from archive */
} ArchivedWALFile;
@@ -84,6 +90,11 @@ static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo,
int WalSegSz);
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -106,7 +117,9 @@ static const astreamer_ops astreamer_waldump_ops = {
/*
* Initializes the tar archive reader, creates a hash table for WAL entries,
* checks for existing valid WAL segments in the archive file and retrieves the
- * segment size, and sets up filters for relevant entries.
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -199,6 +212,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
}
/*
@@ -365,6 +385,17 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
destroyStringInfo(entry->buf);
entry->buf = NULL;
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
/* Set cur_file to NULL if it matches the entry being ignored */
if (privateInfo->cur_file == entry)
privateInfo->cur_file = NULL;
@@ -376,12 +407,16 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file, which then populates the
* entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
int WalSegSz)
{
ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
@@ -395,28 +430,59 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
*/
while (1)
{
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Archived streamer is reading a non-WAL file or an irrelevant WAL
* file.
*/
- if (privateInfo->cur_file == NULL)
+ if (entry == NULL)
continue;
- entry = privateInfo->cur_file;
-
/* Found the required entry */
if (strcmp(fname, entry->fname) == 0)
return entry;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
- fname, entry->fname);
- exit(1);
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -454,7 +520,88 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory to store out-of-order WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold the spilled WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 90fc13f3609..114969217d8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -478,10 +478,14 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return -1;
/*
- * If the target page is in a different segment, free the buffer space
- * occupied by the previous segment data. Since pg_waldump never requests
- * the same WAL bytes twice, moving to a new segment implies the previous
- * buffer's data and that segment will not be needed again.
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check whether the next required WAL segment already exists in
+ * the temporary directory before invoking the archive streamer.
*/
curSegNo = state->seg.ws_segno;
if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
@@ -497,6 +501,13 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
state->seg.ws_tli = private->timeline;
state->seg.ws_segno = nextSegNo;
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
/*
* If in pre-reading mode (prior to actual decoding), do not delete any
* entries that might be requested again once the decoding loop starts.
@@ -507,9 +518,20 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
free_archive_wal_entry(fname, private);
}
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
}
- /* Read the WAL page from the archive streamer */
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff,
WalSegSz);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 90fe96840e2..75ed0f37538 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -18,6 +18,9 @@
struct ArchivedWALFile;
struct ArchivedWAL_hash;
+/* Temporary directory */
+extern char *TmpWalSegDir;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9ab7457e9e2..9854c939007 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -312,7 +313,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
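The spill mechanism in the patch above comes down to two steps: create a private directory (under TMPDIR if set, otherwise next to the archive) and append buffered WAL bytes to a file in it, treating a short write with no errno as out of disk space. A standalone sketch of those two steps, using only libc, might look like this. The names make_spill_dir and spill_write are hypothetical; the patch itself uses psprintf, pg_fatal and friends, and unlike the patch this sketch also treats an empty TMPDIR as unset.

```c
/* Expose mkdtemp() under strict -std= modes on glibc. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a private spill directory named waldump_tmp-XXXXXX. Prefer
 * $TMPDIR; otherwise use fallback_dir. Returns a malloc'd path, or NULL
 * with errno set.
 */
static char *
make_spill_dir(const char *fallback_dir)
{
	const char *base = getenv("TMPDIR");
	size_t		len;
	char	   *path;

	if (base == NULL || base[0] == '\0')
		base = fallback_dir;

	/* sizeof() counts the terminating NUL of the suffix */
	len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
	path = malloc(len);
	if (path == NULL)
		return NULL;
	snprintf(path, len, "%s/waldump_tmp-XXXXXX", base);

	if (mkdtemp(path) == NULL)
	{
		int			save_errno = errno;

		free(path);
		errno = save_errno;
		return NULL;
	}
	return path;
}

/*
 * Append len bytes to fp. Returns 0 on success, -1 with errno set. If the
 * write fails without setting errno, assume the disk is full.
 */
static int
spill_write(FILE *fp, const void *data, size_t len)
{
	if (len == 0)
		return 0;

	errno = 0;
	if (fwrite(data, len, 1, fp) != 1)
	{
		if (errno == 0)
			errno = ENOSPC;
		return -1;
	}
	return 0;
}
```

A caller would create the directory once, open one file per out-of-order segment, call spill_write() each time the in-memory buffer is flushed, and remove the file after the segment has been decoded.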
[application/x-patch] v13-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch (1.7K, 9-v13-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch)
download | inline diff:
From 15ab0e423bffb5985bea622aa7c243cbd3316c6b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v13 08/10] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5ddc4c33feb..04cca3bc0f5 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
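Delaying the default until the backup format is known is what lets the last patch in this series pick a different default for tar-format backups: pg_wal under the backup directory for plain format, otherwise the separate WAL archive if one was found, otherwise the base archive. A minimal sketch of that selection order, with a hypothetical function name and plain malloc in place of psprintf/pstrdup, could be:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of s, or NULL on allocation failure. */
static char *
copy_string(const char *s)
{
	size_t		len = strlen(s) + 1;
	char	   *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Choose where to look for WAL when no explicit path was given.
 * format is 'p' (plain) or 't' (tar); wal_archive and base_archive may be
 * NULL when no such archive was found. Returns a malloc'd path, or NULL
 * when nothing suitable exists and the caller should report an error.
 */
static char *
default_wal_path(char format, const char *backup_dir,
				 const char *wal_archive, const char *base_archive)
{
	if (format == 'p')
	{
		size_t		len = strlen(backup_dir) + sizeof("/pg_wal");
		char	   *path = malloc(len);

		if (path != NULL)
			snprintf(path, len, "%s/pg_wal", backup_dir);
		return path;
	}

	if (wal_archive != NULL)
		return copy_string(wal_archive);
	if (base_archive != NULL)
		return copy_string(base_archive);
	return NULL;
}
```

Because the format is only known after the manifest and backup directory have been examined, this choice has to be made late, just before the WAL ranges are parsed.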
[application/x-patch] v13-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch (5.9K, 10-v13-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch)
download | inline diff:
From c9d1a4072da9bfa132e2e99238ddbb7e318e933b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v13 09/10] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you can specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
also aligns it with the existing manifest-path switch naming.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 04cca3bc0f5..e149ca96050 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1188,7 +1188,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1198,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1366,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..8ad2234453d 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v13-0010-pg_verifybackup-enabled-WAL-parsing-for-tar-form.patch (9.9K, 11-v13-0010-pg_verifybackup-enabled-WAL-parsing-for-tar-form.patch)
download | inline diff:
From 2083da7ca8f5964606a8ec02d58b95ce4e58002e Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v13 10/10] pg_verifybackup: enable WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index e149ca96050..8dee0043bab 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
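
The WAL path fallback that this patch adds to main() can be sketched in isolation as below. This is only an illustration under stated assumptions: sketch_choose_wal_path() and its buffer parameters are hypothetical names, not part of pg_verifybackup, and a NULL return stands in for the "WAL archive not found" error path.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the order of the checks in the patch:
 * plain format uses <backup_directory>/pg_wal, tar format prefers the
 * separate WAL archive and falls back to the base archive.  Returns NULL
 * when nothing usable was found, where the patch reports an error instead.
 */
static const char *
sketch_choose_wal_path(char format, const char *backup_directory,
                       const char *wal_archive_path,
                       const char *base_archive_path,
                       char *buf, size_t buflen)
{
    if (format == 'p')
    {
        /* Plain-format backup: WAL lives in the pg_wal subdirectory. */
        snprintf(buf, buflen, "%s/pg_wal", backup_directory);
        return buf;
    }

    /* Tar-format backup: a dedicated pg_wal archive wins if present. */
    if (wal_archive_path != NULL)
        return wal_archive_path;

    /* Otherwise the WAL is most likely inside the base archive. */
    if (base_archive_path != NULL)
        return base_archive_path;

    return NULL;
}
```

With a tar-format backup that has only base.tar.gz, the sketch returns the base archive path; with neither archive present it returns NULL.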
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-02 13:00 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-02 13:00 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
>
> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
> >
> > On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
> > > >
> > > > On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
> > > > > In the attached version, I am using the WAL segment name as the hash
> > > > > key, which is much more straightforward. I have rewritten
> > > > > read_archive_wal_page(), and it looks much cleaner than before. The
> > > > > logic to discard irrelevant WAL files is still within
> > > > > get_archive_wal_entry. I added an explanation for setting cur_wal to
> > > > > NULL, which is now handled in the separate function I mentioned
> > > > > previously.
> > > > >
> > > > > Kindly have a look at the attached version; let me know if you are
> > > > > still not happy with the current approach for filtering/discarding
> > > > > irrelevant WAL segments. It isn't much different from the previous
> > > > > version, but I have tried to keep it in a separate routine for better
> > > > > code readability, with comments to make it easier to understand. I
> > > > > also added a comment for ArchivedWALFile.
> > > >
> > > > I feel like the division of labor between get_archive_wal_entry() and
> > > > read_archive_wal_page() is odd. I noticed this in the last version,
> > > > too, and it still seems to be the case. get_archive_wal_entry() first
> > > > calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> > > > If it doesn't, it loops until an entry for the requested file shows up
> > > > and then returns it. Then control returns to read_archive_wal_page()
> > > > which loops some more until we have all the data we need for the
> > > > requested file. But it seems odd to me to have two separate loops
> > > > here. I think that the first loop is going to call read_archive_file()
> > > > until we find the beginning of the file that we care about and then
> > > > the second one is going to call read_archive_file() some more until we
> > > > have read enough of it to satisfy the request. It feels odd to me to
> > > > do it that way, as if we told somebody to first wait until 9 o'clock
> > > > and then wait another 30 minutes, instead of just telling them to wait
> > > > until 9:30. I realize it's not quite the same thing, because apart
> > > > from calling read_archive_file(), the two loops do different things,
> > > > but I still think it looks odd.
> > > >
> > > > + /*
> > > > + * Ignore if the timeline is different or the current segment is not
> > > > + * the desired one.
> > > > + */
> > > > + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> > > > + if (privateInfo->timeline != curSegTimeline ||
> > > > + privateInfo->startSegNo > curSegNo ||
> > > > + privateInfo->endSegNo < curSegNo ||
> > > > + segno > curSegNo)
> > > > + {
> > > > + free_archive_wal_entry(entry->fname, privateInfo);
> > > > + continue;
> > > > + }
> > > >
> > > > The comment doesn't match the code. If it did, the test would be
> > > > (privateInfo->timeline != curSegTimeline || segno != curSegno). But
> > > > instead the segno test is > rather than !=, and the checks against
> > > > startSegNo and endSegNo aren't explained at all. I think I understand
> > > > why the segno test uses > rather than !=, but it's the point of the
> > > > comment to explain things like that, rather than leaving the reader to
> > > > guess. And I don't know why we also need to test startSegNo and
> > > > endSegNo.
> > > >
> > > > I also wonder what the point is of doing XLogFromFileName() on the
> > > > fname provided by the caller and then again on entry->fname. Couldn't
> > > > you just compare the strings?
> > > >
> > > > Again, the division of labor is really odd here. It's the job of
> > > > astreamer_waldump_content() to skip things that aren't WAL files at
> > > > all, but it's the job of get_archive_wal_entry() to skip things that
> > > > are WAL files but not the one we want. I disagree with putting those
> > > > checks in completely separate parts of the code.
> > > >
> > >
> > > Keeping the timeline and segment start-end range checks inside the
> > > archive streamer creates a circular dependency that cannot be resolved
> > > without a 'dirty hack'. We must read the first page of the first
> > > available WAL file to determine the wal_segment_size before we can
> > > calculate the target segment range. Moving the checks inside the
> > > streamer would make it impossible to process that initial file, because
> > > the filtering parameters would still be unknown; the checks would
> > > somehow have to be skipped for the first read. What if we later realize
> > > that the first WAL file, which was streamed without that check, is
> > > irrelevant and doesn't fall within the start-end segment range?
> > >
> >
> > Please have a look at the attached version, specifically patch 0005.
> > I have moved the WAL file filtering check from get_archive_wal_entry()
> > into astreamer_waldump_content(). This check is skipped during the
> > initial read in init_archive_reader(), which instead performs it
> > explicitly once it has determined the WAL segment size and the
> > start/end segments.
> >
> > To access the WAL segment size inside astreamer_waldump_content(), I
> > have moved the WAL segment size variable into the XLogDumpPrivate
> > structure in the separate 0004 patch.
>
> Attached is an updated version including the aforesaid changes. It
> includes a new refactoring patch (0001) that moves the logic for
> identifying tar archives and their compression types from
> pg_basebackup and pg_verifybackup into a separate, reusable function,
> per a suggestion from Euler [1]. Additionally, I have added a test
> for the contrecord decoding to the main patch (now 0006).
>
> [1] http://postgr.es/m/[email protected]
>
Rebased against the latest master, fixed typos in code comments, and
replaced palloc0 with palloc0_object.
Regards,
Amul
Attachments:
[application/octet-stream] v14-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch (6.7K, 2-v14-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch)
From 54fd70f2b5df10e6df575b4f85eaecb8a3c1ff94 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v14 01/11] Refactor: Move tar archive parsing into a common
location.
pg_basebackup and pg_verifybackup both require logic to identify tar
files and determine their compression types. Similar functionality
will be needed for pg_waldump when it gets the capability to decode
WAL files from tar archives. Moving this logic to a common location
allows for reuse and prevents code duplication.
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..f117e21237f 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..50f21656b88 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
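
The suffix matching that parse_tar_compress_algorithm() performs can be tried outside the tree with a sketch like the following. The enum and function names here are hypothetical stand-ins (they are not pg_compress_algorithm or the function from the patch), and the has_suffix() helper is an assumption added only to shorten the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the PostgreSQL enum. */
typedef enum
{
    SKETCH_COMPRESSION_NONE,
    SKETCH_COMPRESSION_GZIP,
    SKETCH_COMPRESSION_LZ4,
    SKETCH_COMPRESSION_ZSTD
} sketch_compress_algorithm;

/* Return true if fname ends with suffix. */
static bool
has_suffix(const char *fname, const char *suffix)
{
    size_t      flen = strlen(fname);
    size_t      slen = strlen(suffix);

    return flen >= slen && strcmp(fname + flen - slen, suffix) == 0;
}

/*
 * Map a tar archive name to a compression type by its trailing extension.
 * Returns true and sets *algorithm when the extension is recognized,
 * otherwise returns false and leaves *algorithm untouched.
 */
static bool
sketch_parse_tar_suffix(const char *fname,
                        sketch_compress_algorithm *algorithm)
{
    if (has_suffix(fname, ".tar"))
        *algorithm = SKETCH_COMPRESSION_NONE;
    else if (has_suffix(fname, ".tgz") || has_suffix(fname, ".tar.gz"))
        *algorithm = SKETCH_COMPRESSION_GZIP;
    else if (has_suffix(fname, ".tar.lz4"))
        *algorithm = SKETCH_COMPRESSION_LZ4;
    else if (has_suffix(fname, ".tar.zst"))
        *algorithm = SKETCH_COMPRESSION_ZSTD;
    else
        return false;

    return true;
}
```

As in the patch, the match is made against the end of the name, so both a full archive name such as "pg_wal.tar.gz" and a bare suffix such as ".tgz" are accepted.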
[application/octet-stream] v14-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch (2.2K, 3-v14-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch)
From 14706302872c7e35934345fe75e1f24a5857ad16 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v14 02/11] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 9 +--------
src/bin/pg_waldump/pg_waldump.h | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..4b7411a6498 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..64a9109229e
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/octet-stream] v14-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch (2.4K, 4-v14-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch)
From e62670767a8164ca8c0a289aad05f24c3e84f8cc Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:38:16 +0530
Subject: [PATCH v14 03/11] Refactor: pg_waldump: Separate logic used to
calculate the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 43 +++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4b7411a6498..958a71a01cf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
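
To make the three cases in required_read_len() concrete, here is a standalone sketch of the same arithmetic. The sketch_* names are hypothetical; the 8192-byte page size and the use of 0 as "no end point" are assumptions standing in for XLOG_BLCKSZ and InvalidXLogRecPtr.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192      /* assumed WAL page size */

typedef uint64_t sketch_lsn;

/* Hypothetical reduced version of XLogDumpPrivate. */
typedef struct
{
    sketch_lsn  endptr;         /* 0 means no end point was requested */
    bool        endptr_reached;
} sketch_private;

/*
 * Returns the number of bytes to read for the page at targetPagePtr, or -1
 * (and sets endptr_reached) if even reqLen bytes would cross the end point.
 */
static int
sketch_required_read_len(sketch_private *priv, sketch_lsn targetPagePtr,
                         int reqLen)
{
    int         count = SKETCH_BLCKSZ;

    if (priv->endptr != 0)
    {
        if (targetPagePtr + SKETCH_BLCKSZ <= priv->endptr)
            count = SKETCH_BLCKSZ;  /* whole page is before the end point */
        else if (targetPagePtr + reqLen <= priv->endptr)
            count = (int) (priv->endptr - targetPagePtr);   /* partial page */
        else
        {
            priv->endptr_reached = true;
            return -1;
        }
    }

    return count;
}
```

For example, with an end point of 20000 and a page starting at 16384, a request for 100 bytes yields a partial read of 3616 bytes, while a request for 4000 bytes crosses the end point and returns -1.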
[application/octet-stream] v14-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch (6.6K, 5-v14-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From be1fbe441570c0aef766eed410eb3465f2450b53 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v14 04/11] Refactor: pg_waldump: Restructure TAP tests.
Restructured tests that do not have a WAL file argument to run within
a loop, facilitating their re-execution for decoding WAL from tar
archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 140 +++++++++++++++++-------------
1 file changed, 79 insertions(+), 61 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..f12ba52cbfc 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +205,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +224,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +239,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -288,38 +261,83 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/octet-stream] v14-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch (5.1K, 6-v14-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch)
From 8bb8dc6afe753f885520429613966f8cedc2b477 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 4 Feb 2026 15:31:51 +0530
Subject: [PATCH v14 05/11] Refactor: pg_waldump: Move WAL segment size to
XLogDumpPrivate.
Relocate the WAL segment size variable to the XLogDumpPrivate
structure and rename it to segsize for consistency. This change is
required to make the segment size accessible to the archive streamer
code, where passing it as a function argument is not feasible.
---
src/bin/pg_waldump/pg_waldump.c | 26 +++++++++++++-------------
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 958a71a01cf..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -811,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -865,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1138,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1159,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1175,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1190,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1200,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1213,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1234,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 64a9109229e..013b051506f 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -17,6 +17,7 @@
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
+ int segsize;
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
--
2.47.1
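The segment-size arithmetic that the refactoring above routes through `private.segsize` can be sketched in standalone C. This is only an illustration, not the PostgreSQL code: `DumpPrivate` and the function names are hypothetical stand-ins for `XLogDumpPrivate` and the `XLByteToSeg`, `XLogSegNoOffsetToRecPtr`, and `XLByteInSeg` macros, and the last helper mirrors the end-location check in `main()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogDumpPrivate; only the fields used here. */
typedef struct DumpPrivate
{
	uint32_t	timeline;
	int			segsize;	/* WAL segment size in bytes */
	uint64_t	startptr;
	uint64_t	endptr;
} DumpPrivate;

/* Segment number containing the given LSN (mirrors XLByteToSeg). */
static uint64_t
lsn_to_segno(const DumpPrivate *p, uint64_t lsn)
{
	assert(p->segsize > 0);
	return lsn / (uint64_t) p->segsize;
}

/* LSN of a byte offset within a segment (mirrors XLogSegNoOffsetToRecPtr). */
static uint64_t
segno_offset_to_lsn(const DumpPrivate *p, uint64_t segno, uint32_t offset)
{
	return segno * (uint64_t) p->segsize + offset;
}

/* True if the LSN lies inside the given segment (mirrors XLByteInSeg). */
static bool
lsn_in_segment(const DumpPrivate *p, uint64_t lsn, uint64_t segno)
{
	return lsn_to_segno(p, lsn) == segno;
}

/* The end LSN is acceptable inside the segment or exactly at its end. */
static bool
end_lsn_ok(const DumpPrivate *p, uint64_t segno)
{
	return lsn_in_segment(p, p->endptr, segno) ||
		p->endptr == segno_offset_to_lsn(p, segno + 1, 0);
}
```

Assuming a 16 MB segment size, `lsn_to_segno()` maps LSN 0x1000028 to segment 1, and an end location of 0x2000000 is accepted for segment 1 because it is exactly the first byte past that segment. Because the size lives in the struct, any code that holds the private pointer can do this arithmetic without an extra argument.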
[application/octet-stream] v14-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch (41.7K, 7-v14-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From 4322e9804f9bce7f9fb30872c5d64736e91c653b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 10 Feb 2026 11:42:36 +0530
Subject: [PATCH v14 06/11] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
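The sequential-order requirement can be illustrated with a small standalone check over a list of segment names. This is a simplified sketch under stated assumptions, not how the patch works: the patch detects a mismatch lazily while streaming the archive, whereas the hypothetical helpers below (`parse_wal_name`, `first_out_of_order`) examine a list of names up front and insist that each segment is exactly the previous one plus one on the same timeline. The name parsing follows the usual 24-hex-digit WAL file naming scheme.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 24	/* 8 hex digits each: timeline, log, segment */

/*
 * Parse a 24-character WAL segment file name into a timeline and a segment
 * number. Returns false if the name is malformed.
 */
static bool
parse_wal_name(const char *fname, int segsize, uint32_t *tli, uint64_t *segno)
{
	unsigned int t, log, seg;
	uint64_t	segs_per_id;

	assert(segsize > 0);
	if (strlen(fname) != WAL_FNAME_LEN ||
		strspn(fname, "0123456789ABCDEF") != WAL_FNAME_LEN)
		return false;
	if (sscanf(fname, "%8X%8X%8X", &t, &log, &seg) != 3)
		return false;

	/* Number of segments per 4 GB "xlog id". */
	segs_per_id = UINT64_C(0x100000000) / (uint64_t) segsize;
	*tli = t;
	*segno = (uint64_t) log * segs_per_id + seg;
	return true;
}

/*
 * Return the index of the first name that breaks sequential order (a
 * malformed name, a different timeline, or a segment number that is not
 * previous + 1), or -1 if the whole list is sequential.
 */
static int
first_out_of_order(const char *const *names, int n, int segsize)
{
	uint32_t	prev_tli = 0;
	uint64_t	prev_segno = 0;

	for (int i = 0; i < n; i++)
	{
		uint32_t	tli;
		uint64_t	segno;

		if (!parse_wal_name(names[i], segsize, &tli, &segno))
			return i;
		if (i > 0 && (tli != prev_tli || segno != prev_segno + 1))
			return i;
		prev_tli = tli;
		prev_segno = segno;
	}
	return -1;
}
```

With a 16 MB segment size, a list ending in `...0001`, `...0003`, `...0002` is rejected at index 1, which corresponds to the point where the reader would expect segment 2 but the archive supplies segment 3.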
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 639 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 255 ++++++++---
src/bin/pg_waldump/pg_waldump.h | 43 ++
src/bin/pg_waldump/t/001_basic.pl | 105 ++++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 998 insertions(+), 66 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..15fb8d13199 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..17d27ffa520
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,639 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as the reader moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+
+ int read_len; /* total bytes of a WAL read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring the full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data; raise an error if it's not the current
+ * segment being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_file != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean, since we either successfully read the WAL page or raise an
+ * error, but the caller expects this value to be returned. The routine
+ * that reads WAL pages from the physical WAL file follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Clears the buffer of a WAL entry that is being ignored. This frees up memory
+ * and prevents the accumulation of irrelevant WAL data. Additionally,
+ * conditionally setting cur_file within privateInfo to NULL ensures the
+ * archive streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (privateInfo->cur_file == NULL)
+ continue;
+
+ entry = privateInfo->cur_file;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
+ fname, entry->fname);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(count * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ free(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Hash function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..90fc13f3609 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,80 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer space
+ * occupied by the previous segment data. Since pg_waldump never requests
+ * the same WAL bytes twice, moving to a new segment implies the previous
+ * buffer's data and that segment will not be needed again.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete any
+ * entries that might be requested again once the decoding loop starts.
+ * For more details, see the comments in read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+ }
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +851,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +884,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +944,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1023,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,10 +1187,19 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
@@ -1128,6 +1217,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1238,76 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1212,12 +1319,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory of the pg_waldump execution
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1376,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1460,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..54d54a8a718 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,11 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +26,44 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f12ba52cbfc..9ab7457e9e2 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -162,6 +165,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate contrecord record
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -259,11 +298,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -272,6 +350,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 3
+ if !defined $tar;
+ skip "$scenario->{'compression_method'} compression not supported by this build", 3
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -305,9 +396,14 @@ for my $scenario (@scenarios)
test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
is(@lines, 6, 'limit option observed');
@@ -337,6 +433,9 @@ for my $scenario (@scenarios)
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 77e3c04144e..595ad7d5c5a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -145,6 +145,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3513,6 +3515,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
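The start and end checks in main() above depend on a little segment arithmetic. The block below is a minimal, standalone sketch of that arithmetic, assuming it mirrors what the XLogSegNoOffsetToRecPtr, XLByteInSeg and XLogFileName macros compute. The names lsn_t, segno_t, seg_start_lsn, lsn_in_segment, end_lsn_acceptable and wal_file_name are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for XLogRecPtr and XLogSegNo. */
typedef uint64_t lsn_t;
typedef uint64_t segno_t;

/* LSN of the first byte of a segment (offset 0 within that segment). */
static lsn_t
seg_start_lsn(segno_t segno, uint64_t segsize)
{
	assert(segsize > 0);
	return segno * segsize;
}

/* True if the LSN falls inside the given segment. */
static int
lsn_in_segment(lsn_t lsn, segno_t segno, uint64_t segsize)
{
	assert(segsize > 0);
	return lsn / segsize == segno;
}

/*
 * The end LSN is accepted if it lies inside the last segment, or if it is
 * exactly the first byte after that segment.
 */
static int
end_lsn_acceptable(lsn_t endptr, segno_t last_segno, uint64_t segsize)
{
	return lsn_in_segment(endptr, last_segno, segsize) ||
		endptr == seg_start_lsn(last_segno + 1, segsize);
}

/*
 * Build the 24-character WAL segment file name from timeline and segment
 * number. The buffer must hold at least 25 bytes.
 */
static void
wal_file_name(char *buf, size_t buflen, uint32_t tli, segno_t segno,
			  uint64_t segsize)
{
	uint64_t	segs_per_id = UINT64_C(0x100000000) / segsize;

	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 tli,
			 (uint32_t) (segno / segs_per_id),
			 (uint32_t) (segno % segs_per_id));
}
```

With 16MB segments, segment 3 on timeline 1 starts at LSN 0/3000000 and is named 000000010000000000000003. Members of the archive are keyed by segment file names of this form.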
[application/octet-stream] v14-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch (13.1K, 8-v14-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch)
From d3000a494b5d416d01e48def64a3e54e6b523dab Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 27 Jan 2026 15:38:34 +0530
Subject: [PATCH v14 07/11] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files were
not in the required sequence. With this patch, decoding now continues:
any WAL file that is out of order is written to a temporary location,
from which it is read later. Once a temporary file has been read, it is
removed.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/archive_waldump.c | 171 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 32 ++++-
src/bin/pg_waldump/pg_waldump.h | 3 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 197 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 15fb8d13199..b36323dde92 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, the out-of-order files will be written to a temporary
+ directory whose name begins with <filename>waldump_tmp</filename>. This
+ directory will be created inside the directory specified by the
+ <envar>TMPDIR</envar> environment variable if it is set; otherwise, it
+ will be created within the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 17d27ffa520..c5a4485b5b1 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Directory for temporarily exported WAL files */
+char *TmpWalSegDir = NULL;
+
/*
* Check if the start segment number is zero; this indicates a request to read
* any WAL file.
@@ -57,6 +61,8 @@ typedef struct ArchivedWALFile
const char *fname; /* hash key: WAL segment name */
StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
int read_len; /* total bytes of a WAL read from archive */
} ArchivedWALFile;
@@ -84,6 +90,11 @@ static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo,
int WalSegSz);
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -106,7 +117,9 @@ static const astreamer_ops astreamer_waldump_ops = {
/*
* Initializes the tar archive reader, creates a hash table for WAL entries,
* checks for existing valid WAL segments in the archive file and retrieves the
- * segment size, and sets up filters for relevant entries.
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -199,6 +212,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
}
/*
@@ -365,6 +385,17 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
destroyStringInfo(entry->buf);
entry->buf = NULL;
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
/* Set cur_file to NULL if it matches the entry being ignored */
if (privateInfo->cur_file == entry)
privateInfo->cur_file = NULL;
@@ -376,12 +407,16 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file, which then populates the
* entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
int WalSegSz)
{
ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
@@ -395,28 +430,59 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
*/
while (1)
{
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Archived streamer is reading a non-WAL file or an irrelevant WAL
* file.
*/
- if (privateInfo->cur_file == NULL)
+ if (entry == NULL)
continue;
- entry = privateInfo->cur_file;
-
/* Found the required entry */
if (strcmp(fname, entry->fname) == 0)
return entry;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
- fname, entry->fname);
- exit(1);
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -454,7 +520,88 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory to hold spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, fall back to the provided WAL directory for temporarily
+ * extracted WAL files.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 90fc13f3609..114969217d8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -478,10 +478,14 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return -1;
/*
- * If the target page is in a different segment, free the buffer space
- * occupied by the previous segment data. Since pg_waldump never requests
- * the same WAL bytes twice, moving to a new segment implies the previous
- * buffer's data and that segment will not be needed again.
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check whether the next required WAL segment exists in the
+ * temporary directory before invoking the archive streamer.
*/
curSegNo = state->seg.ws_segno;
if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
@@ -497,6 +501,13 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
state->seg.ws_tli = private->timeline;
state->seg.ws_segno = nextSegNo;
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
/*
* If in pre-reading mode (prior to actual decoding), do not delete any
* entries that might be requested again once the decoding loop starts.
@@ -507,9 +518,20 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
free_archive_wal_entry(fname, private);
}
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
}
- /* Read the WAL page from the archive streamer */
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff,
WalSegSz);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54d54a8a718..6c242b7fcbc 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -18,6 +18,9 @@
struct ArchivedWALFile;
struct ArchivedWAL_hash;
+/* Temporary directory */
+extern char *TmpWalSegDir;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9ab7457e9e2..9854c939007 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -312,7 +313,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
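The temporary-directory handling in the patch above (TMPDIR if set, otherwise the archive's directory, then a unique name from mkdtemp) can be sketched in isolation as follows. This is a simplified standalone version assuming a POSIX system, not the patch's code: make_spill_dir and spill_chunk are hypothetical names, plain malloc and return codes stand in for psprintf and pg_fatal, and cleanup at exit is omitted.

```c
#define _POSIX_C_SOURCE 200809L	/* for mkdtemp() */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Create a uniquely named spill directory under $TMPDIR, or under
 * fallback_dir when TMPDIR is not set. Returns a malloc'd path that the
 * caller must free, or NULL on failure (errno is set by mkdtemp).
 */
static char *
make_spill_dir(const char *fallback_dir)
{
	const char *base = getenv("TMPDIR");
	size_t		len;
	char	   *path;

	if (base == NULL)
		base = fallback_dir;

	/* sizeof() of the literal already counts the terminating NUL */
	len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
	path = malloc(len);
	if (path == NULL)
		return NULL;
	snprintf(path, len, "%s/waldump_tmp-XXXXXX", base);

	/* mkdtemp() replaces the XXXXXX suffix in place */
	if (mkdtemp(path) == NULL)
	{
		free(path);
		return NULL;
	}
	return path;
}

/*
 * Append one buffered chunk to an open spill file. Returns 0 on success
 * and -1 on a short write; an empty chunk is a no-op.
 */
static int
spill_chunk(FILE *fp, const char *data, size_t len)
{
	assert(fp != NULL);

	if (len > 0 && fwrite(data, len, 1, fp) != 1)
		return -1;
	return 0;
}
```

A caller would create the directory once, open one file per out-of-order segment inside it, call spill_chunk for each buffer the streamer delivers, and remove the file after the segment has been decoded.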
[application/octet-stream] v14-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch (1.7K, 9-v14-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch)
From 3ecf640004c7aaca0430101a2d88e3d010e07440 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v14 08/11] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..8cc204719ee 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/octet-stream] v14-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch (5.9K, 10-v14-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch)
From 9ab39e96cfecdfb0c3ec1630cc5b4718fa3986de Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v14 09/11] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means a tar archive path can be specified instead of a
traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WAL files. This
change also aligns it with the naming of the existing manifest-path switch.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8cc204719ee..34520546bc3 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1188,7 +1188,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1198,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1366,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..8ad2234453d 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/octet-stream] v14-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch (9.9K, 11-v14-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch)
From 960405ac3bcaaf514019b0344da9f3e9fbae0e19 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v14 10/11] pg_verifybackup: Enabled WAL parsing for tar-format
backup
Now that pg_waldump can decode WAL from tar archives, use that
capability to remove the previous restriction on WAL parsing for
tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 5 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
6 files changed, 50 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 34520546bc3..935ab8fafa8 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
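The WAL path fallback introduced by the patch above can be sketched in isolation as follows. This is only an illustration under stated assumptions: backup_info, note_archive() and choose_wal_path() are hypothetical names that do not appear in the patch, and the real code uses psprintf()/pstrdup() and reports an error rather than returning NULL.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the state the patch keeps in main(). */
typedef struct
{
	char		format;				/* 'p' = plain, 't' = tar */
	const char *backup_directory;
	const char *base_archive_path;	/* e.g. ".../base.tar.gz", or NULL */
	const char *wal_archive_path;	/* e.g. ".../pg_wal.tar", or NULL */
} backup_info;

/*
 * Remember the main and WAL archives by name prefix, mirroring the
 * strncmp() tests in precheck_tar_backup_file().
 */
static void
note_archive(backup_info *info, const char *relpath, const char *fullpath)
{
	if (strncmp("base", relpath, 4) == 0)
		info->base_archive_path = fullpath;
	else if (strncmp("pg_wal", relpath, 6) == 0)
		info->wal_archive_path = fullpath;
}

/*
 * Pick the path to hand to pg_waldump: an explicit -w/--wal-path wins,
 * then <backup>/pg_wal for plain format, then the separate WAL archive,
 * then the main archive.  Returns NULL when nothing usable was found;
 * the patch exits with "WAL archive not found" in that case.
 */
static const char *
choose_wal_path(const backup_info *info, const char *wal_path_opt,
				char *buf, size_t buflen)
{
	if (wal_path_opt != NULL)
		return wal_path_opt;
	if (info->format == 'p')
	{
		snprintf(buf, buflen, "%s/pg_wal", info->backup_directory);
		return buf;
	}
	if (info->wal_archive_path != NULL)
		return info->wal_archive_path;
	return info->base_archive_path;
}
```

With only base.tar present, the sketch falls back to the main archive, which corresponds to a backup whose WAL was included in the data directory archive rather than in a separate pg_wal.tar.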
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-04 00:37 Andrew Dunstan <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-04 00:37 UTC (permalink / raw)
To: Amul Sul <[email protected]>; Robert Haas <[email protected]>; +Cc: Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
> On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
>> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
>>> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
>>>> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
>>>>> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
>>>>>> In the attached version, I am using the WAL segment name as the hash
>>>>>> key, which is much more straightforward. I have rewritten
>>>>>> read_archive_wal_page(), and it looks much cleaner than before. The
>>>>>> logic to discard irrelevant WAL files is still within
>>>>>> get_archive_wal_entry. I added an explanation for setting cur_wal to
>>>>>> NULL, which is now handled in the separate function I mentioned
>>>>>> previously.
>>>>>>
>>>>>> Kindly have a look at the attached version; let me know if you are
>>>>>> still not happy with the current approach for filtering/discarding
>>>>>> irrelevant WAL segments. It isn't much different from the previous
>>>>>> version, but I have tried to keep it in a separate routine for better
>>>>>> code readability, with comments to make it easier to understand. I
>>>>>> also added a comment for ArchivedWALFile.
>>>>> I feel like the division of labor between get_archive_wal_entry() and
>>>>> read_archive_wal_page() is odd. I noticed this in the last version,
>>>>> too, and it still seems to be the case. get_archive_wal_entry() first
>>>>> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
>>>>> If it doesn't, it loops until an entry for the requested file shows up
>>>>> and then returns it. Then control returns to read_archive_wal_page()
>>>>> which loops some more until we have all the data we need for the
>>>>> requested file. But it seems odd to me to have two separate loops
>>>>> here. I think that the first loop is going to call read_archive_file()
>>>>> until we find the beginning of the file that we care about and then
>>>>> the second one is going to call read_archive_file() some more until we
>>>>> have read enough of it to satisfy the request. It feels odd to me to
>>>>> do it that way, as if we told somebody to first wait until 9 o'clock
>>>>> and then wait another 30 minutes, instead of just telling them to wait
>>>>> until 9:30. I realize it's not quite the same thing, because apart
>>>>> from calling read_archive_file(), the two loops do different things,
>>>>> but I still think it looks odd.
>>>>>
>>>>> + /*
>>>>> + * Ignore if the timeline is different or the current segment is not
>>>>> + * the desired one.
>>>>> + */
>>>>> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
>>>>> + if (privateInfo->timeline != curSegTimeline ||
>>>>> + privateInfo->startSegNo > curSegNo ||
>>>>> + privateInfo->endSegNo < curSegNo ||
>>>>> + segno > curSegNo)
>>>>> + {
>>>>> + free_archive_wal_entry(entry->fname, privateInfo);
>>>>> + continue;
>>>>> + }
>>>>>
>>>>> The comment doesn't match the code. If it did, the test would be
>>>>> (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
>>>>> instead the segno test is > rather than !=, and the checks against
>>>>> startSegNo and endSegNo aren't explained at all. I think I understand
>>>>> why the segno test uses > rather than !=, but it's the point of the
>>>>> comment to explain things like that, rather than leaving the reader to
>>>>> guess. And I don't know why we also need to test startSegNo and
>>>>> endSegNo.
>>>>>
>>>>> I also wonder what the point is of doing XLogFromFileName() on the
>>>>> fname provided by the caller and then again on entry->fname. Couldn't
>>>>> you just compare the strings?
>>>>>
>>>>> Again, the division of labor is really odd here. It's the job of
>>>>> astreamer_waldump_content() to skip things that aren't WAL files at
>>>>> all, but it's the job of get_archive_wal_entry() to skip things that
>>>>> are WAL files but not the one we want. I disagree with putting those
>>>>> checks in completely separate parts of the code.
>>>>>
>>>> Keeping the timeline and segment start-end range checks inside the
>>>> archive streamer creates a circular dependency that cannot be resolved
>>>> without a 'dirty hack'. We must read a page from the first available
>>>> WAL file to determine wal_segment_size before we can calculate the
>>>> target segment range. Moving the checks inside the streamer would make
>>>> it impossible to process that initial file, because the necessary
>>>> filtering parameters would still be unknown, so the checks would
>>>> somehow have to be skipped for the first read. What if we later
>>>> realize that the first WAL file, which was streamed by skipping that
>>>> check, is irrelevant and falls outside the start-end segment range?
>>>>
>>> Please have a look at the attached version, specifically patch 0005.
>>> I have moved the WAL file filtering check from
>>> get_archive_wal_entry() into astreamer_waldump_content(). This check
>>> is skipped during the initial read in init_archive_reader(), which
>>> instead performs it explicitly once it has determined the WAL segment
>>> size and the start/end segments.
>>>
>>> To access the WAL segment size inside astreamer_waldump_content(), I
>>> have moved the WAL segment size variable into the XLogDumpPrivate
>>> structure in the separate 0004 patch.
>> Attached is an updated version including the aforesaid changes. It
>> includes a new refactoring patch (0001) that moves the logic for
>> identifying tar archives and their compression types from
>> pg_basebackup and pg_verifybackup into a separate, reusable function,
>> per a suggestion from Euler [1]. Additionally, I have added a test
>> for the contrecord decoding to the main patch (now 0006).
>>
>> [1] http://postgr.es/m/[email protected]
>>
> Rebased against the latest master, fixed typos in code comments, and
> replaced palloc0 with palloc0_object.
>
Hi Amul.
I think this looks in pretty good shape.
Attached are patches for a few things I think could be fixed. They are
mostly self-explanatory. The TAP test fix is the only sane way I could
come up with to stop the skip code you had from reporting a wildly
inaccurate number of skipped tests. The sane way to do this from a
Test::More perspective is a subtest, but unfortunately meson does not
like subtest output, which is why we don't use it elsewhere, so the only
way I could come up with was to split this out into a separate test. Of
course, we might just say we don't care about the misreport, in which
case we could just live with things as they are.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification
The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream). Add a test that takes
a tar-format backup and verifies it.
---
src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files
The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions. When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.
Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file. 001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
src/bin/pg_waldump/meson.build | 1 +
src/bin/pg_waldump/t/001_basic.pl | 221 ++++++++++-----------------
src/bin/pg_waldump/t/003_archive.pl | 320 +++++++++++++++++++++++++++++++++++
3 files changed, 396 insertions(+), 146 deletions(-)
create mode 100644 src/bin/pg_waldump/t/003_archive.pl
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
+ 't/003_archive.pl',
],
},
}
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
use strict;
use warnings FATAL => 'all';
-use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
$$;
});
-my $contrecord_lsn = $node->safe_psql('postgres',
- 'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
# Generate contrecord record
$node->safe_psql('postgres',
qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
return @lines;
}
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
- my ($archive, $directory, $compression_flags) = @_;
-
- my @files;
- opendir my $dh, $directory or die "opendir: $!";
- while (my $entry = readdir $dh) {
- # Skip '.' and '..'
- next if $entry eq '.' || $entry eq '..';
- push @files, $entry;
- }
- closedir $dh;
-
- @files = shuffle @files;
-
- # move into the WAL directory before archiving files
- my $cwd = getcwd;
- chdir($directory) || die "chdir: $!";
- command_ok([$tar, $compression_flags, $archive, @files]);
- chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
- {
- 'path' => $node->data_dir,
- 'is_archive' => 0,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar",
- 'compression_method' => 'none',
- 'compression_flags' => '-cf',
- 'is_archive' => 1,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar.gz",
- 'compression_method' => 'gzip',
- 'compression_flags' => '-czf',
- 'is_archive' => 1,
- 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
- });
-
-for my $scenario (@scenarios)
-{
- my $path = $scenario->{'path'};
-
- SKIP:
- {
- skip "tar command is not available", 3
- if !defined $tar;
- skip "$scenario->{'compression_method'} compression not supported by this build", 3
- if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
- # create pg_wal archive
- if ($scenario->{'is_archive'})
- {
- generate_archive($path,
- $node->data_dir . '/pg_wal',
- $scenario->{'compression_flags'});
- }
-
- command_fails_like(
- [ 'pg_waldump', '--path' => $path ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
- command_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
- command_fails_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
- command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $path,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
- test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
- my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
- is(@lines, 6, 'limit option observed');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
- is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
- is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
- is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
- is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
- is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
- # Cleanup.
- unlink $path if $scenario->{'is_archive'};
- }
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+
+command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+ $path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+ plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+ 'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->safe_psql(
+ 'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+ WITH random_json AS (
+ SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+ FROM unnest(array['a', 'b', 'c']) as u(key))
+ SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+ 'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+ my ($path, $startlsn, $endlsn) = @_;
+
+ my ($part1, $part2) = split qr{/}, $startlsn;
+ my $lsn2 = hex $part2;
+ $lsn2++;
+ my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $new_start,
+ '--end' => $endlsn,
+ '--path' => $path,
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "runs with start segment and start LSN specified");
+ like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
+ @opts
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "pg_waldump @opts: runs ok");
+ is($stderr, '', "pg_waldump @opts: no stderr");
+ my @lines = split /\n/, $stdout;
+ ok(@lines > 0, "pg_waldump @opts: some lines are output");
+ return @lines;
+}
+
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh)
+ {
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = shuffle @files;
+
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([ $tar, $compression_flags, $archive, @files ],
+ "create archive $archive");
+ chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'enabled' => 1,
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+ });
+
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
+ my $method = $scenario->{'compression_method'};
+
+ SKIP:
+ {
+ skip "$method compression not supported by this build", 1
+ if !$scenario->{'enabled'};
+
+ generate_archive(
+ $path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ "$method: path option requires start location");
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ "$method: runs with path option and start and end locations");
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ "$method: falling off the end of the WAL results in an error");
+
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ "$method: errors are shown with --quiet");
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: all output lines are rmgr lines");
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: contrecord - all output lines are rmgr lines");
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, "$method: limit option observed");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+ 0, "$method: all output lines are FPW");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/,
+ "$method: stats=record on stdout");
+ is(grep(/^rmgr:/, @lines),
+ 0, "$method: no rmgr lines with stats=record");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+ # Cleanup.
+ unlink $path;
+ }
+}
+
+done_testing();
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix documentation for pg_waldump tar archive support
Two documentation issues with the tar archive reading feature:
- pg_waldump.sgml: When reading WAL from a tar archive with
out-of-order segments, pg_waldump spills to temporary files. TMPDIR
controls where those files are created, but this was not documented
in the Environment section.
- pg_verifybackup.sgml: The --wal-path option description still only
said "directory" even though it now also accepts tar archives.
---
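Note (commentary, not part of the patch): the TMPDIR fallback described in the new pg_waldump.sgml entry can be sketched roughly as below. This is only an illustration under stated assumptions. The function name choose_spill_dir and its interface are hypothetical, treating an empty TMPDIR as unset is an assumption, and the sketch only picks the parent location; it does not create the temporary directory that pg_waldump itself would create there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed copy of the directory under which
 * spill files would be placed.  'tmpdir_env' is the value of TMPDIR (may be
 * NULL); 'archive_path' is the path of the tar archive.  Returns NULL only
 * on out-of-memory.  The caller frees the result.
 */
char *
choose_spill_dir(const char *tmpdir_env, const char *archive_path)
{
    const char *src;
    const char *slash;
    size_t      len;
    char       *result;

    assert(archive_path != NULL);

    if (tmpdir_env != NULL && tmpdir_env[0] != '\0')
    {
        /* TMPDIR is set: use it as given (assumption: empty means unset). */
        src = tmpdir_env;
        len = strlen(tmpdir_env);
    }
    else if ((slash = strrchr(archive_path, '/')) == NULL)
    {
        /* Bare file name: the archive lives in the current directory. */
        src = ".";
        len = 1;
    }
    else
    {
        /* Everything before the last slash; keep "/" for a root-level file. */
        src = archive_path;
        len = (slash == archive_path) ? 1 : (size_t) (slash - archive_path);
    }

    result = malloc(len + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, src, len);
    result[len] = '\0';
    return result;
}
```

A caller would pass getenv("TMPDIR") as the first argument, which keeps the helper itself free of environment access and easy to test.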
doc/src/sgml/ref/pg_verifybackup.sgml | 7 ++++---
doc/src/sgml/ref/pg_waldump.sgml | 11 +++++++++++
2 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 16b50b5a4df..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,9 +261,10 @@ PostgreSQL documentation
<term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index b36323dde92..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -391,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support
Fix several bugs introduced by the pg_waldump archive WAL reading
feature:
- pg_waldump.c: The error path for verify_directory() printed waldir
(which is NULL when --path is used) instead of walpath.
- archive_waldump.c: The error message for short reads had an operator
precedence bug: (long long int) count - nbytes cast only count, not
the subtraction result. Also reported nbytes (the requested amount)
instead of count (the total file size) for the "of" portion.
- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
fname (allocated via pnstrdup/palloc). Also changed the existing
free(fname) to pfree(fname) for consistency.
- pg_verifybackup.c: The rename from --wal-directory to --wal-path
didn't preserve the old spelling as a backward-compatible alias.
- pg_verifybackup.c: Fix double space before "Or" in --wal-path
error hint message.
---
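Note (commentary, not part of the patch): the short-read message fix can be illustrated with a small standalone sketch. It assumes that count is the total number of bytes wanted and nbytes is the part still outstanding, which is what the corrected arguments imply; the size_t types and the function name format_short_read are assumptions made for the example, not taken from archive_waldump.c. In C a cast binds tighter than binary minus, so "(long long int) count - nbytes" casts only count; parenthesizing the subtraction makes the value passed for %lld unambiguously a long long.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the short-read message into 'buf'.
 * Returns what snprintf returns (the length the full message needs).
 */
int
format_short_read(char *buf, size_t buflen, const char *fname,
                  const char *archive, size_t count, size_t nbytes)
{
    /* The outstanding part can never exceed the total. */
    assert(nbytes <= count);

    return snprintf(buf, buflen,
                    "could not read file \"%s\" from archive \"%s\": read %lld of %lld",
                    fname, archive,
                    (long long int) (count - nbytes),   /* bytes actually read */
                    (long long int) count);             /* total bytes wanted */
}
```

With count = 8192 and nbytes = 4096 the message ends in "read 4096 of 8192", whereas passing nbytes for the second number, as the old code did, would have reported "read 4096 of 4096".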
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -376,7 +377,7 @@ main(int argc, char **argv)
else
{
pg_log_error("WAL archive not found");
- pg_log_error_hint("Specify the correct path using the option -w/--wal-path.  "
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
"Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
exit(1);
}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
fname, privateInfo->archive_name,
- (long long int) count - nbytes,
- (long long int) nbytes);
+ (long long int) (count - nbytes),
+ (long long int) count);
}
}
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
{
- free(fname);
+ pfree(fname);
break;
}
}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
member->pathname, privateInfo->archive_name);
+ pfree(fname);
break;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
/* validate path points to directory */
else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
Attachments:
[text/plain] cf5955-tar-wal-test.patch.no-cfbot (1.4K, 2-cf5955-tar-wal-test.patch.no-cfbot)
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Add pg_verifybackup test for tar-format WAL verification
The new tar-format WAL verification in pg_verifybackup had no test
coverage for the case where pg_basebackup produces a separate
pg_wal.tar (--format=tar --wal-method=stream). Add a test that takes
a tar-format backup and verifies it.
---
src/bin/pg_verifybackup/t/007_wal.pl | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
[text/plain] cf5955-tap-test-fix.patch.no-cfbot (17.5K, 3-cf5955-tap-test-fix.patch.no-cfbot)
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Split pg_waldump TAP tests into directory and archive files
The original 001_basic.pl mixed directory and tar archive tests in a
single SKIP loop with a hardcoded skip count of 3, but each scenario
actually runs ~19 assertions. When tar is unavailable the skip count
was wrong, and the directory scenario was also wrongly guarded by the
tar-availability check.
Move all archive-related tests (tar, tar.gz) into a new
003_archive.pl that uses plan skip_all when tar is unavailable,
cleanly skipping the entire file. 001_basic.pl retains only
directory-based tests with no SKIP blocks needed.
---
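Note (commentary, not part of the patch): two bits of LSN arithmetic in these tests may be easier to follow in C. test_pg_waldump_skip_bytes splits an LSN of the form "hi/lo" (two hexadecimal halves), adds one to the low half, and prints it back with "%s/%X"; the DO block computes the bytes left in the current segment as wal_segsize - (lsn % wal_segsize). The sketch below treats the LSN as one 64-bit number, so unlike the Perl helper it carries into the high half. All function names here are hypothetical, and the "%X/%X" output format is chosen to match the test, not any particular server version.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse "hi/lo" (both hexadecimal) into a 64-bit position; 0 on success. */
int
parse_lsn(const char *str, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    int          consumed = 0;

    assert(str != NULL && lsn != NULL);

    /* %n records how much input was used, so trailing junk is rejected. */
    if (sscanf(str, "%X/%X%n", &hi, &lo, &consumed) != 2 ||
        str[consumed] != '\0')
        return -1;

    *lsn = ((uint64_t) hi << 32) | lo;
    return 0;
}

/* Format a 64-bit position back into "hi/lo" form. */
int
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "%X/%X",
                    (unsigned int) (lsn >> 32), (unsigned int) lsn);
}

/* Bytes from 'lsn' up to the end of the WAL segment that contains it. */
uint64_t
bytes_left_in_segment(uint64_t lsn, uint64_t seg_size)
{
    assert(seg_size > 0);
    return seg_size - (lsn % seg_size);
}
```

Starting pg_waldump one byte past a record boundary, as the skip-bytes helper does, is what provokes the "first record is after" message the test looks for.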
src/bin/pg_waldump/meson.build | 1 +
src/bin/pg_waldump/t/001_basic.pl | 221 ++++++++++-----------------
src/bin/pg_waldump/t/003_archive.pl | 320 +++++++++++++++++++++++++++++++++++
3 files changed, 396 insertions(+), 146 deletions(-)
create mode 100644 src/bin/pg_waldump/t/003_archive.pl
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 5296f21b82c..d2b4bd0c048 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -34,6 +34,7 @@ tests += {
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
+ 't/003_archive.pl',
],
},
}
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 9854c939007..282c9a37221 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,13 +3,9 @@
use strict;
use warnings FATAL => 'all';
-use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
-use List::Util qw(shuffle);
-
-my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -195,8 +191,8 @@ END
$$;
});
-my $contrecord_lsn = $node->safe_psql('postgres',
- 'SELECT pg_current_wal_insert_lsn()');
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
# Generate contrecord record
$node->safe_psql('postgres',
qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
@@ -299,145 +295,78 @@ sub test_pg_waldump
return @lines;
}
-# Create a tar archive, sorting the file order
-sub generate_archive
-{
- my ($archive, $directory, $compression_flags) = @_;
-
- my @files;
- opendir my $dh, $directory or die "opendir: $!";
- while (my $entry = readdir $dh) {
- # Skip '.' and '..'
- next if $entry eq '.' || $entry eq '..';
- push @files, $entry;
- }
- closedir $dh;
-
- @files = shuffle @files;
-
- # move into the WAL directory before archiving files
- my $cwd = getcwd;
- chdir($directory) || die "chdir: $!";
- command_ok([$tar, $compression_flags, $archive, @files]);
- chdir($cwd) || die "chdir: $!";
-}
-
-my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-
-my @scenarios = (
- {
- 'path' => $node->data_dir,
- 'is_archive' => 0,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar",
- 'compression_method' => 'none',
- 'compression_flags' => '-cf',
- 'is_archive' => 1,
- 'enabled' => 1
- },
- {
- 'path' => "$tmp_dir/pg_wal.tar.gz",
- 'compression_method' => 'gzip',
- 'compression_flags' => '-czf',
- 'is_archive' => 1,
- 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
- });
-
-for my $scenario (@scenarios)
-{
- my $path = $scenario->{'path'};
-
- SKIP:
- {
- skip "tar command is not available", 3
- if !defined $tar;
- skip "$scenario->{'compression_method'} compression not supported by this build", 3
- if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-
- # create pg_wal archive
- if ($scenario->{'is_archive'})
- {
- generate_archive($path,
- $node->data_dir . '/pg_wal',
- $scenario->{'compression_flags'});
- }
-
- command_fails_like(
- [ 'pg_waldump', '--path' => $path ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
- command_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
- command_fails_like(
- [
- 'pg_waldump',
- '--path' => $path,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
- command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $path,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
- test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-
- my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
- is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-
- test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
- is(@lines, 6, 'limit option observed');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
- is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
- like($lines[0], qr/WAL statistics/, "statistics on stdout");
- is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
- is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
- is(grep(!/fork init/, @lines), 0, 'only init fork lines');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
- is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
-
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
- is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
-
- # Cleanup.
- unlink $path if $scenario->{'is_archive'};
- }
-}
+my $path = $node->data_dir;
+
+command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+
+command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+@lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+is(@lines, 6, 'limit option observed');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+like($lines[0], qr/WAL statistics/, "statistics on stdout");
+is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+@lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+@lines = test_pg_waldump(
+ $path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
done_testing();
new file mode 100644
index 00000000000..c615713efd4
--- /dev/null
+++ b/src/bin/pg_waldump/t/003_archive.pl
@@ -0,0 +1,320 @@
+
+# Copyright (c) 2021-2026, PostgreSQL Global Development Group
+
+# Test pg_waldump's ability to read WAL from tar archives.
+
+use strict;
+use warnings FATAL => 'all';
+use Cwd;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
+
+if (!defined $tar)
+{
+ plan skip_all => 'tar command is not available';
+}
+
+my $node = PostgreSQL::Test::Cluster->new('main');
+$node->init;
+$node->append_conf(
+ 'postgresql.conf', q{
+autovacuum = off
+checkpoint_timeout = 1h
+
+# for standbydesc
+archive_mode=on
+archive_command=''
+
+# for XLOG_HEAP_TRUNCATE
+wal_level=logical
+});
+$node->start;
+
+my ($start_lsn, $start_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->safe_psql(
+ 'postgres', q{
+-- heap, btree, hash, sequence
+CREATE TABLE t1 (a int GENERATED ALWAYS AS IDENTITY, b text);
+CREATE INDEX i1a ON t1 USING btree (a);
+CREATE INDEX i1b ON t1 USING hash (b);
+INSERT INTO t1 VALUES (default, 'one'), (default, 'two');
+DELETE FROM t1 WHERE b = 'one';
+TRUNCATE t1;
+
+-- abort
+START TRANSACTION;
+INSERT INTO t1 VALUES (default, 'three');
+ROLLBACK;
+
+-- unlogged/init fork
+CREATE UNLOGGED TABLE t2 (x int);
+CREATE INDEX i2 ON t2 USING btree (x);
+INSERT INTO t2 SELECT generate_series(1, 10);
+
+-- gin
+CREATE TABLE gin_idx_tbl (id bigserial PRIMARY KEY, data jsonb);
+CREATE INDEX gin_idx ON gin_idx_tbl USING gin (data);
+INSERT INTO gin_idx_tbl
+ WITH random_json AS (
+ SELECT json_object_agg(key, trunc(random() * 10)) as json_data
+ FROM unnest(array['a', 'b', 'c']) as u(key))
+ SELECT generate_series(1,500), json_data FROM random_json;
+
+-- gist, spgist
+CREATE TABLE gist_idx_tbl (p point);
+CREATE INDEX gist_idx ON gist_idx_tbl USING gist (p);
+CREATE INDEX spgist_idx ON gist_idx_tbl USING spgist (p);
+INSERT INTO gist_idx_tbl (p) VALUES (point '(1, 1)'), (point '(3, 2)'), (point '(6, 3)');
+
+-- brin
+CREATE TABLE brin_idx_tbl (col1 int, col2 text, col3 text );
+CREATE INDEX brin_idx ON brin_idx_tbl USING brin (col1, col2, col3) WITH (autosummarize=on);
+INSERT INTO brin_idx_tbl SELECT generate_series(1, 10000), 'dummy', 'dummy';
+UPDATE brin_idx_tbl SET col2 = 'updated' WHERE col1 BETWEEN 1 AND 5000;
+SELECT brin_summarize_range('brin_idx', 0);
+SELECT brin_desummarize_range('brin_idx', 0);
+
+VACUUM;
+
+-- logical message
+SELECT pg_logical_emit_message(true, 'foo', 'bar');
+
+-- relmap
+VACUUM FULL pg_authid;
+
+-- database
+CREATE DATABASE d1;
+DROP DATABASE d1;
+});
+
+my $tblspc_path = PostgreSQL::Test::Utils::tempdir_short();
+
+$node->safe_psql(
+ 'postgres', qq{
+CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
+DROP TABLESPACE ts1;
+});
+
+# Consume all remaining room in the current WAL segment, leaving space enough
+# only for the start of a largish record, to test contrecord decoding.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn =
+ $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()');
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
+my ($end_lsn, $end_walfile) = split /\|/,
+ $node->safe_psql('postgres',
+ q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ );
+
+$node->stop;
+
+
+sub test_pg_waldump_skip_bytes
+{
+ my ($path, $startlsn, $endlsn) = @_;
+
+ my ($part1, $part2) = split qr{/}, $startlsn;
+ my $lsn2 = hex $part2;
+ $lsn2++;
+ my $new_start = sprintf("%s/%X", $part1, $lsn2);
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $new_start,
+ '--end' => $endlsn,
+ '--path' => $path,
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "runs with start segment and start LSN specified");
+ like($stderr, qr/first record is after/, 'info message printed');
+}
+
+sub test_pg_waldump
+{
+ local $Test::Builder::Level = $Test::Builder::Level + 1;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
+
+ my ($stdout, $stderr);
+
+ my $result = IPC::Run::run [
+ 'pg_waldump',
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
+ @opts
+ ],
+ '>' => \$stdout,
+ '2>' => \$stderr;
+ ok($result, "pg_waldump @opts: runs ok");
+ is($stderr, '', "pg_waldump @opts: no stderr");
+ my @lines = split /\n/, $stdout;
+ ok(@lines > 0, "pg_waldump @opts: some lines are output");
+ return @lines;
+}
+
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh)
+ {
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = shuffle @files;
+
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([ $tar, $compression_flags, $archive, @files ],
+ "create archive $archive");
+ chdir($cwd) || die "chdir: $!";
+}
+
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
+
+my @scenarios = (
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'enabled' => 1,
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1"),
+ });
+
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
+ my $method = $scenario->{'compression_method'};
+
+ SKIP:
+ {
+ skip "$method compression not supported by this build", 1
+ if !$scenario->{'enabled'};
+
+ generate_archive(
+ $path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ "$method: path option requires start location");
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ "$method: runs with path option and start and end locations");
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ "$method: falling off the end of the WAL results in an error");
+
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ "$method: errors are shown with --quiet");
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: all output lines are rmgr lines");
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines),
+ 0, "$method: contrecord - all output lines are rmgr lines");
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, "$method: limit option observed");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines),
+ 0, "$method: all output lines are FPW");
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "$method: statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, "$method: no rmgr lines output");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/,
+ "$method: stats=record on stdout");
+ is(grep(/^rmgr:/, @lines),
+ 0, "$method: no rmgr lines with stats=record");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, "$method: only Btree lines");
+
+ @lines =
+ test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, "$method: only init fork lines");
+
+ # Cleanup.
+ unlink $path;
+ }
+}
+
+done_testing();
[text/plain] cf5955-docs.patch.no-cfbot (2.4K, 4-cf5955-docs.patch.no-cfbot)
[text/plain] cf5955-fixes.patch.no-cfbot (3.6K, 5-cf5955-fixes.patch.no-cfbot)
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Tue, 3 Mar 2026 00:00:00 +0000
Subject: [PATCH] Fix bugs in pg_waldump tar archive support
Fix several bugs introduced by the pg_waldump archive WAL reading
feature:
- pg_waldump.c: The error path for verify_directory() printed waldir
(which is NULL when --path is used) instead of walpath.
- archive_waldump.c: The error message for short reads had an operator
precedence bug: (long long int) count - nbytes cast only count, not
the subtraction result. Also reported nbytes (the requested amount)
instead of count (the total file size) for the "of" portion.
- archive_waldump.c: The "ignoring duplicate WAL" code path leaked
fname (allocated via pnstrdup/palloc). Also changed the existing
free(fname) to pfree(fname) for consistency.
- pg_verifybackup.c: The rename from --wal-directory to --wal-path
didn't preserve the old spelling as a backward-compatible alias.
- pg_verifybackup.c: Fix double space before "Or" in --wal-path
error hint message.
---
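A note on the short-read message fix listed above: a C cast binds tighter than binary minus, so `(long long int) count - nbytes` converts only `count` and then subtracts. The sketch below shows when the two spellings can print different values. It assumes 32-bit unsigned operands and a 32-bit `int`, which is an illustration only: the actual types of `count` and `nbytes` are not shown in this thread, and with plain `int` operands both forms yield the same value. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Cast applies to the first operand only; the second operand is then
 * converted to long long and the subtraction is done in long long.
 */
static long long
cast_then_subtract(uint32_t count, uint32_t nbytes)
{
	return (long long) count - nbytes;
}

/*
 * Subtraction is done in 32-bit unsigned arithmetic first (wrapping
 * modulo 2^32), and only the result is widened to long long.
 */
static long long
subtract_then_cast(uint32_t count, uint32_t nbytes)
{
	return (long long) (count - nbytes);
}
```

When `count >= nbytes` the two agree, so the parenthesized form mostly makes the intended grouping explicit.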
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 935ab8fafa8..b0b764913cf 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -131,6 +131,7 @@ main(int argc, char **argv)
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
{"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -376,7 +377,7 @@ main(int argc, char **argv)
else
{
pg_log_error("WAL archive not found");
- pg_log_error_hint("Specify the correct path using the option -w/--wal-path.  "
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
"Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
exit(1);
}
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index c5a4485b5b1..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -344,8 +344,8 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
fname, privateInfo->archive_name,
- (long long int) count - nbytes,
- (long long int) nbytes);
+ (long long int) (count - nbytes),
+ (long long int) count);
}
}
@@ -664,7 +664,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
{
- free(fname);
+ pfree(fname);
break;
}
}
@@ -680,6 +680,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
{
pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
member->pathname, privateInfo->archive_name);
+ pfree(fname);
break;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 114969217d8..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1223,7 +1223,7 @@ main(int argc, char **argv)
/* validate path points to directory */
else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-04 12:52 Amul Sul <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-04 12:52 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 4, 2026 at 6:07 AM Andrew Dunstan <[email protected]> wrote:
>
>
> On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
> > On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
> >> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
> >>> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
> >>>> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
> >>>>> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
> >>>>>> In the attached version, I am using the WAL segment name as the hash
> >>>>>> key, which is much more straightforward. I have rewritten
> >>>>>> read_archive_wal_page(), and it looks much cleaner than before. The
> >>>>>> logic to discard irrelevant WAL files is still within
> >>>>>> get_archive_wal_entry. I added an explanation for setting cur_wal to
> >>>>>> NULL, which is now handled in the separate function I mentioned
> >>>>>> previously.
> >>>>>>
> >>>>>> Kindly have a look at the attached version; let me know if you are
> >>>>>> still not happy with the current approach for filtering/discarding
> >>>>>> irrelevant WAL segments. It isn't much different from the previous
> >>>>>> version, but I have tried to keep it in a separate routine for better
> >>>>>> code readability, with comments to make it easier to understand. I
> >>>>>> also added a comment for ArchivedWALFile.
> >>>>> I feel like the division of labor between get_archive_wal_entry() and
> >>>>> read_archive_wal_page() is odd. I noticed this in the last version,
> >>>>> too, and it still seems to be the case. get_archive_wal_entry() first
> >>>>> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> >>>>> If it doesn't, it loops until an entry for the requested file shows up
> >>>>> and then returns it. Then control returns to read_archive_wal_page()
> >>>>> which loops some more until we have all the data we need for the
> >>>>> requested file. But it seems odd to me to have two separate loops
> >>>>> here. I think that the first loop is going to call read_archive_file()
> >>>>> until we find the beginning of the file that we care about and then
> >>>>> the second one is going to call read_archive_file() some more until we
> >>>>> have read enough of it to satisfy the request. It feels odd to me to
> >>>>> do it that way, as if we told somebody to first wait until 9 o'clock
> >>>>> and then wait another 30 minutes, instead of just telling them to wait
> >>>>> until 9:30. I realize it's not quite the same thing, because apart
> >>>>> from calling read_archive_file(), the two loops do different things,
> >>>>> but I still think it looks odd.
> >>>>>
> >>>>> + /*
> >>>>> + * Ignore if the timeline is different or the current segment is not
> >>>>> + * the desired one.
> >>>>> + */
> >>>>> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> >>>>> + if (privateInfo->timeline != curSegTimeline ||
> >>>>> + privateInfo->startSegNo > curSegNo ||
> >>>>> + privateInfo->endSegNo < curSegNo ||
> >>>>> + segno > curSegNo)
> >>>>> + {
> >>>>> + free_archive_wal_entry(entry->fname, privateInfo);
> >>>>> + continue;
> >>>>> + }
> >>>>>
> >>>>> The comment doesn't match the code. If it did, the test would be
> >>>>> (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
> >>>>> instead the segno test is > rather than !=, and the checks against
> >>>>> startSegNo and endSegNo aren't explained at all. I think I understand
> >>>>> why the segno test uses > rather than !=, but it's the point of the
> >>>>> comment to explain things like that, rather than leaving the reader to
> >>>>> guess. And I don't know why we also need to test startSegNo and
> >>>>> endSegNo.
> >>>>>
> >>>>> I also wonder what the point is of doing XLogFromFileName() on the
> >>>>> fname provided by the caller and then again on entry->fname. Couldn't
> >>>>> you just compare the strings?
> >>>>>
> >>>>> Again, the division of labor is really odd here. It's the job of
> >>>>> astreamer_waldump_content() to skip things that aren't WAL files at
> >>>>> all, but it's the job of get_archive_wal_entry() to skip things that
> >>>>> are WAL files but not the one we want. I disagree with putting those
> >>>>> checks in completely separate parts of the code.
> >>>>>
> >>>> Keeping the timeline and segment start-end range checks inside the
> >>>> archive streamer creates a circular dependency that cannot be resolved
> >>>> without a 'dirty hack'. We must read the first available WAL file page
> >>>> to determine the wal_segment_size before we can calculate the target
> >>>> segment range. Moving the checks inside the streamer would make it
> >>>> impossible to process that initial file, as the necessary filtering
> >>>> parameters would still be unknown, so the checks would somehow have
> >>>> to be skipped for the first read. What if we later realize that the
> >>>> first WAL file, which was allowed to be streamed by skipping that
> >>>> check, is irrelevant and doesn't fall within the start-end segment range?
> >>>>
> >>> Please have a look at the attached version, specifically patch 0005.
> >>> I have moved the WAL file filtering check out of
> >>> get_archive_wal_entry() and into astreamer_waldump_content(). This
> >>> check is skipped during the initial read in init_archive_reader(),
> >>> which instead performs it explicitly once it has determined the WAL
> >>> segment size and the start/end segments.
> >>>
> >>> To access the WAL segment size inside astreamer_waldump_content(), I
> >>> have moved the WAL segment size variable into the XLogDumpPrivate
> >>> structure in the separate 0004 patch.
> >> Attached is an updated version including the aforesaid changes. It
> >> includes a new refactoring patch (0001) that moves the logic for
> >> identifying tar archives and their compression types from
> >> pg_basebackup and pg_verifybackup into a separate, reusable function,
> >> per a suggestion from Euler [1]. Additionally, I have added a test
> >> for the contrecord decoding to the main patch (now 0006).
> >>
> >> [1] http://postgr.es/m/[email protected]
> >>
> > Rebased against the latest master, fixed typos in code comments, and
> > replaced palloc0 with palloc0_object.
> >
>
> Hi Amul.
>
>
> I think this looks in pretty good shape.
>
Thank you very much for looking at the patch.
> Attached are patches for a few things I think could be fixed. They are
> mostly self-explanatory. The TAP test fix is the only sane way I could
> come up with to stop the skip code you had from reporting a wildly
> inaccurate number of tests skipped. The sane way to do this from a
> Test::More perspective is a subtest, but unfortunately meson does not
> like subtest output, which is why we don't use it elsewhere, so the only
> way I could come up with was to split this out into a separate test. Of
> course, we might just say we don't care about the misreport, in which
> case we could just live with things as they are.
>
I agree that the reported skip number was incorrect, and I have
corrected it in the attached patch. I haven't applied your patch for
the TAP test improvements yet because I wanted to double-check it
with you first; as it stands, the patch duplicates tests already
present in 001_basic.pl. To avoid this duplication, I have added a
loop that performs the tests for both plain and tar WAL directory inputs,
similar to the approach used in pg_verifybackup for the different
compression type tests (e.g., 008_untar.pl, 010_client_untar.pl). I
have no objection to applying your patch if you feel the duplication is
acceptable, but I feel that using a loop for the tests in 001_basic.pl
is a bit tidier. Let me know your thoughts.
I have applied all your other patches but skipped the changes to
pg_verifybackup.c from cf5955-fixes.patch.no-cfbot, as they seem
unrelated, or perhaps I have misunderstood them.
Regards,
Amul
Attachments:
[application/x-patch] v15-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch (6.7K, 2-v15-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch)
From 54fd70f2b5df10e6df575b4f85eaecb8a3c1ff94 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v15 01/11] Refactor: Move tar archive parsing into a common
location.
pg_basebackup and pg_verifybackup both require logic to identify tar
files and determine their compression types. Similar functionality
will be needed for pg_waldump when it gets the capability to decode
WAL files from tar archives. Moving this logic to a common location
allows for reuse and prevents code duplication.
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..f117e21237f 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..50f21656b88 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
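The suffix lookup that patch 0001 centralizes can be sketched in isolation as follows. This is a minimal, table-driven variant written only for illustration: the enum, `has_suffix()`, and `sketch_parse_tar_suffix()` are hypothetical stand-ins, not the `pg_compress_algorithm` type or the `parse_tar_compress_algorithm()` function from the patch, which uses a chain of `strcmp()` tests instead of a table.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; names are illustrative. */
typedef enum
{
	SKETCH_NONE,
	SKETCH_GZIP,
	SKETCH_LZ4,
	SKETCH_ZSTD
} sketch_algorithm;

/* Return true if "name" ends with "suffix". */
static bool
has_suffix(const char *name, const char *suffix)
{
	size_t		nlen = strlen(name);
	size_t		slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/*
 * Map an archive file name to a compression algorithm by its extension.
 * Returns true and sets *algorithm if the extension is recognized.
 */
static bool
sketch_parse_tar_suffix(const char *fname, sketch_algorithm *algorithm)
{
	static const struct
	{
		const char *suffix;
		sketch_algorithm alg;
	}			table[] = {
		{".tar", SKETCH_NONE},
		{".tgz", SKETCH_GZIP},
		{".tar.gz", SKETCH_GZIP},
		{".tar.lz4", SKETCH_LZ4},
		{".tar.zst", SKETCH_ZSTD},
	};

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
	{
		if (has_suffix(fname, table[i].suffix))
		{
			*algorithm = table[i].alg;
			return true;
		}
	}
	return false;
}
```

A caller can then derive "is this a compressed tar?" as "recognized and not SKETCH_NONE", which mirrors how the patch computes is_compressed_tar in pg_basebackup.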
[application/x-patch] v15-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch (2.2K, 3-v15-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch)
From 14706302872c7e35934345fe75e1f24a5857ad16 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v15 02/11] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 9 +--------
src/bin/pg_waldump/pg_waldump.h | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..4b7411a6498 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..64a9109229e
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v15-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch (2.4K, 4-v15-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch)
From e62670767a8164ca8c0a289aad05f24c3e84f8cc Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:38:16 +0530
Subject: [PATCH v15 03/11] Refactor: pg_waldump: Separate logic used to
calculate the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 43 +++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4b7411a6498..958a71a01cf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
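The read-size calculation that patch 0003 factors out can be tried outside the tree with a small stand-alone sketch. It assumes an 8192-byte page (the default XLOG_BLCKSZ) and treats an end pointer of 0 as "not set", as InvalidXLogRecPtr does; `sketch_private` and `sketch_required_read_len()` are hypothetical names that mirror, but are not, the patch's XLogDumpPrivate and required_read_len().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192		/* stand-in for XLOG_BLCKSZ (default size) */

typedef struct
{
	uint64_t	endptr;			/* 0 means no end point was requested */
	bool		endptr_reached;
} sketch_private;

/*
 * Return how many bytes of the page at targetPagePtr should be read:
 * a full page, a partial page that stops at endptr, or -1 if even the
 * first reqLen bytes would run past endptr.
 */
static int
sketch_required_read_len(sketch_private *p, uint64_t targetPagePtr,
						 int reqLen)
{
	if (p->endptr == 0)
		return SKETCH_BLCKSZ;	/* no limit: always a full page */

	if (targetPagePtr + SKETCH_BLCKSZ <= p->endptr)
		return SKETCH_BLCKSZ;	/* whole page lies before the end point */

	if (targetPagePtr + reqLen <= p->endptr)
		return (int) (p->endptr - targetPagePtr);	/* clamp to end point */

	p->endptr_reached = true;	/* request extends past the end point */
	return -1;
}
```

Keeping this as a separate helper lets a second page-read callback, such as one that reads from a tar archive, apply the same end-point rule before fetching data.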
[application/x-patch] v15-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch (6.6K, 5-v15-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From be1fbe441570c0aef766eed410eb3465f2450b53 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v15 04/11] Refactor: pg_waldump: Restructure TAP tests.
Restructured tests that do not have a WAL file argument to run within
a loop, facilitating their re-execution for decoding WAL from tar
archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 140 +++++++++++++++++-------------
1 file changed, 79 insertions(+), 61 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..f12ba52cbfc 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +205,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +224,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +239,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -288,38 +261,83 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v15-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch (5.1K, 6-v15-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch)
From 8bb8dc6afe753f885520429613966f8cedc2b477 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 4 Feb 2026 15:31:51 +0530
Subject: [PATCH v15 05/11] Refactor: pg_waldump: Move WAL segment size to
XLogDumpPrivate.
Relocate the WAL segment size variable to the XLogDumpPrivate
structure and rename it to segsize for consistency. This change is
required to make the segment size accessible to the archive streamer
code, where passing it as a function argument is not feasible.
---
src/bin/pg_waldump/pg_waldump.c | 26 +++++++++++++-------------
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 958a71a01cf..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -811,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -865,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1138,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1159,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1175,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1190,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1200,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1213,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1234,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 64a9109229e..013b051506f 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -17,6 +17,7 @@
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
+ int segsize;
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
--
2.47.1
[application/x-patch] v15-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch (41.8K, 7-v15-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From c8ff1a06931bc3690e27c03197f825ce8ca29a27 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 10 Feb 2026 11:42:36 +0530
Subject: [PATCH v15 06/11] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
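
A note for readers: the buffer lookup in read_archive_wal_page() in this patch comes down to simple window arithmetic. The end of the buffered window is the segment start plus the number of bytes streamed so far, and the window start is that end minus the buffer length. The following is a minimal sketch of that idea under simplified assumptions. SegBuf, Lsn and copy_from_window are hypothetical stand-ins, not the patch's types or functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Lsn;           /* stand-in for XLogRecPtr */

/* Hypothetical, simplified stand-in for the patch's ArchivedWALFile. */
typedef struct SegBuf
{
    const char *data;           /* bytes currently held in the buffer */
    int         len;            /* number of buffered bytes */
    int         read_len;       /* total bytes of the segment streamed so far */
} SegBuf;

/*
 * Copy up to 'want' bytes starting at 'lsn' out of the buffered window.
 * Returns the number of bytes copied, or 0 when 'lsn' lies outside the
 * window, in which case the caller has to stream more data first.
 */
int
copy_from_window(const SegBuf *sb, uint64_t segno, int segsize,
                 Lsn lsn, char *dst, int want)
{
    Lsn         end = segno * (uint64_t) segsize + (uint64_t) sb->read_len;
    Lsn         start = end - (uint64_t) sb->len;
    int         offset;
    int         n;

    if (sb->len <= 0 || lsn < start || lsn >= end)
        return 0;

    offset = (int) (lsn - start);
    assert(offset < sb->len);

    n = sb->len - offset;
    if (n > want)
        n = want;
    memcpy(dst, sb->data + offset, (size_t) n);
    return n;
}
```

A return value of 0 corresponds to the branch in the patch that fetches more data from the archive; a short copy corresponds to a request that runs past the end of the buffered window and needs another round.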
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 639 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 257 ++++++++---
src/bin/pg_waldump/pg_waldump.h | 43 ++
src/bin/pg_waldump/t/001_basic.pl | 105 ++++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 999 insertions(+), 67 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
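
A note for readers: the patch only decodes archive members whose names look like WAL segments (24 uppercase hexadecimal characters) and that sit at the top level or under pg_wal. The sketch below shows that filter in a simplified form. It is stricter than member_is_wal_file() in the patch, it does not canonicalize the path, and the names WAL_FNAME_LEN, looks_like_wal_name and wal_member_name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define WAL_FNAME_LEN 24        /* 8 hex digits each: timeline, log, segment */

/* True if 'name' is exactly 24 uppercase hexadecimal characters. */
bool
looks_like_wal_name(const char *name)
{
    return strlen(name) == WAL_FNAME_LEN &&
        strspn(name, "0123456789ABCDEF") == WAL_FNAME_LEN;
}

/*
 * Return a pointer to the WAL file name inside 'path' if the member sits at
 * the top level or directly under "pg_wal/"; otherwise return NULL.
 */
const char *
wal_member_name(const char *path)
{
    static const char prefix[] = "pg_wal/";
    size_t      plen = strlen(prefix);

    assert(path != NULL);

    if (strncmp(path, prefix, plen) == 0)
        path += plen;

    return looks_like_wal_name(path) ? path : NULL;
}
```

Members such as base/1/1234 or a name containing a lowercase or non-hex character are rejected, which is how non-WAL files in a base backup tarball would be skipped.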
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..15fb8d13199 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..4a95b47b4da
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,639 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+
+ int read_len; /* total bytes of a WAL read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL bytes if they are present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push the partial data already copied for the current page
+ * back into the buffer, so that the full page remains
+ * available for re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data; raise an error if it's not the current
+ * segment being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_file != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * We should have either successfully read all the requested bytes or
+ * reported a failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean instead, since we either successfully read the WAL page or
+ * raise an error, but the caller expects this value to be returned. The
+ * routine that reads WAL pages from the physical WAL file follows the
+ * same convention.
+ */
+ return count;
+}
+
+/*
+ * Clears the buffer of a WAL entry that is being ignored. This frees up memory
+ * and prevents the accumulation of irrelevant WAL data. Additionally,
+ * conditionally setting cur_file within privateInfo to NULL ensures the
+ * archive streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (privateInfo->cur_file == NULL)
+ continue;
+
+ entry = privateInfo->cur_file;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
+ fname, entry->fname);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+ char *buffer;
+
+ buffer = pg_malloc(count * sizeof(uint8));
+
+ rc = read(privateInfo->archive_fd, buffer, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ buffer, rc, ASTREAMER_UNKNOWN);
+ pg_free(buffer);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* WAL file could be with full path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..a18c56a7322 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,80 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer space
+ * occupied by the previous segment data. Since pg_waldump never requests
+ * the same WAL bytes twice, moving to a new segment implies that the
+ * previous segment and its buffered data will not be needed again.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete any
+ * entries that might be requested again once the decoding loop starts.
+ * For more details, see the comments in read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+ }
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +851,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +884,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +944,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1023,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,12 +1187,21 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
@@ -1128,6 +1217,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1238,76 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1212,12 +1319,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory of the pg_waldump execution
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1376,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1460,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..54d54a8a718 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,11 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +26,44 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f12ba52cbfc..6f8ce319841 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -162,6 +165,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate a continuation record that crosses the segment boundary
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -259,11 +298,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -272,6 +350,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -305,9 +396,14 @@ for my $scenario (@scenarios)
test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
is(@lines, 6, 'limit option observed');
@@ -337,6 +433,9 @@ for my $scenario (@scenarios)
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 77e3c04144e..595ad7d5c5a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -145,6 +145,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3513,6 +3515,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
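
The segment arithmetic used by the argument checks and by the contrecord test in the patch above can be sketched in isolation. This is only an illustration, not the pg_waldump code: the type and function names are hypothetical stand-ins for XLogRecPtr, XLogSegNo, XLByteInSeg and related macros, and the segment size is passed in explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t LsnSketch;		/* hypothetical stand-in for XLogRecPtr */
typedef uint64_t SegNoSketch;	/* hypothetical stand-in for XLogSegNo */

/* Segment number that contains the given byte position. */
static SegNoSketch
lsn_to_segno(LsnSketch lsn, uint64_t segsize)
{
	assert(segsize > 0);
	return lsn / segsize;
}

/*
 * Bytes left before the next segment boundary; mirrors the "remain"
 * computation in the PL/pgSQL loop of the test.
 */
static uint64_t
bytes_to_segment_end(LsnSketch lsn, uint64_t segsize)
{
	assert(segsize > 0);
	return segsize - (lsn % segsize);
}

/*
 * An end position is acceptable if it lies inside the segment or is
 * exactly the first byte of the following segment.
 */
static bool
end_lsn_acceptable(LsnSketch endptr, SegNoSketch segno, uint64_t segsize)
{
	return lsn_to_segno(endptr, segsize) == segno ||
		endptr == (segno + 1) * segsize;
}
```

The second clause of end_lsn_acceptable() is why an --end value that sits exactly on the boundary after the last named segment is not rejected.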
[application/x-patch] v15-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch (13.8K, 8-v15-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch)
From c5b0a92f4808816108bdff02e5c137280749d01c Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 27 Jan 2026 15:38:34 +0530
Subject: [PATCH v15 07/11] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files
were not in the required sequence. With this patch, decoding
continues: any WAL file that is out of order is written to a
temporary location, from which it is read later. Once a temporary
file has been read, it is removed.
---
doc/src/sgml/ref/pg_waldump.sgml | 19 ++-
src/bin/pg_waldump/archive_waldump.c | 172 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 32 ++++-
src/bin/pg_waldump/pg_waldump.h | 3 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 209 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 15fb8d13199..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
@@ -387,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 4a95b47b4da..1479efe61f5 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
/*
* Check if the start segment number is zero; this indicates a request to read
* any WAL file.
@@ -57,6 +61,8 @@ typedef struct ArchivedWALFile
const char *fname; /* hash key: WAL segment name */
StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
int read_len; /* total bytes of a WAL read from archive */
} ArchivedWALFile;
@@ -84,6 +90,11 @@ static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo,
int WalSegSz);
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -106,7 +117,9 @@ static const astreamer_ops astreamer_waldump_ops = {
/*
* Initializes the tar archive reader, creates a hash table for WAL entries,
* checks for existing valid WAL segments in the archive file and retrieves the
- * segment size, and sets up filters for relevant entries.
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -199,6 +212,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
}
/*
@@ -365,6 +385,17 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
destroyStringInfo(entry->buf);
entry->buf = NULL;
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
/* Set cur_file to NULL if it matches the entry being ignored */
if (privateInfo->cur_file == entry)
privateInfo->cur_file = NULL;
@@ -376,12 +407,16 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file, which then populates the
* entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
int WalSegSz)
{
ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
@@ -395,28 +430,59 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
*/
while (1)
{
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Archived streamer is reading a non-WAL file or an irrelevant WAL
* file.
*/
- if (privateInfo->cur_file == NULL)
+ if (entry == NULL)
continue;
- entry = privateInfo->cur_file;
-
/* Found the required entry */
if (strcmp(fname, entry->fname) == 0)
return entry;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
- fname, entry->fname);
- exit(1);
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -454,7 +520,88 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory to hold spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold the extracted WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
@@ -538,6 +685,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
}
entry->buf = makeStringInfo();
+ entry->spilled = false;
entry->read_len = 0;
privateInfo->cur_file = entry;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a18c56a7322..4b438b53ead 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -478,10 +478,14 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return -1;
/*
- * If the target page is in a different segment, free the buffer space
- * occupied by the previous segment data. Since pg_waldump never requests
- * the same WAL bytes twice, moving to a new segment implies the previous
- * buffer's data and that segment will not be needed again.
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check for the next required WAL segment's physical existence
+ * in the temporary directory first before invoking the archive streamer.
*/
curSegNo = state->seg.ws_segno;
if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
@@ -497,6 +501,13 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
state->seg.ws_tli = private->timeline;
state->seg.ws_segno = nextSegNo;
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
/*
* If in pre-reading mode (prior to actual decoding), do not delete any
* entries that might be requested again once the decoding loop starts.
@@ -507,9 +518,20 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
free_archive_wal_entry(fname, private);
}
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
}
- /* Read the WAL page from the archive streamer */
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff,
WalSegSz);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 54d54a8a718..6c242b7fcbc 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -18,6 +18,9 @@
struct ArchivedWALFile;
struct ArchivedWAL_hash;
+/* Temporary directory */
+extern char *TmpWalSegDir;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 6f8ce319841..6960bd46ba4 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -312,7 +313,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
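
The spill behaviour described in the 0007 commit message (pick a base directory from TMPDIR or fall back to the archive's directory, create a waldump_tmp-XXXXXX directory, write out-of-order data there, remove it once read) can be sketched as below. This is a hedged, standalone approximation rather than the patch's code: make_spill_dir(), spill_chunk() and spill_roundtrip() are hypothetical names, error handling returns codes instead of calling pg_fatal(), and the segment file name is just an example.

```c
#define _XOPEN_SOURCE 700		/* for mkdtemp() */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_PATH_MAX 1024

/* Create <base>/waldump_tmp-XXXXXX, where base is $TMPDIR or fallback_dir. */
static int
make_spill_dir(const char *fallback_dir, char *out, size_t outlen)
{
	const char *base = getenv("TMPDIR");
	int			n;

	if (base == NULL)
		base = fallback_dir;
	n = snprintf(out, outlen, "%s/waldump_tmp-XXXXXX", base);
	if (n < 0 || (size_t) n >= outlen)
		return -1;
	return mkdtemp(out) != NULL ? 0 : -1;
}

/* Write one buffer; if fwrite() fails without setting errno, assume ENOSPC. */
static int
spill_chunk(FILE *fp, const char *data, size_t len)
{
	errno = 0;
	if (len > 0 && fwrite(data, len, 1, fp) != 1)
	{
		if (errno == 0)
			errno = ENOSPC;
		return -1;
	}
	return 0;
}

/* Spill a payload, read it back, then remove the file and the directory. */
static int
spill_roundtrip(const char *fallback_dir)
{
	static const char payload[] = "wal-bytes";
	const size_t paylen = sizeof(payload) - 1;
	char		dir[SKETCH_PATH_MAX];
	char		path[SKETCH_PATH_MAX + 64];
	char		back[32] = {0};
	FILE	   *fp;
	int			ok;

	if (make_spill_dir(fallback_dir, dir, sizeof(dir)) != 0)
		return -1;
	snprintf(path, sizeof(path), "%s/%s", dir, "000000010000000000000002");

	fp = fopen(path, "wb");
	if (fp == NULL)
	{
		rmdir(dir);
		return -1;
	}
	ok = (spill_chunk(fp, payload, paylen) == 0);
	if (fclose(fp) != 0)
		ok = 0;

	if (ok)
	{
		fp = fopen(path, "rb");
		ok = (fp != NULL &&
			  fread(back, 1, sizeof(back) - 1, fp) == paylen &&
			  memcmp(back, payload, paylen) == 0);
		if (fp != NULL)
			fclose(fp);
	}

	/* A spilled segment is removed as soon as it has been consumed. */
	unlink(path);
	rmdir(dir);
	return ok ? 0 : -1;
}
```

Calling spill_roundtrip(".") exercises the whole cycle; with TMPDIR set, the temporary directory is created there instead of in the fallback directory.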
[application/x-patch] v15-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch (1.7K, 9-v15-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch)
From 42c939b5d33b160d86ff01c21d61e5a68170b415 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v15 08/11] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..8cc204719ee 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v15-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch (5.9K, 10-v15-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch)
From dfa159e43527c0705b3b1b14303775c6b55b80f7 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v15 09/11] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you'll be able to specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
will also align it with the existing manifest-path switch naming.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8cc204719ee..34520546bc3 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
{NULL, 0, NULL, 0}
};
@@ -135,7 +135,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +221,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1188,7 +1188,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1198,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1366,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..8ad2234453d 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
[application/x-patch] v15-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch (11.5K, 11-v15-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch)
download | inline diff:
From 72cd114f16823c3faee5adebdb1605833c835743 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v15 10/11] pg_verifybackup: Enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 12 ++--
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 16 +++++
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 70 insertions(+), 42 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -264,9 +261,10 @@ PostgreSQL documentation
<term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 34520546bc3..935ab8fafa8 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -327,17 +333,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
^ permalink raw reply [nested|flat] 85+ messages in thread
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-04 21:50 Andrew Dunstan <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-04 21:50 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-03-04 We 7:52 AM, Amul Sul wrote:
> On Wed, Mar 4, 2026 at 6:07 AM Andrew Dunstan <[email protected]> wrote:
>>
>> On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
>>> On Wed, Feb 18, 2026 at 12:28 PM Amul Sul <[email protected]> wrote:
>>>> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul <[email protected]> wrote:
>>>>> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul <[email protected]> wrote:
>>>>>> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas <[email protected]> wrote:
>>>>>>> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul <[email protected]> wrote:
>>>>>>>> In the attached version, I am using the WAL segment name as the hash
>>>>>>>> key, which is much more straightforward. I have rewritten
>>>>>>>> read_archive_wal_page(), and it looks much cleaner than before. The
>>>>>>>> logic to discard irrelevant WAL files is still within
>>>>>>>> get_archive_wal_entry. I added an explanation for setting cur_wal to
>>>>>>>> NULL, which is now handled in the separate function I mentioned
>>>>>>>> previously.
>>>>>>>>
>>>>>>>> Kindly have a look at the attached version; let me know if you are
>>>>>>>> still not happy with the current approach for filtering/discarding
>>>>>>>> irrelevant WAL segments. It isn't much different from the previous
>>>>>>>> version, but I have tried to keep it in a separate routine for better
>>>>>>>> code readability, with comments to make it easier to understand. I
>>>>>>>> also added a comment for ArchivedWALFile.
>>>>>>> I feel like the division of labor between get_archive_wal_entry() and
>>>>>>> read_archive_wal_page() is odd. I noticed this in the last version,
>>>>>>> too, and it still seems to be the case. get_archive_wal_entry() first
>>>>>>> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
>>>>>>> If it doesn't, it loops until an entry for the requested file shows up
>>>>>>> and then returns it. Then control returns to read_archive_wal_page()
>>>>>>> which loops some more until we have all the data we need for the
>>>>>>> requested file. But it seems odd to me to have two separate loops
>>>>>>> here. I think that the first loop is going to call read_archive_file()
>>>>>>> until we find the beginning of the file that we care about and then
>>>>>>> the second one is going to call read_archive_file() some more until we
>>>>>>> have read enough of it to satisfy the request. It feels odd to me to
>>>>>>> do it that way, as if we told somebody to first wait until 9 o'clock
>>>>>>> and then wait another 30 minutes, instead of just telling them to wait
>>>>>>> until 9:30. I realize it's not quite the same thing, because apart
>>>>>>> from calling read_archive_file(), the two loops do different things,
>>>>>>> but I still think it looks odd.
>>>>>>>
>>>>>>> + /*
>>>>>>> + * Ignore if the timeline is different or the current segment is not
>>>>>>> + * the desired one.
>>>>>>> + */
>>>>>>> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
>>>>>>> + if (privateInfo->timeline != curSegTimeline ||
>>>>>>> + privateInfo->startSegNo > curSegNo ||
>>>>>>> + privateInfo->endSegNo < curSegNo ||
>>>>>>> + segno > curSegNo)
>>>>>>> + {
>>>>>>> + free_archive_wal_entry(entry->fname, privateInfo);
>>>>>>> + continue;
>>>>>>> + }
>>>>>>>
>>>>>>> The comment doesn't match the code. If it did, the test would be
>>>>>>> (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
>>>>>>> instead the segno test is > rather than !=, and the checks against
>>>>>>> startSegNo and endSegNo aren't explained at all. I think I understand
>>>>>>> why the segno test uses > rather than !=, but it's the point of the
>>>>>>> comment to explain things like that, rather than leaving the reader to
>>>>>>> guess. And I don't know why we also need to test startSegNo and
>>>>>>> endSegNo.
>>>>>>>
>>>>>>> I also wonder what the point is of doing XLogFromFileName() on the
>>>>>>> fname provided by the caller and then again on entry->fname. Couldn't
>>>>>>> you just compare the strings?
>>>>>>>
>>>>>>> Again, the division of labor is really odd here. It's the job of
>>>>>>> astreamer_waldump_content() to skip things that aren't WAL files at
>>>>>>> all, but it's the job of get_archive_wal_entry() to skip things that
>>>>>>> are WAL files but not the one we want. I disagree with putting those
>>>>>>> checks in completely separate parts of the code.
>>>>>>>
>>>>>> Keeping the timeline and segment start-end range checks inside the
>>>>>> archive streamer creates a circular dependency that cannot be resolved
>>>>>> without a 'dirty hack'. We must read the first available WAL file page
>>>>>> to determine the wal_segment_size before we can calculate the target
>>>>>> segment range. Moving the checks inside the streamer would make it
>>>>>> impossible to process that initial file, because the necessary filtering
>>>>>> parameters would still be unknown, so the checks would somehow have to
>>>>>> be skipped for the first read. And what if we later realize that the
>>>>>> first WAL file, which was streamed without that check, is irrelevant
>>>>>> and doesn't fall within the start-end segment range?
>>>>>>
>>>>> Please have a look at the attached version, specifically patch 0005.
>>>>> I have moved the WAL file filtering check from
>>>>> get_archive_wal_entry() into astreamer_waldump_content(). This check is
>>>>> skipped during the initial read in init_archive_reader(), which instead
>>>>> performs it explicitly once it determines the WAL segment size and the
>>>>> start/end segments.
>>>>>
>>>>> To access the WAL segment size inside astreamer_waldump_content(), I
>>>>> have moved the WAL segment size variable into the XLogDumpPrivate
>>>>> structure in the separate 0004 patch.
>>>> Attached is an updated version including the aforesaid changes. It
>>>> includes a new refactoring patch (0001) that moves the logic for
>>>> identifying tar archives and their compression types from
>>>> pg_basebackup and pg_verifybackup into a separate, reusable function,
>>>> per a suggestion from Euler [1]. Additionally, I have added a test
>>>> for the contrecord decoding to the main patch (now 0006).
>>>>
>>>> [1] http://postgr.es/m/[email protected]
>>>>
>>> Rebased against the latest master, fixed typos in code comments, and
>>> replaced palloc0 with palloc0_object.
>>>
>> Hi Amul.
>>
>>
>> I think this looks in pretty good shape.
>>
> Thank you very much for looking at the patch.
>
>> Attached are patches for a few things I think could be fixed. They are
>> mostly self-explanatory. The TAP test fix is the only sane way I could
>> come up with stopping the skip code you had from reporting a wildly
>> inaccurate number of tests skipped. The sane way to do this from a
>> Test::More perspective is a subtest, but unfortunately meson does not
>> like subtest output, which is why we don't use it elsewhere, so the only
>> way I could come up with was to split this out into a separate test. Of
>> course, we might just say we don't care about the misreport, in which
>> case we could just live with things as they are.
>>
> I agree that the reported skip number was incorrect, and I have
> corrected it in the attached patch. I haven't applied your patch for
> the TAP test improvements yet because I wanted to double-check it
> first with you; the patch as it stood created duplicate tests already
> present in 001_basic.pl. To avoid this duplication, I have added a
> loop that performs tests for both plain and tar WAL directory inputs,
> similar to the approach used in pg_verifybackup for different
> compression type tests (e.g., 008_untar.pl, 010_client_untar.pl). I
> don't have any objection to doing so if you feel the duplication is
> acceptable, but I feel that using a loop for the tests in 001_basic.pl
> is a bit tidier. Let me know your thoughts.
I will take a look.
>
> I have applied all your other patches but skipped the changes to
> pg_verifybackup.c from cf5955-fixes.patch.no-cfbot, as they seem
> unrelated or perhaps I have misunderstood them.
<brown-paper-bag> That's what I get for using a poorly written tool.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 85+ messages in thread
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-09 12:26 Amul Sul <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-09 12:26 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Mar 7, 2026 at 3:51 AM Andrew Dunstan <[email protected]> wrote:
>
>
> On 2026-03-04 We 4:50 PM, Andrew Dunstan wrote:
> >
> >
> > On 2026-03-04 We 7:52 AM, Amul Sul wrote:
> >> I agree that the reported skip number was incorrect, and I have
> >> corrected it in the attached patch. I haven't applied your patch for
> >> the TAP test improvements yet because I wanted to double-check it
> >> first with you; the patch as it stood created duplicate tests already
> >> present in 001_basic.pl. To avoid this duplication, I have added a
> >> loop that performs tests for both plain and tar WAL directory inputs,
> >> similar to the approach used in pg_verifybackup for different
> >> compression type tests (e.g., 008_untar.pl, 010_client_untar.pl). I
> >> don't have any objection to doing so if you feel the duplication is
> >> acceptable, but I feel that using a loop for the tests in 001_basic.pl
> >> is a bit tidier. Let me know your thoughts.
> >
> >
> > I will take a look.
> >
>
> I'm OK with doing it this way. It's just a bit fragile - if we add a
> test, the number will be wrong. But maybe it's not worth worrying about.
>
> Everything else looks fairly good. The attached fixes a few relatively
> minor issues in v15. The main one is that it stops allocating/freeing a
> buffer every time we call read_archive_file() and instead adds a
> reusable buffer. It also adds back wal-directory as an undocumented
> alias of wal-path, to avoid breaking legacy scripts unnecessarily, and
> adds constness to the fname argument of pg_tar_compress_algorithm, as
> well as fixing some indentation and grammar issues.
>
> All in all I think we're in good shape.
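To make the wal-directory alias concrete, here is a minimal sketch. It assumes the system getopt_long() and a hypothetical helper, sketch_parse_wal_path(); neither the helper nor the table layout is taken from the patch. Both long option names simply map to the same short option:

```c
#include <getopt.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): return the WAL path given on
 * the command line, or NULL if neither spelling of the option was used.
 */
static const char *
sketch_parse_wal_path(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"wal-path", required_argument, NULL, 'w'},
		{"wal-directory", required_argument, NULL, 'w'},	/* legacy alias */
		{NULL, 0, NULL, 0}
	};
	const char *wal_path = NULL;
	int			c;

	optind = 1;					/* restart scanning at argv[1] */
	while ((c = getopt_long(argc, argv, "w:", long_options, NULL)) != -1)
	{
		if (c == 'w')
			wal_path = optarg;
	}
	return wal_path;
}
```

Because the alias shares the 'w' value, the rest of the program cannot tell which spelling was used, and leaving it out of the usage text is what keeps it undocumented.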
Thanks for the review. I have incorporated your suggested changes,
with one exception: I have skipped the buffer reallocation code in
read_archive_file(). Since we only issue reads of two specific sizes --
XLOG_BLCKSZ and READ_CHUNK_SIZE (128 KB, defined in
archive_waldump.c) -- dynamic reallocation seems unnecessary. Instead,
I moved the allocation to init_archive_reader(), which now allocates
a buffer of READ_CHUNK_SIZE bytes. I also added an assertion in
read_archive_file() to ensure that no read request exceeds this
allocated capacity.
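A rough sketch of that allocation scheme, in plain C rather than the patch's code. The struct and function names below are hypothetical, malloc() and fread() stand in for the frontend allocator and the archive streamer, and XLOG_BLCKSZ is hard-coded to its default of 8192:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define XLOG_BLCKSZ		8192			/* default WAL block size (assumed here) */
#define READ_CHUNK_SIZE	(128 * 1024)	/* 128 KB, the larger of the two read sizes */

/* Hypothetical reader state; the real structure in the patch differs. */
typedef struct ArchiveReaderSketch
{
	FILE	   *fp;				/* archive being read */
	char	   *buf;			/* reusable read buffer */
	size_t		buf_size;		/* capacity of buf in bytes */
} ArchiveReaderSketch;

/* Allocate the buffer once, sized for the largest request ever issued. */
static void
sketch_init_reader(ArchiveReaderSketch *reader, FILE *fp)
{
	reader->fp = fp;
	reader->buf_size = READ_CHUNK_SIZE;
	reader->buf = malloc(reader->buf_size);
	if (reader->buf == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
}

/* Read up to 'count' bytes into the preallocated buffer; no per-call malloc. */
static size_t
sketch_read_file(ArchiveReaderSketch *reader, size_t count)
{
	/* Callers only ask for XLOG_BLCKSZ or READ_CHUNK_SIZE bytes. */
	assert(count <= reader->buf_size);
	return fread(reader->buf, 1, count, reader->fp);
}

/* Release the buffer when the reader is torn down. */
static void
sketch_free_reader(ArchiveReaderSketch *reader)
{
	free(reader->buf);
	reader->buf = NULL;
	reader->buf_size = 0;
}
```

The assertion documents the invariant instead of handling growth at run time, which is the trade-off against a reallocating buffer.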
Kindly have a look at the attached version and let me know your thoughts.
Regards,
Amul
Attachments:
[application/x-patch] v16-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch (6.7K, 2-v16-0001-Refactor-Move-tar-archive-parsing-into-a-common-.patch)
download | inline diff:
From 5f5be6940651f89ea843a6ee98eeab0087fab8fa Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v16 01/11] Refactor: Move tar archive parsing into a common
location.
pg_basebackup and pg_verifybackup both require logic to identify tar
files and determine their compression types. Similar functionality
will be needed for pg_waldump when it gets the capability to decode
WAL files from tar archives. Moving this logic to a common location
allows for reuse and prevents code duplication.
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..fb27501d297 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
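The suffix matching that 0001 centralizes can also be written as a table lookup. The following is a standalone sketch under stated assumptions, not the patch's implementation: tar_compression, has_suffix, and classify_tar_name are hypothetical names, and the enum stands in for pg_compress_algorithm. The set of recognized suffixes mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm. */
typedef enum
{
	TAR_NONE,
	TAR_GZIP,
	TAR_LZ4,
	TAR_ZSTD
} tar_compression;

/* True if "s" ends with "suffix". */
static bool
has_suffix(const char *s, const char *suffix)
{
	size_t		slen = strlen(s);
	size_t		suflen = strlen(suffix);

	return slen >= suflen && strcmp(s + slen - suflen, suffix) == 0;
}

/*
 * Classify a file name by its tar-related suffix.  Returns true and sets
 * *algo when the suffix is recognized; returns false otherwise and leaves
 * *algo untouched.
 */
static bool
classify_tar_name(const char *fname, tar_compression *algo)
{
	static const struct
	{
		const char *suffix;
		tar_compression algo;
	}			table[] = {
		{".tar", TAR_NONE},
		{".tgz", TAR_GZIP},
		{".tar.gz", TAR_GZIP},
		{".tar.lz4", TAR_LZ4},
		{".tar.zst", TAR_ZSTD},
	};

	assert(fname != NULL && algo != NULL);

	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
	{
		if (has_suffix(fname, table[i].suffix))
		{
			*algo = table[i].algo;
			return true;
		}
	}
	return false;
}
```

As in the patch, a recognized name means "this is a tar archive", and the output parameter then tells the caller whether a decompressor is needed.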
[application/x-patch] v16-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch (2.2K, 3-v16-0002-Refactor-pg_waldump-Move-some-declarations-to-ne.patch)
From a9a044df26e1ed14fe9eeabe5a2479e5456fa7ff Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v16 02/11] Refactor: pg_waldump: Move some declarations to new
pg_waldump.h
This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so
they can be used in both files.
---
src/bin/pg_waldump/pg_waldump.c | 9 +--------
src/bin/pg_waldump/pg_waldump.h | 25 +++++++++++++++++++++++++
2 files changed, 26 insertions(+), 8 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..4b7411a6498 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..64a9109229e
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,25 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v16-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch (2.4K, 4-v16-0003-Refactor-pg_waldump-Separate-logic-used-to-calcu.patch)
From 6dbd37969b3972b35ce7542cf893cfd2c38ec137 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:38:16 +0530
Subject: [PATCH v16 03/11] Refactor: pg_waldump: Separate logic used to
calculate the required read size.
This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
src/bin/pg_waldump/pg_waldump.c | 43 +++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4b7411a6498..958a71a01cf 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
--
2.47.1
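The three cases handled by required_read_len() in 0003 can be checked with concrete numbers. This is a standalone sketch, not the patch code: ReadWindow, WAL_PAGE_BYTES, and bytes_to_read are hypothetical names, plain 64-bit integers stand in for XLogRecPtr, the page size is assumed to be the default 8192 bytes, and an end pointer of 0 is taken to mean "no end point given".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size; stands in for XLOG_BLCKSZ at its default value. */
#define WAL_PAGE_BYTES 8192

typedef struct ReadWindow
{
	uint64_t	endptr;			/* 0 means no end point was requested */
	bool		endptr_reached;
} ReadWindow;

/*
 * Decide how many bytes of the page starting at page_start to read:
 *   - the whole page, if it lies entirely before the end point;
 *   - a partial page, if the end point falls inside the page but the
 *     requested req_len bytes are still available;
 *   - -1 (and flag endptr_reached), if even req_len bytes would cross
 *     the end point.
 */
static int
bytes_to_read(ReadWindow *w, uint64_t page_start, int req_len)
{
	assert(req_len >= 0 && req_len <= WAL_PAGE_BYTES);

	if (w->endptr == 0)
		return WAL_PAGE_BYTES;

	if (page_start + WAL_PAGE_BYTES <= w->endptr)
		return WAL_PAGE_BYTES;

	if (page_start + (uint64_t) req_len <= w->endptr)
		return (int) (w->endptr - page_start);

	w->endptr_reached = true;
	return -1;
}
```

With an end point of 20000, the page at 8192 is read in full, the page at 16384 yields 3616 bytes when only 100 are required, and a request for 5000 bytes on that same page returns -1.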
[application/x-patch] v16-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch (6.6K, 5-v16-0004-Refactor-pg_waldump-Restructure-TAP-tests.patch)
From 805bb1a6dac9f26e5d145fa7eb84d8cb3478ec85 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v16 04/11] Refactor: pg_waldump: Restructure TAP tests.
Restructured tests that do not have a WAL file argument to run within
a loop, facilitating their re-execution for decoding WAL from tar
archives.
== NOTE ==
This is not intended to be committed separately. It can be merged
with the next patch, which is the main patch implementing this
feature.
---
src/bin/pg_waldump/t/001_basic.pl | 140 +++++++++++++++++-------------
1 file changed, 79 insertions(+), 61 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..f12ba52cbfc 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +205,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +224,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +239,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -288,38 +261,83 @@ sub test_pg_waldump
my @lines;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir
+ });
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ SKIP:
+ {
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ }
+}
done_testing();
--
2.47.1
[application/x-patch] v16-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch (5.1K, 6-v16-0005-Refactor-pg_waldump-Move-WAL-segment-size-to-XLo.patch)
From f611a208cea878c44c2f983877c949b051443602 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 4 Feb 2026 15:31:51 +0530
Subject: [PATCH v16 05/11] Refactor: pg_waldump: Move WAL segment size to
XLogDumpPrivate.
Relocate the WAL segment size variable to the XLogDumpPrivate
structure and rename it to segsize for consistency. This change is
required to make the segment size accessible to the archive streamer
code, where passing it as a function argument is not feasible.
---
src/bin/pg_waldump/pg_waldump.c | 26 +++++++++++++-------------
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 958a71a01cf..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -811,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -865,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1138,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1159,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1175,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1190,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1200,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1213,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1234,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 64a9109229e..013b051506f 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -17,6 +17,7 @@
typedef struct XLogDumpPrivate
{
TimeLineID timeline;
+ int segsize;
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
--
2.47.1
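The segment size matters to every caller because it drives the mapping from an LSN to a segment number and then to a WAL file name. The following standalone sketch shows that arithmetic with plain integers; lsn_to_segno and wal_file_name are hypothetical helpers written for illustration, not the XLByteToSeg and XLogFileName macros themselves, though they follow the same 24-hex-digit naming scheme.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Segment number containing the given LSN. */
static uint64_t
lsn_to_segno(uint64_t lsn, uint32_t segsize)
{
	assert(segsize > 0);
	return lsn / segsize;
}

/*
 * Build the 24-character WAL file name: timeline, then the segment number
 * split into "high" and "low" parts, where the number of segments per
 * 4 GB "xlog id" depends on the segment size.
 */
static void
wal_file_name(char *out, size_t outlen, uint32_t tli, uint64_t segno,
			  uint32_t segsize)
{
	uint64_t	segs_per_id;

	assert(segsize > 0);
	segs_per_id = UINT64_C(0x100000000) / segsize;

	snprintf(out, outlen, "%08X%08X%08X",
			 (unsigned int) tli,
			 (unsigned int) (segno / segs_per_id),
			 (unsigned int) (segno % segs_per_id));
}
```

With 16 MB segments, LSN 0/1000000 falls in segment 1 on timeline 1, giving 000000010000000000000001. A different segment size changes both the segment number and the split point in the name, which is why the size must be known before any name can be derived.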
[application/x-patch] v16-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch (42.5K, 7-v16-0006-pg_waldump-Add-support-for-archived-WAL-decoding.patch)
From fcf25ba0adf9ed1976f1b205c11a08866defb498 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 10 Feb 2026 11:42:36 +0530
Subject: [PATCH v16 06/11] pg_waldump: Add support for archived WAL decoding.
pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.
Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
doc/src/sgml/ref/pg_waldump.sgml | 8 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 653 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 257 ++++++++---
src/bin/pg_waldump/pg_waldump.h | 45 ++
src/bin/pg_waldump/t/001_basic.pl | 105 ++++-
src/tools/pgindent/typedefs.list | 3 +
8 files changed, 1015 insertions(+), 67 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..15fb8d13199 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided, its WAL segment files must be in
+ sequential order; otherwise, an error will be reported.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..0936ffc0a75
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,653 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as the reader moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+
+ int read_len; /* total bytes of this segment read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* Before that we must parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* Before that we must decompress, if archive is compressed. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring the full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data; raise an error if it's not the current
+ * segment being read by the archive streamer or if reading of the
+ * archived file has finished.
+ */
+ if (privateInfo->cur_file != entry ||
+ read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("could not read file \"%s\" from archive \"%s\": read %lld of %lld",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean since we either successfully read the WAL page or raise an
+ * error, but the caller expects this value to be returned. The routine
+ * that reads WAL pages from the physical WAL file follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Clears the buffer of a WAL entry that is being ignored. This frees up memory
+ * and prevents the accumulation of irrelevant WAL data. Additionally,
+ * conditionally setting cur_file within privateInfo to NULL ensures the
+ * archive streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /* Fetch more data */
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant
+ * WAL file.
+ */
+ if (privateInfo->cur_file == NULL)
+ continue;
+
+ entry = privateInfo->cur_file;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /* WAL segments must be archived in order */
+ pg_log_error("WAL files are not archived in sequential order");
+ pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
+ fname, entry->fname);
+ exit(1);
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..f0b8116ff14 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,81 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer space
+ * occupied by the previous segment's data. Once the decoding loop has
+ * started, pg_waldump never returns to an earlier segment, so moving to a
+ * new segment implies that the previous segment will not be needed again.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+ }
+
+ /* Read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +852,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +885,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +945,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1024,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,12 +1188,21 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
@@ -1128,6 +1218,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1239,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1212,12 +1319,36 @@ main(int argc, char **argv)
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * pg_waldump's current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1376,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1460,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..62054bc74c0 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,11 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +26,46 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+ Size archive_read_buf_size;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f12ba52cbfc..6f8ce319841 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+my $tar = $ENV{TAR};
+
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
program_options_handling_ok('pg_waldump');
@@ -162,6 +165,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate a continuation record
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -259,11 +298,50 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive with its member files in sorted order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
+
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
+
+ @files = sort @files;
+
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
+
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
my @scenarios = (
{
- 'path' => $node->data_dir
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
});
for my $scenario (@scenarios)
@@ -272,6 +350,19 @@ for my $scenario (@scenarios)
SKIP:
{
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
+
command_fails_like(
[ 'pg_waldump', '--path' => $path ],
qr/error: no start WAL location given/,
@@ -305,9 +396,14 @@ for my $scenario (@scenarios)
test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
- @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
@lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
is(@lines, 6, 'limit option observed');
@@ -337,6 +433,9 @@ for my $scenario (@scenarios)
'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
'--block' => 1);
is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
}
}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3250564d4ff..d849293e6fa 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -145,6 +145,8 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3514,6 +3516,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
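
The buffer-window arithmetic that read_archive_wal_page() performs in the patch above can be sketched in isolation. This is a simplified illustration under stated assumptions, not code from the patch: SegWindow and window_copy are hypothetical names, and plain byte offsets within one segment stand in for XLogRecPtr values. The idea is that the buffer holds only the most recently streamed bytes of a segment, so the window it covers is [read_len - len, read_len).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-segment buffer state. */
typedef struct SegWindow
{
	const char *data;		/* buffered bytes (tail of what was streamed) */
	size_t		len;		/* number of bytes currently buffered */
	uint64_t	read_len;	/* total bytes of the segment streamed so far */
} SegWindow;

/*
 * Copy up to 'want' bytes starting at segment offset 'off' into 'dst'.
 * Returns the number of bytes copied; 0 means 'off' lies outside the
 * buffered window and the caller would have to fetch more data.
 */
static size_t
window_copy(const SegWindow *w, uint64_t off, char *dst, size_t want)
{
	uint64_t	end = w->read_len;		/* one past the last buffered byte */
	uint64_t	start = end - w->len;	/* first buffered byte */
	size_t		pos;
	size_t		n;

	if (w->len == 0 || off < start || off >= end)
		return 0;

	pos = (size_t) (off - start);
	assert(pos < w->len);	/* follows from start <= off < end */

	n = (want < w->len - pos) ? want : w->len - pos;
	memcpy(dst, w->data + pos, n);
	return n;
}
```

A caller would loop, advancing the offset by the returned count and fetching more data whenever 0 is returned, which mirrors the if/else structure of the while loop in the patch.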
[application/x-patch] v16-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch (13.8K, 8-v16-0007-pg_waldump-Remove-the-restriction-on-the-order-o.patch)
From bf3e02016deca04eb7948dae32d2c83688e0f4a9 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 27 Jan 2026 15:38:34 +0530
Subject: [PATCH v16 07/11] pg_waldump: Remove the restriction on the order of
archived WAL files.
With the previous patch, pg_waldump would stop decoding if WAL files
were not in the required sequence. With this patch, decoding continues:
any WAL file that is out of order is written to a temporary location,
from which it is read later. Once a temporary file has been read, it is
removed.
---
doc/src/sgml/ref/pg_waldump.sgml | 19 ++-
src/bin/pg_waldump/archive_waldump.c | 172 +++++++++++++++++++++++++--
src/bin/pg_waldump/pg_waldump.c | 32 ++++-
src/bin/pg_waldump/pg_waldump.h | 3 +
src/bin/pg_waldump/t/001_basic.pl | 3 +-
5 files changed, 209 insertions(+), 20 deletions(-)
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index 15fb8d13199..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
of <envar>PGDATA</envar>.
</para>
<para>
- If a tar archive is provided, its WAL segment files must be in
- sequential order; otherwise, an error will be reported.
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
</para>
</listitem>
</varlistentry>
@@ -387,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 0936ffc0a75..547a5154cb6 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -17,6 +17,7 @@
#include <unistd.h>
#include "access/xlog_internal.h"
+#include "common/file_perm.h"
#include "common/hashfn.h"
#include "common/logging.h"
#include "fe_utils/simple_list.h"
@@ -27,6 +28,9 @@
*/
#define READ_CHUNK_SIZE (128 * 1024)
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
/*
* Check if the start segment number is zero; this indicates a request to read
* any WAL file.
@@ -57,6 +61,8 @@ typedef struct ArchivedWALFile
const char *fname; /* hash key: WAL segment name */
StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
int read_len; /* total bytes of a WAL read from archive */
} ArchivedWALFile;
@@ -84,6 +90,11 @@ static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo,
int WalSegSz);
static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
static void astreamer_waldump_content(astreamer *streamer,
@@ -106,7 +117,9 @@ static const astreamer_ops astreamer_waldump_ops = {
/*
* Initializes the tar archive reader, creates a hash table for WAL entries,
* checks for existing valid WAL segments in the archive file and retrieves the
- * segment size, and sets up filters for relevant entries.
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
*/
void
init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
@@ -206,6 +219,13 @@ init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
privateInfo->start_segno > segno ||
privateInfo->end_segno < segno)
free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
}
/*
@@ -379,6 +399,17 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
destroyStringInfo(entry->buf);
entry->buf = NULL;
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
/* Set cur_file to NULL if it matches the entry being ignored */
if (privateInfo->cur_file == entry)
privateInfo->cur_file = NULL;
@@ -390,12 +421,16 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* Returns the archived WAL entry from the hash table if it exists. Otherwise,
* it invokes the routine to read the archived file, which then populates the
* entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
int WalSegSz)
{
ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
/* Search hash table */
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
@@ -409,28 +444,59 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
*/
while (1)
{
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
/* Fetch more data */
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
- break; /* archive file ended */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
/*
* Archived streamer is reading a non-WAL file or an irrelevant WAL
* file.
*/
- if (privateInfo->cur_file == NULL)
+ if (entry == NULL)
continue;
- entry = privateInfo->cur_file;
-
/* Found the required entry */
if (strcmp(fname, entry->fname) == 0)
return entry;
- /* WAL segments must be archived in order */
- pg_log_error("WAL files are not archived in sequential order");
- pg_log_error_detail("Expecting segment \"%s\" but found \"%s\".",
- fname, entry->fname);
- exit(1);
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
}
/* Requested WAL segment not found */
@@ -468,7 +534,88 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
}
/*
- * Create an astreamer that can read WAL from a tar file.
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to extract WAL file
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
*/
static astreamer *
astreamer_waldump_new(XLogDumpPrivate *privateInfo)
@@ -552,6 +699,7 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
}
entry->buf = makeStringInfo();
+ entry->spilled = false;
entry->read_len = 0;
privateInfo->cur_file = entry;
}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f0b8116ff14..e970b007883 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -478,10 +478,14 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return -1;
/*
- * If the target page is in a different segment, free the buffer space
- * occupied by the previous segment data. Since pg_waldump never requests
- * the same WAL bytes twice, moving to a new segment implies the previous
- * buffer's data and that segment will not be needed again.
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check for the next required WAL segment's physical existence
+ * in the temporary directory first before invoking the archive streamer.
*/
curSegNo = state->seg.ws_segno;
if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
@@ -497,6 +501,13 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
state->seg.ws_tli = private->timeline;
state->seg.ws_segno = nextSegNo;
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
/*
* If in pre-reading mode (prior to actual decoding), do not delete
* any entries that might be requested again once the decoding loop
@@ -508,9 +519,20 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
free_archive_wal_entry(fname, private);
}
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
}
- /* Read the WAL page from the archive streamer */
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
return read_archive_wal_page(private, targetPagePtr, count, readBuff,
WalSegSz);
}
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 62054bc74c0..1097390d575 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -18,6 +18,9 @@
struct ArchivedWALFile;
struct ArchivedWAL_hash;
+/* Temporary directory */
+extern char *TmpWalSegDir;
+
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
{
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 6f8ce319841..6960bd46ba4 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,7 @@ use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
my $tar = $ENV{TAR};
@@ -312,7 +313,7 @@ sub generate_archive
}
closedir $dh;
- @files = sort @files;
+ @files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
--
2.47.1
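
The spill mechanism in the patch above (create a private directory under TMPDIR or next to the archive, then append buffered WAL bytes to a file in it) can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions, not the patch's code: make_spill_dir() and spill_write() are hypothetical names, plain malloc() and return codes stand in for psprintf() and pg_fatal(), and treating an empty TMPDIR as unset is an extra check the patch does not make.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a private spill directory under $TMPDIR, or
 * under fallback_dir when TMPDIR is unset (or empty, which is an assumption
 * of this sketch). Returns a malloc'd path, or NULL with errno set.
 */
static char *
make_spill_dir(const char *fallback_dir)
{
    const char *base = getenv("TMPDIR");
    size_t      len;
    char       *path;

    if (base == NULL || base[0] == '\0')
        base = fallback_dir;

    /* sizeof of the literal already counts the terminating NUL */
    len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
    path = malloc(len);
    if (path == NULL)
        return NULL;
    snprintf(path, len, "%s/waldump_tmp-XXXXXX", base);

    /* mkdtemp() replaces the XXXXXX suffix and creates the directory */
    if (mkdtemp(path) == NULL)
    {
        free(path);
        return NULL;
    }
    return path;
}

/*
 * Hypothetical helper: append len bytes to an already open spill file.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int
spill_write(FILE *fp, const char *data, size_t len)
{
    assert(fp != NULL);

    if (len == 0)
        return 0;               /* nothing buffered, nothing to do */

    errno = 0;
    if (fwrite(data, len, 1, fp) != 1)
    {
        /* if the write did not set errno, assume the disk is full */
        if (errno == 0)
            errno = ENOSPC;
        return -1;
    }
    return 0;
}
```

A caller would create the directory once, open one file per out-of-order segment inside it, call spill_write() for each chunk the streamer delivers, and remove the file once the segment has been decoded.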
[application/x-patch] v16-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch (1.7K, 9-v16-0008-pg_verifybackup-Delay-default-WAL-directory-prep.patch)
From fae022329eaf4c7a067dc96374017fcd2453c812 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v16 08/11] pg_verifybackup: Delay default WAL directory
preparation.
We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..8cc204719ee 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /* By default, look for the WAL in the backup directory, too. */
+ if (wal_directory == NULL)
+ wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
--
2.47.1
[application/x-patch] v16-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch (5.9K, 10-v16-0009-pg_verifybackup-Rename-the-wal-directory-switch-.patch)
From 2a0ca4af4197e13f106cbf3bfa35600db2db3ff9 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:32:14 +0530
Subject: [PATCH v16 09/11] pg_verifybackup: Rename the wal-directory switch to
wal-path
With the previous patches, pg_waldump can now decode WAL directly from
tar files. This means you'll be able to specify a tar archive path
instead of a traditional WAL directory.
To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
will also align it with the existing manifest-path switch naming.
== NOTE ==
The corresponding PO files require updating due to this change.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 2 +-
src/bin/pg_verifybackup/pg_verifybackup.c | 23 ++++++++++++-----------
src/bin/pg_verifybackup/t/007_wal.pl | 4 ++--
3 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 8cc204719ee..682c365431f 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +126,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +136,7 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +222,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -365,15 +366,15 @@ main(int argc, char **argv)
verify_backup_checksums(&context);
/* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+ if (wal_path == NULL)
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -1188,7 +1189,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1199,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1367,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..8ad2234453d 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
--
2.47.1
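
The backward-compatible rename in the patch above relies on two long options sharing one short-option value, so the old spelling keeps working. A minimal standalone sketch of that getopt_long() pattern follows; parse_wal_path() is a hypothetical wrapper written for this illustration, and resetting optind to 1 is only there so the function can be called more than once.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: return the argument of -w, --wal-path, or the
 * deprecated --wal-directory, or NULL if none of them was given.
 */
static const char *
parse_wal_path(int argc, char **argv)
{
    static const struct option long_options[] = {
        {"wal-path", required_argument, NULL, 'w'},
        {"wal-directory", required_argument, NULL, 'w'},    /* deprecated alias */
        {NULL, 0, NULL, 0}
    };
    const char *wal_path = NULL;
    int         c;

    assert(argc >= 1 && argv != NULL);

    optind = 1;                 /* restart scanning on every call */
    while ((c = getopt_long(argc, argv, "w:", long_options, NULL)) != -1)
    {
        /* both long spellings and -w arrive here as 'w' */
        if (c == 'w')
            wal_path = optarg;
    }
    return wal_path;
}
```

Because both table entries return 'w', the option-handling switch needs no extra case for the old name; only the usage text and documentation advertise the new one.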
[application/x-patch] v16-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch (11.5K, 11-v16-0010-pg_verifybackup-Enabled-WAL-parsing-for-tar-form.patch)
From ddf838dcfb1376d0fe76c0325df78f4037a948b3 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 25 Nov 2025 17:34:26 +0530
Subject: [PATCH v16 10/11] pg_verifybackup: Enabled WAL parsing for tar-format
backup
Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
doc/src/sgml/ref/pg_verifybackup.sgml | 12 ++--
src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 --
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 16 +++++
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 70 insertions(+), 42 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -264,9 +261,10 @@ PostgreSQL documentation
<term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 682c365431f..db79dd39103 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -137,6 +141,8 @@ main(int argc, char **argv)
bool no_parse_wal = false;
bool quiet = false;
char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -328,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -347,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -365,9 +360,28 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
- /* By default, look for the WAL in the backup directory, too. */
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
if (wal_path == NULL)
- wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
/*
* Try to parse the required ranges of WAL records, unless we were told
@@ -788,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -817,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -876,11 +892,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -919,9 +937,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 8ad2234453d..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-18 11:45 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-18 11:45 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 11, 2026 at 10:38 PM Andrew Dunstan <[email protected]> wrote:
>
>
> On 2026-03-09 Mo 8:26 AM, Amul Sul wrote:
>
> On Sat, Mar 7, 2026 at 3:51 AM Andrew Dunstan <[email protected]> wrote:
>
> On 2026-03-04 We 4:50 PM, Andrew Dunstan wrote:
>
> On 2026-03-04 We 7:52 AM, Amul Sul wrote:
>
> On Wed, Mar 4, 2026 at 6:07 AM Andrew Dunstan<[email protected]> wrote:
>
> On 2026-03-02 Mo 8:00 AM, Amul Sul wrote:
>
> On Wed, Feb 18, 2026 at 12:28 PM Amul Sul<[email protected]> wrote:
>
> On Tue, Feb 10, 2026 at 3:06 PM Amul Sul<[email protected]> wrote:
>
> On Wed, Feb 4, 2026 at 6:39 PM Amul Sul<[email protected]> wrote:
>
> On Wed, Jan 28, 2026 at 2:41 AM Robert Haas<[email protected]> wrote:
>
> On Tue, Jan 27, 2026 at 7:07 AM Amul Sul<[email protected]> wrote:
>
> In the attached version, I am using the WAL segment name as the hash
> key, which is much more straightforward. I have rewritten
> read_archive_wal_page(), and it looks much cleaner than before. The
> logic to discard irrelevant WAL files is still within
> get_archive_wal_entry. I added an explanation for setting cur_wal to
> NULL, which is now handled in the separate function I mentioned
> previously.
>
> Kindly have a look at the attached version; let me know if you are
> still not happy with the current approach for filtering/discarding
> irrelevant WAL segments. It isn't much different from the previous
> version, but I have tried to keep it in a separate routine for better
> code readability, with comments to make it easier to understand. I
> also added a comment for ArchivedWALFile.
>
> I feel like the division of labor between get_archive_wal_entry() and
> read_archive_wal_page() is odd. I noticed this in the last version,
> too, and it still seems to be the case. get_archive_wal_entry() first
> calls ArchivedWAL_lookup(). If that finds an entry, it just returns.
> If it doesn't, it loops until an entry for the requested file shows up
> and then returns it. Then control returns to read_archive_wal_page()
> which loops some more until we have all the data we need for the
> requested file. But it seems odd to me to have two separate loops
> here. I think that the first loop is going to call read_archive_file()
> until we find the beginning of the file that we care about and then
> the second one is going to call read_archive_file() some more until we
> have read enough of it to satisfy the request. It feels odd to me to
> do it that way, as if we told somebody to first wait until 9 o'clock
> and then wait another 30 minutes, instead of just telling them to wait
> until 9:30. I realize it's not quite the same thing, because apart
> from calling read_archive_file(), the two loops do different things,
> but I still think it looks odd.
>
> + /*
> + * Ignore if the timeline is different or the current segment is not
> + * the desired one.
> + */
> + XLogFromFileName(entry->fname, &curSegTimeline, &curSegNo, WalSegSz);
> + if (privateInfo->timeline != curSegTimeline ||
> + privateInfo->startSegNo > curSegNo ||
> + privateInfo->endSegNo < curSegNo ||
> + segno > curSegNo)
> + {
> + free_archive_wal_entry(entry->fname, privateInfo);
> + continue;
> + }
>
> The comment doesn't match the code. If it did, the test would be
> (privateInfo->timeline != curSegTimeline || segno != curSegNo). But
> instead the segno test is > rather than !=, and the checks against
> startSegNo and endSegNo aren't explained at all. I think I understand
> why the segno test uses > rather than !=, but it's the point of the
> comment to explain things like that, rather than leaving the reader to
> guess. And I don't know why we also need to test startSegNo and
> endSegNo.
>
> I also wonder what the point is of doing XLogFromFileName() on the
> fname provided by the caller and then again on entry->fname. Couldn't
> you just compare the strings?
>
> Again, the division of labor is really odd here. It's the job of
> astreamer_waldump_content() to skip things that aren't WAL files at
> all, but it's the job of get_archive_wal_entry() to skip things that
> are WAL files but not the one we want. I disagree with putting those
> checks in completely separate parts of the code.
>
> Keeping the timeline and segment start-end range checks inside the
> archive streamer creates a circular dependency that cannot be resolved
> without a 'dirty hack'. We must read the first available WAL file page
> to determine the wal_segment_size before we can calculate the target
> segment range. Moving the checks inside the streamer would make it
> impossible to process that initial file, because the necessary filtering
> parameters would still be unknown, so the checks would somehow have to
> be skipped for the first read. And what if we later realize that the
> first WAL file, which was streamed without that check, is irrelevant
> and doesn't fall within the start-end segment range?
>
> Please have a look at the attached version, specifically patch 0005.
> In astreamer_waldump_content(), I have moved the WAL file filtration
> check from get_archive_wal_entry(). This check will be skipped during
> the initial read in init_archive_reader(), which instead performs it
> explicitly once it determines the WAL segment size and the start/end
> segments.
>
> To access the WAL segment size inside astreamer_waldump_content(), I
> have moved the WAL segment size variable into the XLogDumpPrivate
> structure in the separate 0004 patch.
>
> Attached is an updated version including the aforesaid changes. It
> includes a new refactoring patch (0001) that moves the logic for
> identifying tar archives and their compression types from
> pg_basebackup and pg_verifybackup into a separate-reusable function,
> per a suggestion from Euler [1]. Additionally, I have added a test
> for the contrecord decoding to the main patch (now 0006).
>
> [1] http://postgr.es/m/[email protected]
>
> Rebased against the latest master, fixed typos in code comments, and
> replaced palloc0 with palloc0_object.
>
> Hi Amul.
>
>
> I think this looks in pretty good shape.
>
> Thank you very much for looking at the patch.
>
> Attached are patches for a few things I think could be fixed. They are
> mostly self-explanatory. The TAP test fix is the only sane way I could
> come up with stopping the skip code you had from reporting a wildly
> inaccurate number of tests skipped. The sane way to do this from a
> Test::More perspective is a subtest, but unfortunately meson does not
> like subtest output, which is why we don't use it elsewhere, so the only
> way I could come up with was to split this out into a separate test. Of
> course, we might just say we don't care about the misreport, in which
> case we could just live with things as they are.
>
> I agree that the reported skip number was incorrect, and I have
> corrected it in the attached patch. I haven't applied your patch for
> the TAP test improvements yet because I wanted to double-check it
> first with you; the patch as it stood created duplicate tests already
> present in 001_basic.pl. To avoid this duplication, I have added a
> loop that performs tests for both plain and tar WAL directory inputs,
> similar to the approach used in pg_verifybackup for different
> compression type tests (e.g., 008_untar.pl, 010_client_untar.pl). I
> don't have any objection to doing so if you feel the duplication is
> acceptable, but I feel that using a loop for the tests in 001_basic.pl
> is a bit tidier. Let me know your thoughts.
>
> I will take a look.
>
> I'm OK with doing it this way. It's just a bit fragile - if we add a
> test the number will be wrong. But maybe it's not worth worrying about.
>
> Everything else looks fairly good. The attached fixes a few relatively
> minor issues in v15. The main one is that it stops allocating/freeing a
> buffer every time we call read_archive_file() and instead adds a
> reusable buffer. It also adds back wal-directory as an undocumented
> alias of wal-path, to avoid breaking legacy scripts unnecessarily, and
> adds constness to the fname argument of pg_tar_compress_algorithm, as
> well as fixing some indentation and grammar issues.
>
> All in all I think we're in good shape.
>
> Thanks for the review. I have incorporated your suggested changes,
> with one exception: I have skipped the buffer reallocation code in
> read_archive_file(). Since we only handle two specific read sizes --
> XLOG_BLCKSZ and READ_CHUNK_SIZE (128 KB, defined in
> archive_waldump.c) -- dynamic reallocation seems unnecessary. Instead,
> I moved the allocation to init_archive_reader(), which now initializes
> a buffer at READ_CHUNK_SIZE. I also added an assertion in
> read_archive_file() to ensure that no read request exceeds this
> allocated capacity.
>
> Kindly have a look at the attached version and let me know your thoughts.
>
>
> Looks pretty good. I have squashed them into three patches I think are committable. Also attached is a diff showing what's changed - mainly this:
>
> . --follow + tar archive rejected (pg_waldump.c) — new validation prevents a confusing pg_fatal when combining --follow with a tar archive
> . error messages split (archive_waldump.c) — the single "could not read file" error is now two distinct messages: "WAL segment is too short" (truncated file) vs "unexpected end of archive" (archive EOF) - Fixes an issue raised in review
> . hash table cleanup (archive_waldump.c) — free_archive_reader now iterates and frees all remaining hash entries and destroys the table
>
The final squashed version looks good to me, thank you. However, I would
like to propose splitting the 0001 patch into two separate commits: a
preparatory refactoring of the pg_waldump code, and a standalone commit
that moves the tar archive detection and compression logic to a common
location, since the latter is an independent improvement to the existing
codebase. Additionally, since the test file refactoring was kept
separate only to facilitate review and has already been reviewed, I
suggest merging those changes into the main feature patch, i.e., 0002.
Everything else should remain in a single preparatory refactoring
patch for pg_waldump.
Attached is the version that includes the proposed split. There are no
additional changes to the 0002 and 0003 patches.
Regards,
Amul
Attachments:
[application/x-patch] v18-0001-Move-tar-detection-and-compression-logic-to-comm.patch (7.0K, 2-v18-0001-Move-tar-detection-and-compression-logic-to-comm.patch)
From 93b0818ce1b44619a37b9c5624eb0c7792a30edd Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v18 1/4] Move tar detection and compression logic to common.
Consolidate tar archive identification and compression-type detection
logic into a shared location. Currently used by pg_basebackup and
pg_verifybackup, this functionality is also required for upcoming
pg_waldump enhancements.
This change promotes code reuse and simplifies maintenance across
frontend tools.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
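Reader's note: the suffix matching that this patch consolidates can be tried in isolation with the standalone sketch below. It is not the patch's code. The enum and the names sketch_compress_algorithm, has_suffix and sketch_parse_tar_suffix are hypothetical stand-ins, introduced here so the example compiles without PostgreSQL headers. Only the suffix table (.tar, .tgz, .tar.gz, .tar.lz4, .tar.zst) is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the real enum. */
typedef enum
{
	SKETCH_COMPRESSION_NONE,
	SKETCH_COMPRESSION_GZIP,
	SKETCH_COMPRESSION_LZ4,
	SKETCH_COMPRESSION_ZSTD
} sketch_compress_algorithm;

/* Return true if fname ends with suffix (a bare suffix also matches). */
static bool
has_suffix(const char *fname, const char *suffix)
{
	size_t		fname_len = strlen(fname);
	size_t		suffix_len = strlen(suffix);

	return fname_len >= suffix_len &&
		strcmp(fname + fname_len - suffix_len, suffix) == 0;
}

/*
 * Map an archive file name to a compression algorithm by its extension.
 * Returns true and sets *algorithm if the extension is recognized,
 * otherwise returns false and leaves *algorithm untouched.
 */
bool
sketch_parse_tar_suffix(const char *fname,
						sketch_compress_algorithm *algorithm)
{
	assert(fname != NULL && algorithm != NULL);

	if (has_suffix(fname, ".tar"))
		*algorithm = SKETCH_COMPRESSION_NONE;
	else if (has_suffix(fname, ".tgz") || has_suffix(fname, ".tar.gz"))
		*algorithm = SKETCH_COMPRESSION_GZIP;
	else if (has_suffix(fname, ".tar.lz4"))
		*algorithm = SKETCH_COMPRESSION_LZ4;
	else if (has_suffix(fname, ".tar.zst"))
		*algorithm = SKETCH_COMPRESSION_ZSTD;
	else
		return false;

	return true;
}
```

One detail worth noticing when comparing against the diff: the removed pg_basebackup checks required the name to be strictly longer than the suffix, while the shared function uses >=. That appears to be what lets pg_verifybackup pass a bare suffix such as ".tar" to the same routine.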
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..fb27501d297 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
[application/x-patch] v18-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch (8.3K, 3-v18-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch)
From 3790325ce1316d745e453081dc381f05b68ad036 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v18 2/4] pg_waldump: Preparatory refactoring for tar archive
WAL decoding.
Several refactoring steps in preparation for adding tar archive WAL
decoding support to pg_waldump:
- Move XLogDumpPrivate and related declarations into a new pg_waldump.h
header, allowing a second source file to share them.
- Factor out required_read_len() so the read-size calculation can be
reused for both regular WAL files and tar-archived WAL.
- Move the WAL segment size variable into XLogDumpPrivate and rename it
to segsize, making it accessible to the archive streamer code.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_waldump/pg_waldump.c | 78 +++++++++++++++++++--------------
src/bin/pg_waldump/pg_waldump.h | 26 +++++++++++
2 files changed, 70 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
@@ -333,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -390,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
@@ -801,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -855,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1128,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1149,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1165,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1180,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1190,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1203,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1224,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..013b051506f
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ int segsize;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v18-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch (54.4K, 4-v18-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch)
From 94816d7767e18691d25820490e1a079ba90813cd Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v18 3/4] pg_waldump: Add support for reading WAL from tar
archives
pg_waldump can now accept the path to a tar archive (optionally
compressed with gzip, lz4, or zstd) containing WAL files and decode
them. This was added primarily for pg_verifybackup, which previously
had to skip WAL parsing for tar-format backups.
The implementation uses the existing archive streamer infrastructure
with a hash table to track WAL segments read from the archive. If WAL
files within the archive are not in sequential order, out-of-order
segments are written to a temporary directory (created via mkdtemp under
$TMPDIR or the archive's directory) and read back when needed. An
atexit callback ensures the temporary directory is cleaned up.
The --follow option is not supported when reading from a tar archive.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_waldump.sgml | 23 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 823 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 286 ++++++++--
src/bin/pg_waldump/pg_waldump.h | 48 ++
src/bin/pg_waldump/t/001_basic.pl | 242 ++++++--
src/tools/pgindent/typedefs.list | 4 +
8 files changed, 1311 insertions(+), 126 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
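Reader's note: the commit message says the spill directory is created with mkdtemp under $TMPDIR, or next to the archive when TMPDIR is not set. A minimal sketch of that choice follows. It does not reproduce the patch's setup_tmpwal_dir(); the function names, the "-XXXXXX" template suffix, and the decision to treat an empty TMPDIR as unset are all assumptions made for illustration. Only the "waldump_tmp" prefix and the TMPDIR fallback rule come from the patch's documentation.

```c
#define _POSIX_C_SOURCE 200809L	/* for mkdtemp() and setenv() */

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a mkdtemp() template in buf: "<base>/waldump_tmp-XXXXXX", where
 * <base> is $TMPDIR if set, else the directory holding the archive.
 * Returns false if the path does not fit in buf.
 */
bool
sketch_tmpdir_template(const char *archive_dir, char *buf, size_t buflen)
{
	const char *base = getenv("TMPDIR");
	int			n;

	assert(archive_dir != NULL && buf != NULL);

	/* Assumption of this sketch: an empty TMPDIR counts as unset. */
	if (base == NULL || base[0] == '\0')
		base = archive_dir;

	n = snprintf(buf, buflen, "%s/waldump_tmp-XXXXXX", base);
	return n >= 0 && (size_t) n < buflen;
}

/*
 * Create the directory. On success returns buf, now holding the real
 * directory name; returns NULL on failure (errno is set by mkdtemp).
 */
char *
sketch_make_tmpdir(const char *archive_dir, char *buf, size_t buflen)
{
	if (!sketch_tmpdir_template(archive_dir, buf, buflen))
		return NULL;

	return mkdtemp(buf);
}
```

A caller would typically register an atexit handler right after sketch_make_tmpdir() succeeds, which matches the cleanup approach the commit message describes.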
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,21 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ named starting with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
+ </para>
</listitem>
</varlistentry>
@@ -383,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..c93e02ece8b
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,823 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/file_perm.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
+
+ int read_len; /* total bytes of a WAL read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* We must first parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* If the archive is compressed, decompress before parsing. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent WAL
+ * will be filtered automatically by the archive streamer using the updated
+ * start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free any remaining hash table entries and their buffers. */
+ if (privateInfo->archive_wal_htab != NULL)
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *entry;
+
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->buf != NULL)
+ destroyStringInfo(entry->buf);
+ }
+ ArchivedWAL_destroy(privateInfo->archive_wal_htab);
+ privateInfo->archive_wal_htab = NULL;
+ }
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring the full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data. Raise an error if the archive
+ * streamer has moved past our segment (meaning the WAL file
+ * in the archive is shorter than expected) or if reading the
+ * archive reached EOF.
+ */
+ if (privateInfo->cur_file != entry)
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
+ privateInfo->archive_name, fname,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean since we either successfully read the WAL page or raise an
+ * error, but the caller expects this value to be returned. The routine
+ * that reads WAL pages from the physical WAL file follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Clears the buffer of a WAL entry that is being ignored. This frees up memory
+ * and prevents the accumulation of irrelevant WAL data. Additionally,
+ * conditionally setting cur_file within privateInfo to NULL ensures the
+ * archive streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
+
+ /* Search hash table */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The requested WAL entry has not been read from the archive yet; invoke
+ * the archive streamer to read it.
+ */
+ while (1)
+ {
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * The archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /* Found the required entry */
+ if (strcmp(fname, entry->fname) == 0)
+ return entry;
+
+ /*
+ * The archive streamer is currently reading a file that isn't the one
+ * asked for, but it will be required later. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Set up a temporary directory to temporarily store WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to hold extracted WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match, or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..b13cedaa3e7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,103 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check for the next required WAL segment's physical existence
+ * in the temporary directory first before invoking the archive streamer.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +874,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH a tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +907,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +967,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1046,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,12 +1210,21 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
@@ -1128,6 +1240,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1261,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1209,15 +1338,46 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /* --follow is not supported with tar archives */
+ if (config.follow && private.archive_name)
+ {
+ pg_log_error("--follow is not supported when reading from a tar archive");
+ goto bad_argument;
+ }
+
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory of the pg_waldump execution
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1405,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1489,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..1097390d575 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,14 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
+
+/* Temporary directory */
+extern char *TmpWalSegDir;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +29,46 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+ Size archive_read_buf_size;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..6960bd46ba4 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,9 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -162,6 +166,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate contrecord record
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -198,28 +238,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +245,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +264,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +279,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -286,40 +299,145 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, sorting the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+ @files = shuffle @files;
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
+ });
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ SKIP:
+ {
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
+ }
+}
done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..4961c3024af 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -147,6 +147,9 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
+ArchivedWAL_iterator
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3540,6 +3543,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
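The end-position check that the patch above moves into the non-archive branch of main() accepts an --end LSN that lies inside the last segment or sits exactly on the boundary just past it. A minimal sketch of that arithmetic follows, assuming plain 64-bit integers in place of XLogRecPtr and XLogSegNo. The helper names are hypothetical and are not part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t lsn_t;		/* stand-in for XLogRecPtr (assumption) */
typedef uint64_t segno_t;	/* stand-in for XLogSegNo (assumption) */

/* Hypothetical helper: true if lsn falls within segment number segno. */
static bool
lsn_in_segment(lsn_t lsn, segno_t segno, uint64_t segsize)
{
	assert(segsize > 0);
	return lsn / segsize == segno;
}

/* Hypothetical helper: LSN of the first byte of segment number segno. */
static lsn_t
segment_start(segno_t segno, uint64_t segsize)
{
	return segno * segsize;
}

/*
 * An end LSN is acceptable when it is inside the last segment named on the
 * command line, or when it equals the first byte of the following segment
 * (that is, "read up to the end of this segment").
 */
static bool
end_lsn_acceptable(lsn_t endptr, segno_t segno, uint64_t segsize)
{
	return lsn_in_segment(endptr, segno, segsize) ||
		endptr == segment_start(segno + 1, segsize);
}
```

With a 16 MB segment size and segno 3, an end LSN of 4 * segsize passes this check while 4 * segsize + 1 does not, which mirrors the condition guarding the "end WAL location ... is not inside file" error.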
[application/x-patch] v18-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch (16.0K, 5-v18-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch)
From b3e7949959cca0d3c5258c63c172702794f8051e Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Wed, 11 Mar 2026 11:26:36 -0400
Subject: [PATCH v18 4/4] pg_verifybackup: Enable WAL parsing for tar-format
backups
Now that pg_waldump supports reading WAL from tar archives, remove the
restriction that forced --no-parse-wal for tar-format backups.
pg_verifybackup now automatically locates the WAL archive: it looks for
a separate pg_wal.tar first, then falls back to the main base.tar. A
new --wal-path option (replacing the old --wal-directory, which is kept
as a silent alias) accepts either a directory or a tar archive path.
The default WAL directory preparation is deferred until the backup
format is known, since tar-format backups resolve the WAL path
differently from plain-format ones.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_verifybackup.sgml | 14 ++-
src/bin/pg_verifybackup/pg_verifybackup.c | 89 ++++++++++++-------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 20 ++++-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 85 insertions(+), 56 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -261,12 +258,13 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..db79dd39103 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -93,7 +97,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +130,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +140,9 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +228,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -285,10 +292,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -331,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -350,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -368,12 +360,35 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
+ if (wal_path == NULL)
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -787,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +892,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +937,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
@@ -1188,7 +1215,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1225,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1393,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
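The default WAL path selection described in the v18-0004 commit message (an explicit -w/--wal-path first, then pg_wal for plain-format backups, then pg_wal.tar, then base.tar) can be sketched as a small standalone function. This is an illustration only: choose_wal_path and its parameters are hypothetical names, and the real patch does this inline in main() with psprintf() and pg_log_error().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the fallback order in the patch.
 *
 * user_wal_path     - value of -w/--wal-path, or NULL if not given
 * format            - 'p' for plain-format backups, 't' for tar-format
 * plain_pg_wal_dir  - "<backup_directory>/pg_wal", precomputed by the caller
 * wal_archive_path  - path of a separate pg_wal.tar[.*], or NULL
 * base_archive_path - path of base.tar[.*], or NULL
 *
 * Returns the path to hand to pg_waldump, or NULL when no WAL source was
 * found (the patch reports "WAL archive not found" in that case).
 */
static const char *
choose_wal_path(const char *user_wal_path, char format,
				const char *plain_pg_wal_dir,
				const char *wal_archive_path,
				const char *base_archive_path)
{
	assert(format == 'p' || format == 't');

	if (user_wal_path != NULL)
		return user_wal_path;	/* an explicit option always wins */
	if (format == 'p')
		return plain_pg_wal_dir;
	if (wal_archive_path != NULL)
		return wal_archive_path;	/* separate WAL archive preferred */
	return base_archive_path;	/* may be NULL: nothing usable found */
}
```

Deferring this choice until after the backup format is known is what lets the tar-format case pick between the two archives; a plain-format backup never consults the archive paths.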
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-18 15:16 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-18 15:16 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 18, 2026 at 5:15 PM Amul Sul <[email protected]> wrote:
>
> On Wed, Mar 11, 2026 at 10:38 PM Andrew Dunstan <[email protected]> wrote:
> > [...]
> > Looks pretty good. I have squashed them into three patches I think are committable. Also attached is a diff showing what's changed - mainly this:
> >
> > . --follow + tar archive rejected (pg_waldump.c) — new validation prevents a confusing pg_fatal when combining --follow with a tar archive
> > . error messages split (archive_waldump.c) — the single "could not read file" error is now two distinct messages: "WAL segment is too short" (truncated file) vs "unexpected end of archive" (archive EOF) - Fixes an issue raised in review
> > . hash table cleanup (archive_waldump.c) — free_archive_reader now iterates and frees all remaining hash entries and destroys the table
> >
>
> The final squashed version looks good to me, thank you. However, I
> would like to propose splitting the 0001 patch into two separate
> commits: a preparatory refactoring of the pg_waldump code and a
> standalone commit that moves the tar archive detection and compression
> logic to a common location, as the latter is an independent improvement
> to the existing codebase. Additionally, since the test file refactoring
> was only kept separate to facilitate the review and has already been
> reviewed, I suggest merging those changes into the main feature patch,
> i.e. 0002. All other elements should remain in a single preparatory
> refactoring patch for pg_waldump.
>
> Attached is the version that includes the proposed split. No
> additional changes to 0002 and 0003 patches.
>
I added the two missing 'Reviewed-by' lines to the credit section of the
commit message and made a minor optimization in get_archive_wal_entry.
Regards,
Amul
Attachments:
[application/octet-stream] v19-0001-Move-tar-detection-and-compression-logic-to-comm.patch (7.1K, 2-v19-0001-Move-tar-detection-and-compression-logic-to-comm.patch)
From 2b3fec35c1070e187ee71ee7fdaa76bef09e076f Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Tue, 17 Feb 2026 14:51:11 +0530
Subject: [PATCH v19 1/4] Move tar detection and compression logic to common.
Consolidate tar archive identification and compression-type detection
logic into a shared location. Currently used by pg_basebackup and
pg_verifybackup, this functionality is also required for upcoming
pg_waldump enhancements.
This change promotes code reuse and simplifies maintenance across
frontend tools.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..fb27501d297 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the name is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
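Note on 0001: the suffix matching that this patch consolidates can be
tried outside the PostgreSQL tree with a small standalone sketch. The
enum, the table, and the function name below are hypothetical stand-ins
(they are not pg_compress_algorithm or parse_tar_compress_algorithm),
and the table-driven loop is a different arrangement of the same checks,
not the patch's code. It assumes the same five suffixes listed in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm; not the real enum. */
typedef enum
{
	SKETCH_COMPRESSION_NONE,
	SKETCH_COMPRESSION_GZIP,
	SKETCH_COMPRESSION_LZ4,
	SKETCH_COMPRESSION_ZSTD
} sketch_compress_algorithm;

/* The suffixes recognized by the patch, with the algorithm each implies. */
static const struct
{
	const char *suffix;
	sketch_compress_algorithm algorithm;
}			sketch_tar_suffixes[] = {
	{".tar", SKETCH_COMPRESSION_NONE},
	{".tgz", SKETCH_COMPRESSION_GZIP},
	{".tar.gz", SKETCH_COMPRESSION_GZIP},
	{".tar.lz4", SKETCH_COMPRESSION_LZ4},
	{".tar.zst", SKETCH_COMPRESSION_ZSTD},
};

/*
 * Returns true and sets *algorithm if fname ends with a known tar suffix;
 * otherwise returns false and leaves *algorithm untouched.
 */
bool
sketch_parse_tar_suffix(const char *fname,
						sketch_compress_algorithm *algorithm)
{
	size_t		fname_len;
	size_t		nsuffixes = sizeof(sketch_tar_suffixes) /
		sizeof(sketch_tar_suffixes[0]);

	assert(fname != NULL && algorithm != NULL);
	fname_len = strlen(fname);

	for (size_t i = 0; i < nsuffixes; i++)
	{
		const char *suffix = sketch_tar_suffixes[i].suffix;
		size_t		suffix_len = strlen(suffix);

		/* Compare only the trailing suffix_len bytes of the name. */
		if (fname_len >= suffix_len &&
			strcmp(fname + fname_len - suffix_len, suffix) == 0)
		{
			*algorithm = sketch_tar_suffixes[i].algorithm;
			return true;
		}
	}

	return false;
}
```

Because every comparison is anchored at the end of the name, the same
helper works whether the caller passes a full archive name (as
pg_basebackup does) or only the suffix (as pg_verifybackup does).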
[application/octet-stream] v19-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch (8.4K, 3-v19-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch)
From 1de7233f3785e202662cd00b5b7fd2b750e24fea Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 22 Jan 2026 10:28:32 +0530
Subject: [PATCH v19 2/4] pg_waldump: Preparatory refactoring for tar archive
WAL decoding.
Several refactoring steps in preparation for adding tar archive WAL
decoding support to pg_waldump:
- Move XLogDumpPrivate and related declarations into a new pg_waldump.h
header, allowing a second source file to share them.
- Factor out required_read_len() so the read-size calculation can be
reused for both regular WAL files and tar-archived WAL.
- Move the WAL segment size variable into XLogDumpPrivate and rename it
to segsize, making it accessible to the archive streamer code.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_waldump/pg_waldump.c | 78 +++++++++++++++++++--------------
src/bin/pg_waldump/pg_waldump.h | 26 +++++++++++
2 files changed, 70 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
@@ -333,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -390,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
@@ -801,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -855,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1128,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1149,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1165,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1180,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1190,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1203,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1224,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..013b051506f
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ int segsize;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
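Note on 0002: the read-size calculation factored out as
required_read_len() can be exercised in isolation. The sketch below is a
standalone approximation, not the patch's code: sketch_private,
SKETCH_BLCKSZ (assumed to be 8192, the default XLOG_BLCKSZ), and the use
of 0 as the "no end pointer" value are assumptions made so that it
compiles without PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; stands in for XLOG_BLCKSZ. */
#define SKETCH_BLCKSZ 8192

/* Hypothetical subset of XLogDumpPrivate; 0 means "no end pointer set". */
typedef struct
{
	uint64_t	endptr;
	bool		endptr_reached;
} sketch_private;

/*
 * Returns how many bytes to read for the page at targetPagePtr, or -1 if
 * the requested bytes would run past the end pointer.
 */
int
sketch_required_read_len(sketch_private *priv, uint64_t targetPagePtr,
						 int reqLen)
{
	assert(priv != NULL);
	assert(reqLen >= 0 && reqLen <= SKETCH_BLCKSZ);

	/* Without an end pointer, always read a whole page. */
	if (priv->endptr == 0)
		return SKETCH_BLCKSZ;

	/* The whole page lies before the end pointer. */
	if (targetPagePtr + SKETCH_BLCKSZ <= priv->endptr)
		return SKETCH_BLCKSZ;

	/* Only part of the page is wanted; read up to the end pointer. */
	if (targetPagePtr + (uint64_t) reqLen <= priv->endptr)
		return (int) (priv->endptr - targetPagePtr);

	/* The request crosses the end pointer: report that we are done. */
	priv->endptr_reached = true;
	return -1;
}
```

With an end pointer 100 bytes into the fourth page, a request on an
earlier page yields a full page, a 50-byte request on the fourth page
yields 100 bytes, and a 200-byte request on that page yields -1 and sets
endptr_reached.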
[application/octet-stream] v19-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch (54.7K, 4-v19-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch)
From 0de254f057d17e772b3dccebb18b49ca7baa7e85 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Wed, 18 Feb 2026 11:07:57 +0530
Subject: [PATCH v19 3/4] pg_waldump: Add support for reading WAL from tar
archives
pg_waldump can now accept the path to a tar archive (optionally
compressed with gzip, lz4, or zstd) containing WAL files and decode
them. This was added primarily for pg_verifybackup, which previously
had to skip WAL parsing for tar-format backups.
The implementation uses the existing archive streamer infrastructure
with a hash table to track WAL segments read from the archive. If WAL
files within the archive are not in sequential order, out-of-order
segments are written to a temporary directory (created via mkdtemp under
$TMPDIR or the archive's directory) and read back when needed. An
atexit callback ensures the temporary directory is cleaned up.
The --follow option is not supported when reading from a tar archive.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_waldump.sgml | 23 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 827 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 286 +++++++--
src/bin/pg_waldump/pg_waldump.h | 48 ++
src/bin/pg_waldump/t/001_basic.pl | 242 ++++++--
src/tools/pgindent/typedefs.list | 4 +
8 files changed, 1315 insertions(+), 126 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,21 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ named starting with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
+ </para>
</listitem>
</varlistentry>
@@ -383,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..f36de991dc6
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,827 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/file_perm.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic if segments
+ * are ever archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
+
+ int read_len; /* total bytes of a WAL read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader, creates a hash table for WAL entries,
+ * checks for existing valid WAL segments in the archive file and retrieves the
+ * segment size, and sets up filters for relevant entries. It also configures a
+ * temporary directory for out-of-order WAL data and registers an exit callback
+ * to clean up temporary files.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* We must first parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* If the archive is compressed, decompress before parsing. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Verify that the archive contains valid WAL files and fetch WAL segment
+ * size
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL record was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the WAL falls
+ * outside this range, remove it from the hash table. Subsequent WAL will
+ * be filtered automatically by the archive streamer using the updated
+ * start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding a WAL file, once we hit the
+ * end LSN, any remaining WAL data in the buffer or the tar archive's
+ * unreached end can be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free any remaining hash table entries and their buffers. */
+ if (privateInfo->archive_wal_htab != NULL)
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *entry;
+
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->buf != NULL)
+ destroyStringInfo(entry->buf);
+ }
+ ArchivedWAL_destroy(privateInfo->archive_wal_htab);
+ privateInfo->archive_wal_htab = NULL;
+ }
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /* Calculate the LSN range currently residing in the buffer */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL record if it exists in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring the full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data. Raise an error if the archive streamer
+ * has moved past our segment (meaning the WAL file in the archive
+ * is shorter than expected) or if reading the archive reached
+ * EOF.
+ */
+ if (privateInfo->cur_file != entry)
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
+ privateInfo->archive_name, fname,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return the fixed value provided as input. We could return a
+ * boolean since we either successfully read the WAL page or raise an
+ * error, but the caller expects this value to be returned. The routine
+ * that reads WAL pages from the physical WAL file follows the same
+ * convention.
+ */
+ return count;
+}
+
+/*
+ * Clears the buffer of a WAL entry that is being ignored. This frees up memory
+ * and prevents the accumulation of irrelevant WAL data. Additionally,
+ * conditionally setting cur_file within privateInfo to NULL ensures the
+ * archive streamer skips unnecessary copy operations.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
+ /* Set cur_file to NULL if it matches the entry being ignored */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it exists. Otherwise,
+ * it invokes the routine to read the archived file, which then populates the
+ * entry in the hash table if that WAL exists in the archive.
+ * If the archive streamer happens to be reading a WAL file from the archive
+ * that is not currently needed, that WAL data is written to a temporary
+ * file.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
+
+ /*
+ * Search the hash table first. If the entry is found, return it.
+ * Otherwise, the requested WAL entry hasn't been read from the archive
+ * yet; invoke the archive streamer to fetch it.
+ */
+ while (1)
+ {
+ /*
+ * Search hash table.
+ *
+ * We perform the search inside the loop because a single iteration of
+ * the archive reader may decompress and extract multiple files into
+ * the hash table. One of these newly added files could be the one we
+ * are seeking.
+ */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
+ /* Fetch more data */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Set up a temporary directory to store WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, use the provided WAL directory to extract WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s\": %m", fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files. */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* The WAL file name may be preceded by a directory path */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for the archived WAL hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..b13cedaa3e7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -440,6 +440,103 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check whether the next required WAL segment exists in the
+ * temporary directory before invoking the archive streamer.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +874,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH a tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a ./pg_wal that contains such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +907,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +967,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1046,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,12 +1210,21 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
@@ -1128,6 +1240,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1261,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1209,15 +1338,46 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /* --follow is not supported with tar archives */
+ if (config.follow && private.archive_name)
+ {
+ pg_log_error("--follow is not supported when reading from a tar archive");
+ goto bad_argument;
+ }
+
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * pg_waldump's current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1405,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1489,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..1097390d575 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,14 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
+
+/* Temporary directory */
+extern char *TmpWalSegDir;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +29,46 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* Tar archive name */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+ Size archive_read_buf_size;
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive streamer has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Although these values can be easily derived from startptr and endptr,
+ * doing so repeatedly for each archived member would be inefficient, as
+ * it would involve recalculating and filtering out irrelevant WAL
+ * segments.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..6960bd46ba4 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,9 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -162,6 +166,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate the continuation record
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -198,28 +238,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +245,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +264,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +279,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -286,40 +299,145 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, shuffling the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+ @files = shuffle @files;
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
+ });
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ SKIP:
+ {
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
+ }
+}
done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 52f8603a7be..4961c3024af 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -147,6 +147,9 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
+ArchivedWAL_iterator
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3540,6 +3543,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/octet-stream] v19-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch (16.1K, 5-v19-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch)
From 382c0c70d12309f1fc71001c7705b46621d74332 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Wed, 11 Mar 2026 11:26:36 -0400
Subject: [PATCH v19 4/4] pg_verifybackup: Enable WAL parsing for tar-format
backups
Now that pg_waldump supports reading WAL from tar archives, remove the
restriction that forced --no-parse-wal for tar-format backups.
pg_verifybackup now automatically locates the WAL archive: it looks for
a separate pg_wal.tar first, then falls back to the main base.tar. A
new --wal-path option (replacing the old --wal-directory, which is kept
as a silent alias) accepts either a directory or a tar archive path.
The default WAL directory preparation is deferred until the backup
format is known, since tar-format backups resolve the WAL path
differently from plain-format ones.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_verifybackup.sgml | 14 ++-
src/bin/pg_verifybackup/pg_verifybackup.c | 89 ++++++++++++-------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 20 ++++-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 85 insertions(+), 56 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -261,12 +258,13 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..db79dd39103 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -93,7 +97,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +130,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +140,9 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +228,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -285,10 +292,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -331,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -350,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -368,12 +360,35 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
+ if (wal_path == NULL)
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -787,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,11 +892,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
@@ -918,9 +937,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+
+ *base_archive_path = pstrdup(fullpath);
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+
+ *wal_archive_path = pstrdup(fullpath);
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
@@ -1188,7 +1215,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1225,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1393,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
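The WAL path selection described in the 0004 commit message can be summarized with a small standalone sketch. This is not code from the patch: choose_wal_path() and its parameter names are hypothetical, and the sketch only mirrors the precedence visible in the main() hunk above (explicit -w/--wal-path, then pg_wal for plain-format backups, then the separate WAL archive, then the base archive).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): pick the path that would be
 * handed to pg_waldump.  Returns NULL when nothing usable was found, which
 * is the case where the patch reports "WAL archive not found" and exits.
 */
static const char *
choose_wal_path(const char *user_wal_path,		/* -w/--wal-path, or NULL */
				char format,					/* 'p' = plain, 't' = tar */
				const char *plain_pg_wal_dir,	/* "<backup>/pg_wal" */
				const char *wal_archive_path,	/* pg_wal.tar*, or NULL */
				const char *base_archive_path)	/* base.tar*, or NULL */
{
	assert(format == 'p' || format == 't');

	if (user_wal_path != NULL)
		return user_wal_path;		/* an explicit option always wins */
	if (format == 'p')
		return plain_pg_wal_dir;	/* plain backups keep WAL in pg_wal/ */
	if (wal_archive_path != NULL)
		return wal_archive_path;	/* a separate WAL archive is preferred */
	return base_archive_path;		/* may be NULL: caller must error out */
}
```

For a tar-format backup without -w, the sketch returns the pg_wal archive if one was seen during the precheck pass and otherwise falls back to the base archive, on the assumption that the WAL was included in the main data directory archive.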
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-19 10:20 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-19 10:20 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 18, 2026 at 8:46 PM Amul Sul <[email protected]> wrote:
>
> On Wed, Mar 18, 2026 at 5:15 PM Amul Sul <[email protected]> wrote:
> >
> > On Wed, Mar 11, 2026 at 10:38 PM Andrew Dunstan <[email protected]> wrote:
> > > [...]
> > > Looks pretty good. I have squashed them into three patches I think are committable. Also attached is a diff showing what's changed - mainly this:
> > >
> > > . --follow + tar archive rejected (pg_waldump.c) — new validation prevents a confusing pg_fatal when combining --follow with a tar archive
> > > . error messages split (archive_waldump.c) — the single "could not read file" error is now two distinct messages: "WAL segment is too short" (truncated file) vs "unexpected end of archive" (archive EOF) - Fixes an issue raised in review
> > > . hash table cleanup (archive_waldump.c) — free_archive_reader now iterates and frees all remaining hash entries and destroys the table
> > >
> >
> > The final squashed version looks good to me, thank you. But, I would
> > like to propose splitting the 0001 patch into two separate commits: a
> > preparatory refactoring of the pg_waldump code and a standalone commit
> > that moves the tar archive detection and compression logic to a common
> > location, as the latter is an independent improvement to the existing
> > codebase. Additionally, since the test file refactoring was only kept
> > separate to facilitate the review and has already been reviewed, I
> > suggest merging those changes into the main feature patch i.e. 0002.
> > All other elements should remain in a single preparatory refactoring
> > patch for pg_waldump.
> >
> > Attached is the version that includes the proposed split. No
> > additional changes to 0002 and 0003 patches.
> >
>
> Added the two missing 'Reviewed-by' lines to the credit section of the
> commit message and did a minor optimization in get_archive_wal_entry.
>
Attaching an updated version. It includes some tweaks to code
comments, adds an assert inside get_archive_wal_entry(), moves the
archive_read_buf_size declaration and usage into an assert-enabled
check, and makes a minor change to precheck_tar_backup_file() to
assign out-variables only after successful validation.
Regards,
Amul
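For readers following along, the "assign out-variables only after successful validation" idea can be illustrated in isolation. In the v19 hunk quoted earlier in the thread, *base_archive_path and *wal_archive_path were set as soon as the file name prefix matched, before the suffix had been checked. The sketch below is not the v20 code: classify_archive() is a hypothetical function and its suffix check is deliberately simplified to plain ".tar", purely to show the ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical example: classify an archive name and report it through
 * out-parameters, touching them only once every check has passed.
 */
static bool
classify_archive(const char *relpath, const char *fullpath,
				 const char **base_archive_path,
				 const char **wal_archive_path)
{
	bool		is_base;
	const char *suffix;

	assert(relpath != NULL && fullpath != NULL);

	if (strncmp(relpath, "base", 4) == 0)
	{
		is_base = true;
		suffix = relpath + 4;
	}
	else if (strncmp(relpath, "pg_wal", 6) == 0)
	{
		is_base = false;
		suffix = relpath + 6;
	}
	else
		return false;

	/* Simplified validation: only an uncompressed ".tar" is accepted here. */
	if (strcmp(suffix, ".tar") != 0)
		return false;			/* out-parameters are left untouched */

	/* All checks passed, so it is now safe to publish the result. */
	if (is_base)
		*base_archive_path = fullpath;
	else
		*wal_archive_path = fullpath;
	return true;
}
```

With this ordering a rejected file such as "base.txt" can never end up being used as the WAL source later on.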
Attachments:
[application/x-patch] v20-0001-Move-tar-detection-and-compression-logic-to-comm.patch (7.1K, 2-v20-0001-Move-tar-detection-and-compression-logic-to-comm.patch)
From ba736014228ea250b8eb155f2776bb86feed2b55 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:30 +0530
Subject: [PATCH v20 1/5] Move tar detection and compression logic to common.
Consolidate tar archive identification and compression-type detection
logic into a shared location. Currently used by pg_basebackup and
pg_verifybackup, this functionality is also required for upcoming
pg_waldump enhancements.
This change promotes code reuse and simplifies maintenance across
frontend tools.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..fefbed68bea 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the extension is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ int fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
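The suffix matching that 0001 consolidates can be tried outside the PostgreSQL tree with a standalone approximation. The sketch below is an assumption-laden re-implementation, not the patch's function: demo_compress_algorithm stands in for pg_compress_algorithm, and demo_tar_compress_algorithm() and has_suffix() are hypothetical names. It recognizes the same five suffixes as parse_tar_compress_algorithm() in the diff above and, like it, leaves *algorithm untouched when nothing matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for pg_compress_algorithm; names are illustrative only. */
typedef enum
{
	DEMO_COMPRESSION_NONE,
	DEMO_COMPRESSION_GZIP,
	DEMO_COMPRESSION_LZ4,
	DEMO_COMPRESSION_ZSTD
} demo_compress_algorithm;

/* True if 'name' ends with 'suffix'. */
static bool
has_suffix(const char *name, const char *suffix)
{
	size_t		nlen = strlen(name);
	size_t		slen = strlen(suffix);

	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/*
 * Returns true and sets *algorithm if fname ends in a recognized tar
 * suffix; otherwise returns false and leaves *algorithm unchanged.
 */
static bool
demo_tar_compress_algorithm(const char *fname,
							demo_compress_algorithm *algorithm)
{
	static const struct
	{
		const char *suffix;
		demo_compress_algorithm alg;
	}			known[] = {
		{".tar", DEMO_COMPRESSION_NONE},
		{".tgz", DEMO_COMPRESSION_GZIP},
		{".tar.gz", DEMO_COMPRESSION_GZIP},
		{".tar.lz4", DEMO_COMPRESSION_LZ4},
		{".tar.zst", DEMO_COMPRESSION_ZSTD},
	};

	assert(fname != NULL && algorithm != NULL);

	for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++)
	{
		if (has_suffix(fname, known[i].suffix))
		{
			*algorithm = known[i].alg;
			return true;
		}
	}
	return false;
}
```

A table-driven loop is used here only to keep the sketch short; the patch itself spells out each comparison as an if/else chain.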
[application/x-patch] v20-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch (8.4K, 3-v20-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch)
From 7b36f9bdebaf9be7e5adb9b8dac25394cb611d0b Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:39 +0530
Subject: [PATCH v20 2/5] pg_waldump: Preparatory refactoring for tar archive
WAL decoding.
Several refactoring steps in preparation for adding tar archive WAL
decoding support to pg_waldump:
- Move XLogDumpPrivate and related declarations into a new pg_waldump.h
header, allowing a second source file to share them.
- Factor out required_read_len() so the read-size calculation can be
reused for both regular WAL files and tar-archived WAL.
- Move the WAL segment size variable into XLogDumpPrivate and rename it
to segsize, making it accessible to the archive streamer code.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_waldump/pg_waldump.c | 78 +++++++++++++++++++--------------
src/bin/pg_waldump/pg_waldump.h | 26 +++++++++++
2 files changed, 70 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
@@ -333,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -390,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
@@ -801,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -855,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1128,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1149,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1165,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1180,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1190,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1203,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1224,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..013b051506f
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ int segsize;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
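As a reading aid, the read-size calculation that this patch factors out into required_read_len() can be exercised in isolation with the sketch below. It is a simplified model rather than the patch's code: SketchLsn, SketchPrivate, SKETCH_BLCKSZ (assumed to be 8192, the default XLOG_BLCKSZ), and sketch_required_read_len() are hypothetical stand-ins, and an end pointer of 0 is used to mean "no end point", mirroring InvalidXLogRecPtr.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192		/* assumed page size (default XLOG_BLCKSZ) */

typedef uint64_t SketchLsn;		/* stand-in for XLogRecPtr */

typedef struct SketchPrivate
{
	SketchLsn	endptr;			/* 0 means no end point was requested */
	bool		endptr_reached;
} SketchPrivate;

/*
 * Return how many bytes of the page at targetPagePtr should be read, or -1
 * (setting endptr_reached) if the request extends past the end point.
 */
static int
sketch_required_read_len(SketchPrivate *priv, SketchLsn targetPagePtr,
						 int reqLen)
{
	/* No end point: always read a full page. */
	if (priv->endptr == 0)
		return SKETCH_BLCKSZ;

	/* The whole page lies before the end point. */
	if (targetPagePtr + SKETCH_BLCKSZ <= priv->endptr)
		return SKETCH_BLCKSZ;

	/* Only part of the page lies before the end point. */
	if (targetPagePtr + (SketchLsn) reqLen <= priv->endptr)
		return (int) (priv->endptr - targetPagePtr);

	/* The requested bytes extend past the end point. */
	priv->endptr_reached = true;
	return -1;
}
```

With an end point 100 bytes into the fourth page, for example, a request for 50 bytes of that page yields a partial read of 100 bytes, while a request for 200 bytes returns -1 and marks the end point as reached.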
[application/x-patch] v20-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch (56.2K, 4-v20-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch)
From 4c191b1cbed7eaa19d5ecff3072ce807943ffdf1 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:46 +0530
Subject: [PATCH v20 3/5] pg_waldump: Add support for reading WAL from tar
archives
pg_waldump can now accept the path to a tar archive (optionally
compressed with gzip, lz4, or zstd) containing WAL files and decode
them. This was added primarily for pg_verifybackup, which previously
had to skip WAL parsing for tar-format backups.
The implementation uses the existing archive streamer infrastructure
with a hash table to track WAL segments read from the archive. If WAL
files within the archive are not in sequential order, out-of-order
segments are written to a temporary directory (created via mkdtemp under
$TMPDIR or the archive's directory) and read back when needed. An
atexit callback ensures the temporary directory is cleaned up.
The --follow option is not supported when reading from a tar archive.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_waldump.sgml | 23 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 847 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 293 +++++++--
src/bin/pg_waldump/pg_waldump.h | 50 ++
src/bin/pg_waldump/t/001_basic.pl | 242 ++++++--
src/tools/pgindent/typedefs.list | 4 +
8 files changed, 1342 insertions(+), 128 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,21 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
+ </para>
</listitem>
</varlistentry>
@@ -383,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..9cbcae3e8af
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,847 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/file_perm.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic when segments
+ * are archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
+
+ int read_len; /* total bytes of this WAL segment read from the archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader: opens the archive, builds a hash table
+ * for WAL entries, reads ahead until a full WAL page header is available to
+ * determine the WAL segment size, and computes start/end segment numbers for
+ * filtering. It also sets up a temporary directory for out-of-order WAL data
+ * and registers an atexit callback to clean it up.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* We must first parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* If the archive is compressed, decompress before parsing. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+
+#ifdef USE_ASSERT_CHECKING
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+#endif
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size.
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
+ * the first WAL segment in the archive so we can extract the WAL segment
+ * size from the long page header.
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the WAL falls
+ * outside this range, remove it from the hash table. Subsequent WAL will
+ * be filtered automatically by the archive streamer using the updated
+ * start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding WAL, once we hit the end
+ * LSN, any remaining buffered data or unread portion of the archive can
+ * be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free any remaining hash table entries and their buffers. */
+ if (privateInfo->archive_wal_htab != NULL)
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *entry;
+
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->buf != NULL)
+ destroyStringInfo(entry->buf);
+ }
+ ArchivedWAL_destroy(privateInfo->archive_wal_htab);
+ privateInfo->archive_wal_htab = NULL;
+ }
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies the requested WAL data from the hash entry's buffer into readBuff.
+ * If the buffer does not yet contain the needed bytes, fetches more data from
+ * the tar archive via the archive streamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /*
+ * Calculate the LSN range currently residing in the buffer.
+ *
+ * read_len tracks total bytes received for this segment (including
+ * already-discarded data), so endPtr is the LSN just past the last
+ * buffered byte, and startPtr is the LSN of the first buffered byte.
+ */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring a full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data. Raise an error if the archive streamer
+ * has moved past our segment (meaning the WAL file in the archive
+ * is shorter than expected) or if reading the archive reached
+ * EOF.
+ */
+ if (privateInfo->cur_file != entry)
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
+ privateInfo->archive_name, fname,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return count unchanged. We could return a boolean since we
+ * either successfully read the WAL page or raise an error, but the caller
+ * expects this value to be returned. The routine that reads WAL pages
+ * from physical WAL files follows the same convention.
+ */
+ return count;
+}
+
+/*
+ * Releases the buffer of a WAL entry that is no longer needed, preventing the
+ * accumulation of irrelevant WAL data. Also removes any associated temporary
+ * file and clears privateInfo->cur_file if it points to this entry, so the
+ * archive streamer skips subsequent data for it.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
+ /* Clear cur_file if it points to the entry being freed */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it already exists.
+ * Otherwise, reads more data from the archive until the requested entry is
+ * found. If the archive streamer is reading a WAL file from the archive that
+ * is not currently needed, that data is spilled to a temporary file for later
+ * retrieval.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
+
+ /*
+ * Search the hash table first. If the entry is found, return it.
+ * Otherwise, the requested WAL entry hasn't been read from the archive
+ * yet; invoke the archive streamer to fetch it.
+ */
+ while (1)
+ {
+ /*
+ * Search hash table.
+ *
+ * We perform the search inside the loop because a single iteration of
+ * the archive reader may decompress and extract multiple files into
+ * the hash table. One of these newly added files could be the one we
+ * are seeking.
+ */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
+ /*
+ * Fetch more data either when no current file is being tracked or
+ * when its buffer has been fully flushed to the temporary file.
+ */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it will be needed later. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+ Assert(strcmp(fname, entry->fname) != 0);
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Set up a temporary directory to hold spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, fall back to the provided WAL directory to store WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m", TmpWalSegDir, fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+
+ /*
+ * End of this tar member; mark cur_file NULL so subsequent
+ * content callbacks (if any) know no WAL file is currently
+ * active.
+ */
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream. This is a
+ * terminal streamer so it must have no successor.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* WAL file may appear with a full path (e.g., pg_wal/<name>) */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for WAL file hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..f28153165e6 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -327,8 +327,8 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
}
/*
- * Returns the size in bytes of the data to be read. Returns -1 if the end
- * point has already been reached.
+ * Returns the number of bytes to read for the given page. Returns -1 if
+ * the end of the requested range has already been reached or exceeded.
*/
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -440,6 +440,106 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback to support dumping
+ * WAL files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies that the previous segment's data will not be
+ * needed again.
+ *
+ * Afterward, check whether the next required WAL segment already exists
+ * in the temporary directory before invoking the archive streamer.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer.
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +877,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH a tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a pg_wal subdirectory containing such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +910,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +970,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1049,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,12 +1213,21 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
}
@@ -1128,6 +1243,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1264,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1209,15 +1341,46 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /* --follow is not supported with tar archives */
+ if (config.follow && private.archive_name)
+ {
+ pg_log_error("--follow is not supported when reading from a tar archive");
+ goto bad_argument;
+ }
+
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1408,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1492,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..fd25792b33a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,14 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
+
+/* Temporary directory for spilling out-of-order WAL segments from archives */
+extern char *TmpWalSegDir;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +29,48 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* tar archive filename */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+
+#ifdef USE_ASSERT_CHECKING
+ Size archive_read_buf_size;
+#endif
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Pre-computed segment numbers derived from startptr and endptr. Caching
+ * them avoids repeated XLByteToSeg() calls when filtering each archive
+ * member against the requested WAL range.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..94c58187412 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,9 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -162,6 +166,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate a record that continues into the next WAL segment (contrecord)
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -198,28 +238,6 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
-command_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
command_like(
[
'pg_waldump', '--quiet',
@@ -227,22 +245,16 @@ command_like(
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +264,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +279,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -286,40 +299,145 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive with its member files in shuffled order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+ @files = shuffle @files;
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
+ });
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ SKIP:
+ {
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
+ }
+}
done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 174e2798443..3e2fc711a3e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -147,6 +147,9 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
+ArchivedWAL_iterator
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3542,6 +3545,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v20-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch (16.7K, 5-v20-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch)
From cccb149f57bd3323506d459d005c7552c19aa07d Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Thu, 19 Mar 2026 15:43:53 +0530
Subject: [PATCH v20 4/5] pg_verifybackup: Enable WAL parsing for tar-format
backups
Now that pg_waldump supports reading WAL from tar archives, remove the
restriction that forced --no-parse-wal for tar-format backups.
pg_verifybackup now automatically locates the WAL archive: it looks for
a separate pg_wal.tar first, then falls back to the main base.tar. A
new --wal-path option (replacing the old --wal-directory, which is kept
as a silent alias) accepts either a directory or a tar archive path.
The default WAL directory preparation is deferred until the backup
format is known, since tar-format backups resolve the WAL path
differently from plain-format ones.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_verifybackup.sgml | 14 ++-
src/bin/pg_verifybackup/pg_verifybackup.c | 96 ++++++++++++-------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 20 +++-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 91 insertions(+), 57 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -261,12 +258,13 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..b60ab8739d5 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -93,7 +97,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +130,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +140,9 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +228,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -285,10 +292,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -331,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -350,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -368,12 +360,35 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
+ if (wal_path == NULL)
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path. "
+ "Or you must use -n/--no-parse-wal when verifying a tar-format backup.");
+ exit(1);
+ }
+ }
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -787,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,17 +892,21 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
pg_compress_algorithm compress_algorithm;
tar_file *tar;
char *suffix = NULL;
+ bool is_base_archive = false;
+ bool is_wal_archive = false;
/* Should be tar format backup */
Assert(context->format == 't');
@@ -918,9 +939,15 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+ is_base_archive = true;
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+ is_wal_archive = true;
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
@@ -953,8 +980,13 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* Ignore WALs, as reading and verification will be handled through
* pg_waldump.
*/
- if (strncmp("pg_wal", relpath, 6) == 0)
+ if (is_wal_archive)
+ {
+ *wal_archive_path = pstrdup(fullpath);
return;
+ }
+ else if (is_base_archive)
+ *base_archive_path = pstrdup(fullpath);
/*
* Append the information to the list for complete verification at a later
@@ -1188,7 +1220,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1230,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1398,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
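As a reading aid for the main() hunk in the patch above, the default WAL path resolution can be sketched as a standalone function. This is only an illustration under stated assumptions: sketch_choose_wal_path() and its parameters are hypothetical names that do not appear in the patch, and a NULL return stands in for the "WAL archive not found" error exit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch) mirroring its fallback order:
 *   1. an explicit -w/--wal-path value wins;
 *   2. a plain-format backup ('p') uses <backup_dir>/pg_wal;
 *   3. a tar-format backup prefers the separate WAL archive;
 *   4. otherwise it falls back to the base archive;
 *   5. NULL means nothing usable was found.
 * 'buf' is only written to in the plain-format case.
 */
static const char *
sketch_choose_wal_path(const char *explicit_path, char format,
                       const char *backup_dir,
                       const char *wal_archive_path,
                       const char *base_archive_path,
                       char *buf, size_t buflen)
{
    if (explicit_path != NULL)
        return explicit_path;

    if (format == 'p')
    {
        int n = snprintf(buf, buflen, "%s/pg_wal", backup_dir);

        if (n < 0 || (size_t) n >= buflen)
            return NULL;        /* path did not fit in the buffer */
        return buf;
    }

    if (wal_archive_path != NULL)
        return wal_archive_path;
    if (base_archive_path != NULL)
        return base_archive_path;

    return NULL;
}
```

In the patch itself the two archive paths are filled in by precheck_tar_backup_file() while the backup directory is scanned, so the resolution has to happen after verify_tar_backup() returns.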
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-19 20:48 Zsolt Parragi <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Zsolt Parragi @ 2026-03-19 20:48 UTC (permalink / raw)
To: Amul Sul <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hello!
The --path option appears to be ignored when a positional argument is given; I think this is a bug.
This fails:
pg_waldump --path /wal/dir 000000010000000000000001
And this works:
pg_waldump --path /wal/dir --start 0/01000028 --end 0/010020F8
+{
+ int fname_len = strlen(fname);
+
Shouldn't this use size_t?
+ /*
+ * Setup temporary directory to store WAL segments and set up an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
Maybe creating this directory could be deferred until first use? If I
understand correctly, in a typical scenario pg_waldump won't use this
temporary directory, yet it always creates it.
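For illustration only, creation on first use could look roughly like the sketch below. It assumes the directory is made with mkdtemp, as the patch's commit message describes. The function name, the static variables and the "waldump_tmp-XXXXXX" template are hypothetical and are not taken from the patch. Cleanup (the exit callback) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical state for the sketch; the patch keeps its own bookkeeping. */
static char sketch_tmpdir[1024];
static bool sketch_tmpdir_created = false;

/*
 * Return the temporary WAL directory, creating it under 'parent' only on
 * the first call. Later calls return the same path without touching the
 * file system. Returns NULL if the path is too long or mkdtemp fails.
 */
static const char *
sketch_get_tmpwal_dir(const char *parent)
{
    if (!sketch_tmpdir_created)
    {
        int n = snprintf(sketch_tmpdir, sizeof(sketch_tmpdir),
                         "%s/waldump_tmp-XXXXXX", parent);

        if (n < 0 || (size_t) n >= sizeof(sketch_tmpdir))
            return NULL;
        if (mkdtemp(sketch_tmpdir) == NULL)
            return NULL;
        sketch_tmpdir_created = true;
    }

    return sketch_tmpdir;
}
```

With this shape the parent directory still has to be available wherever the first temporary file is written, so it must be carried along until then.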
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-20 11:31 Amul Sul <[email protected]>
parent: Zsolt Parragi <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-20 11:31 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Mar 20, 2026 at 2:18 AM Zsolt Parragi <[email protected]> wrote:
>
> Hello!
>
> The --path option appears to be ignored when a positional argument is given; I think this is a bug.
>
> This fails:
>
> pg_waldump --path /wal/dir 000000010000000000000001
>
> And this works:
>
> pg_waldump --path /wal/dir --start 0/01000028 --end 0/010020F8
>
Good catch! I've fixed this in the attached version and updated a test
case to cover this scenario.
> +{
> + int fname_len = strlen(fname);
> +
>
> Shouldn't this use size_t?
>
Okay, size_t can be used there. I've made that change in the attached version.
> + /*
> + * Setup temporary directory to store WAL segments and set up an exit
> + * callback to remove it upon completion.
> + */
> + setup_tmpwal_dir(waldir);
>
> Maybe creating this directory could be deferred until first use? If I
> understand correctly, in a typical scenario pg_waldump won't use this
> temporary directory, yet it always creates it.
Yeah, that optimization can be done, but passing the waldir -- which
is only used once -- to the point where the first temp file is created
would require quite a bit of code refactoring that doesn't seem to
offer much gain, IMO.
Regards,
Amul
Attachments:
[application/x-patch] v21-0001-Move-tar-detection-and-compression-logic-to-comm.patch (7.1K, 2-v21-0001-Move-tar-detection-and-compression-logic-to-comm.patch)
From 608372553eb1bb88285081a0382fe2c227c90d60 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:30 +0530
Subject: [PATCH v21 1/5] Move tar detection and compression logic to common.
Consolidate tar archive identification and compression-type detection
logic into a shared location. Currently used by pg_basebackup and
pg_verifybackup, this functionality is also required for upcoming
pg_waldump enhancements.
This change promotes code reuse and simplifies maintenance across
frontend tools.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..ae2089d9406 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the extension is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ size_t fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
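The suffix matching added in parse_tar_compress_algorithm() above can be tried outside the tree with a standalone sketch. The enum and both function names below are local stand-ins, not the real pg_compress_algorithm or the patch's function. The sketch factors the repeated length-and-strcmp test into a helper, which is a small restructuring and not how the patch writes it.

```c
#include <stdbool.h>
#include <string.h>

/* Local stand-in for pg_compress_algorithm; values are illustrative. */
typedef enum
{
    SKETCH_COMPRESSION_NONE,
    SKETCH_COMPRESSION_GZIP,
    SKETCH_COMPRESSION_LZ4,
    SKETCH_COMPRESSION_ZSTD
} sketch_compress_algorithm;

/* True if 'fname' ends with 'suffix' (a name equal to the suffix counts). */
static bool
sketch_has_suffix(const char *fname, const char *suffix)
{
    size_t fname_len = strlen(fname);
    size_t suffix_len = strlen(suffix);

    return fname_len >= suffix_len &&
        strcmp(fname + fname_len - suffix_len, suffix) == 0;
}

/* Same recognition rules as the patch: .tar, .tgz, .tar.gz, .tar.lz4, .tar.zst */
static bool
sketch_parse_tar_suffix(const char *fname,
                        sketch_compress_algorithm *algorithm)
{
    if (sketch_has_suffix(fname, ".tar"))
        *algorithm = SKETCH_COMPRESSION_NONE;
    else if (sketch_has_suffix(fname, ".tgz") ||
             sketch_has_suffix(fname, ".tar.gz"))
        *algorithm = SKETCH_COMPRESSION_GZIP;
    else if (sketch_has_suffix(fname, ".tar.lz4"))
        *algorithm = SKETCH_COMPRESSION_LZ4;
    else if (sketch_has_suffix(fname, ".tar.zst"))
        *algorithm = SKETCH_COMPRESSION_ZSTD;
    else
        return false;

    return true;
}
```

The >= length comparison matters because pg_verifybackup passes only the suffix (for example ".tar"), so a name consisting of nothing but the extension still has to match.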
[application/x-patch] v21-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch (8.4K, 3-v21-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch)
From 73b7ca37810df5c30391f4f09a199e672acd6b75 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:39 +0530
Subject: [PATCH v21 2/5] pg_waldump: Preparatory refactoring for tar archive
WAL decoding.
Several refactoring steps in preparation for adding tar archive WAL
decoding support to pg_waldump:
- Move XLogDumpPrivate and related declarations into a new pg_waldump.h
header, allowing a second source file to share them.
- Factor out required_read_len() so the read-size calculation can be
reused for both regular WAL files and tar-archived WAL.
- Move the WAL segment size variable into XLogDumpPrivate and rename it
to segsize, making it accessible to the archive streamer code.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_waldump/pg_waldump.c | 78 +++++++++++++++++++--------------
src/bin/pg_waldump/pg_waldump.h | 26 +++++++++++
2 files changed, 70 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
@@ -333,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -390,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
@@ -801,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -855,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1128,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1149,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1165,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1180,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1190,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1203,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1224,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..013b051506f
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ int segsize;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/x-patch] v21-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch (56.4K, 4-v21-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch)
From 9c87d8e8de19ed7be63874aea61df19a0cb41dd2 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:46 +0530
Subject: [PATCH v21 3/5] pg_waldump: Add support for reading WAL from tar
archives
pg_waldump can now accept the path to a tar archive (optionally
compressed with gzip, lz4, or zstd) containing WAL files and decode
them. This was added primarily for pg_verifybackup, which previously
had to skip WAL parsing for tar-format backups.
The implementation uses the existing archive streamer infrastructure
with a hash table to track WAL segments read from the archive. If WAL
files within the archive are not in sequential order, out-of-order
segments are written to a temporary directory (created via mkdtemp under
$TMPDIR or the archive's directory) and read back when needed. An
atexit callback ensures the temporary directory is cleaned up.
The --follow option is not supported when reading from a tar archive.
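The temporary-directory handling described above can be illustrated with a minimal standalone sketch. It is not the patch's code: the names make_spill_dir, cleanup_spill_dir, remove_entry, and spill_dir are hypothetical, and POSIX nftw() stands in for the rmtree() helper that the patch itself calls. It only assumes what the message states: a directory named waldump_tmp-XXXXXX is created with mkdtemp() under $TMPDIR, or under a fallback directory when TMPDIR is unset, and an atexit callback removes it.

```c
#define _XOPEN_SOURCE 700		/* for mkdtemp() and nftw() */

#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Path of the spill directory; NULL until make_spill_dir() succeeds. */
static char *spill_dir = NULL;

/* nftw() callback: remove one file or one already-emptied directory. */
static int
remove_entry(const char *path, const struct stat *sb, int flag,
			 struct FTW *ftw)
{
	(void) sb;
	(void) flag;
	(void) ftw;
	return remove(path);
}

/* atexit() callback: recursively delete the spill directory, if any. */
static void
cleanup_spill_dir(void)
{
	if (spill_dir == NULL)
		return;

	/* FTW_DEPTH visits a directory's contents before the directory itself. */
	nftw(spill_dir, remove_entry, 16, FTW_DEPTH | FTW_PHYS);
	free(spill_dir);
	spill_dir = NULL;
}

/*
 * Create "<base>/waldump_tmp-XXXXXX", where <base> is $TMPDIR if set and
 * fallback_dir otherwise.  Returns the new path, or NULL on failure.
 */
static const char *
make_spill_dir(const char *fallback_dir)
{
	const char *base = getenv("TMPDIR");
	size_t		len;
	char	   *path;

	assert(fallback_dir != NULL);

	if (spill_dir != NULL)
		return spill_dir;		/* already created */
	if (base == NULL)
		base = fallback_dir;

	/* sizeof() of the literal already counts the terminating NUL. */
	len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
	path = malloc(len);
	if (path == NULL)
		return NULL;
	snprintf(path, len, "%s/waldump_tmp-XXXXXX", base);

	/* mkdtemp() replaces XXXXXX and creates the directory with mode 0700. */
	if (mkdtemp(path) == NULL)
	{
		free(path);
		return NULL;
	}

	spill_dir = path;
	atexit(cleanup_spill_dir);
	return spill_dir;
}
```

In this sketch a caller would invoke make_spill_dir() once at startup, passing the archive's directory as the fallback, and write out-of-order segments beneath the returned path. Registering the cleanup through atexit means the directory is also removed when the program exits through a fatal-error path that calls exit().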
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_waldump.sgml | 23 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 847 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 295 ++++++++--
src/bin/pg_waldump/pg_waldump.h | 50 ++
src/bin/pg_waldump/t/001_basic.pl | 246 ++++++--
src/tools/pgindent/typedefs.list | 4 +
8 files changed, 1346 insertions(+), 130 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,21 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ whose name starts with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
+ </para>
</listitem>
</varlistentry>
@@ -383,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..9cbcae3e8af
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,847 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/file_perm.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Temporary exported WAL file directory */
+char *TmpWalSegDir = NULL;
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic when segments
+ * are archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
+
+ int read_len; /* total bytes of this segment read from archive */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo,
+ int WalSegSz);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader: opens the archive, builds a hash table
+ * for WAL entries, reads ahead until a full WAL page header is available to
+ * determine the WAL segment size, and computes start/end segment numbers for
+ * filtering. It also sets up a temporary directory for out-of-order WAL data
+ * and registers an atexit callback to clean it up.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo, const char *waldir,
+ int *WalSegSz, pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(waldir, privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* We must first parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* If the archive is compressed, decompress before parsing. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+
+#ifdef USE_ASSERT_CHECKING
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+#endif
+
+ /*
+ * Hash table storing WAL entries read from the archive with an arbitrary
+ * initial size.
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
+ * the first WAL segment in the archive so we can extract the WAL segment
+ * size from the long page header.
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Set WalSegSz if WAL data is successfully read */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ *WalSegSz = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno, *WalSegSz);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno, *WalSegSz);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the WAL falls
+ * outside this range, remove it from the hash table. Subsequent WAL will
+ * be filtered automatically by the archive streamer using the updated
+ * start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+
+ /*
+ * Set up a temporary directory to store WAL segments and register an exit
+ * callback to remove it upon completion.
+ */
+ setup_tmpwal_dir(waldir);
+ atexit(cleanup_tmpwal_dir_atexit);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding WAL, once we hit the end
+ * LSN, any remaining buffered data or unread portion of the archive can
+ * be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free any remaining hash table entries and their buffers. */
+ if (privateInfo->archive_wal_htab != NULL)
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *entry;
+
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->buf != NULL)
+ destroyStringInfo(entry->buf);
+ }
+ ArchivedWAL_destroy(privateInfo->archive_wal_htab);
+ privateInfo->archive_wal_htab = NULL;
+ }
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies the requested WAL data from the hash entry's buffer into readBuff.
+ * If the buffer does not yet contain the needed bytes, fetches more data from
+ * the tar archive via the archive streamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff, int WalSegSz)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, WalSegSz);
+ XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+ entry = get_archive_wal_entry(fname, privateInfo, WalSegSz);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /*
+ * Calculate the LSN range currently residing in the buffer.
+ *
+ * read_len tracks total bytes received for this segment (including
+ * already-discarded data), so endPtr is the LSN just past the last
+ * buffered byte, and startPtr is the LSN of the first buffered byte.
+ */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, WalSegSz, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring a full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data. Raise an error if the archive streamer
+ * has moved past our segment (meaning the WAL file in the archive
+ * is shorter than expected) or if reading the archive reached
+ * EOF.
+ */
+ if (privateInfo->cur_file != entry)
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
+ privateInfo->archive_name, fname,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return count unchanged. We could return a boolean since we
+ * either successfully read the WAL page or raise an error, but the caller
+ * expects this value to be returned. The routine that reads WAL pages
+ * from physical WAL files follows the same convention.
+ */
+ return count;
+}
+
+/*
+ * Releases the buffer of a WAL entry that is no longer needed, preventing the
+ * accumulation of irrelevant WAL data. Also removes any associated temporary
+ * file and clears privateInfo->cur_file if it points to this entry, so the
+ * archive streamer skips subsequent data for it.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
+ /* Clear cur_file if it points to the entry being freed */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it already exists.
+ * Otherwise, reads more data from the archive until the requested entry is
+ * found. If the archive streamer is reading a WAL file from the archive that
+ * is not currently needed, that data is spilled to a temporary file for later
+ * retrieval.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo,
+ int WalSegSz)
+{
+ ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
+
+ /*
+ * Search the hash table first. If the entry is found, return it.
+ * Otherwise, the requested WAL entry hasn't been read from the archive
+ * yet; invoke the archive streamer to fetch it.
+ */
+ while (1)
+ {
+ /*
+ * Search hash table.
+ *
+ * We perform the search inside the loop because a single iteration of
+ * the archive reader may decompress and extract multiple files into
+ * the hash table. One of these newly added files could be the one we
+ * are seeking.
+ */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * The WAL file entry currently being processed may change during
+ * archive streamer execution. Therefore, maintain a local variable to
+ * reference the previous entry, ensuring that any remaining data in
+ * its buffer is successfully flushed to the temporary file before
+ * switching to the next WAL entry.
+ */
+ entry = privateInfo->cur_file;
+
+ /*
+ * Fetch more data either when no current file is being tracked or
+ * when its buffer has been fully flushed to the temporary file.
+ */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /*
+ * Archive streamer is currently reading a file that isn't the one
+ * asked for, but it's required in the future. It should be written to
+ * a temporary location for retrieval when needed.
+ */
+ Assert(strcmp(fname, entry->fname) != 0);
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * The change in the current segment entry indicates that the reading
+ * of this file has ended.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads the archive file and passes it to the archive streamer for
+ * decompression.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Set up a temporary directory in which to store spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, fall back to the provided WAL directory to store WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ rmtree(TmpWalSegDir, true);
+}
+
+/*
+ * Create an empty placeholder file and return its handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Create an empty placeholder */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m", TmpWalSegDir, fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Further checks are skipped if any WAL file can be read.
+ * This typically occurs during initial verification.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+
+ /*
+ * End of this tar member; mark cur_file NULL so subsequent
+ * content callbacks (if any) know no WAL file is currently
+ * active.
+ */
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream. This is a
+ * terminal streamer so it must have no successor.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* WAL files from the top-level or pg_wal directory will be decoded */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* WAL file may appear with a full path (e.g., pg_wal/<name>) */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for WAL file hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..0f6d2372076 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -327,8 +327,8 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
}
/*
- * Returns the size in bytes of the data to be read. Returns -1 if the end
- * point has already been reached.
+ * Returns the number of bytes to read for the given page. Returns -1 if
+ * the end of the requested range has already been reached or passed.
*/
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -440,6 +440,106 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback to support dumping
+ * WAL files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int WalSegSz = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, free the buffer and/or
+ * temporary file disk space occupied by the previous segment's data.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment implies the previous buffer's data and that segment will
+ * not be needed again.
+ *
+ * Afterward, check for the next required WAL segment's physical existence
+ * in the temporary directory first before invoking the archive streamer.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, WalSegSz))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer.
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, WalSegSz);
+ free_archive_wal_entry(fname, private);
+ }
+
+ /*
+ * If the next segment exists, open it and continue reading from there
+ */
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff,
+ WalSegSz);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +877,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH a tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a pg_wal subdirectory containing such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -810,7 +910,9 @@ main(int argc, char **argv)
XLogRecord *record;
XLogRecPtr first_record;
char *waldir = NULL;
+ char *walpath = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +970,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -943,7 +1049,7 @@ main(int argc, char **argv)
}
break;
case 'p':
- waldir = pg_strdup(optarg);
+ walpath = pg_strdup(optarg);
break;
case 'q':
config.quiet = true;
@@ -1107,14 +1213,25 @@ main(int argc, char **argv)
goto bad_argument;
}
- if (waldir != NULL)
+ if (walpath != NULL)
{
+ /* validate path points to tar archive */
+ if (parse_tar_compress_algorithm(walpath, &compression))
+ {
+ char *fname = NULL;
+
+ split_path(walpath, &waldir, &fname);
+
+ private.archive_name = fname;
+ }
/* validate path points to directory */
- if (!verify_directory(waldir))
+ else if (!verify_directory(walpath))
{
- pg_log_error("could not open directory \"%s\": %m", waldir);
+ pg_log_error("could not open directory \"%s\": %m", walpath);
goto bad_argument;
}
+ else
+ waldir = walpath;
}
if (config.save_fullpage_path != NULL)
@@ -1128,6 +1245,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,69 +1266,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
- waldir = identify_target_directory(waldir, NULL, &private.segsize);
+ else if (!private.archive_name)
+ waldir = identify_target_directory(walpath, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1209,15 +1343,46 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /* --follow is not supported with tar archives */
+ if (config.follow && private.archive_name)
+ {
+ pg_log_error("--follow is not supported when reading from a tar archive");
+ goto bad_argument;
+ }
+
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL WAL directory indicates that the archive file is located in
+ * the current working directory.
+ */
+ if (waldir == NULL)
+ waldir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, waldir, &private.segsize, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1410,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1494,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..fd25792b33a 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,14 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declaration */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
+
+/* Temporary directory for spilling out-of-order WAL segments from archives */
+extern char *TmpWalSegDir;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +29,48 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_name; /* tar archive filename */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+
+#ifdef USE_ASSERT_CHECKING
+ Size archive_read_buf_size;
+#endif
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of all WAL files that the archive stream has read, including
+ * the one currently in progress.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Pre-computed segment numbers derived from startptr and endptr. Caching
+ * them avoids repeated XLByteToSeg() calls when filtering each archive
+ * member against the requested WAL range.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ const char *waldir, int *WalSegSz,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff,
+ int WalSegSz);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..11df7e092bf 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,9 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -162,6 +166,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Now consume all remaining room in the current WAL segment, leaving
+# space enough only for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate contrecord record
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -198,51 +238,23 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
command_like(
[
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
-command_like(
- [
- 'pg_waldump', '--quiet',
- $node->data_dir . '/pg_wal/' . $start_walfile
+ 'pg_waldump', '--quiet', '--path',
+ $node->data_dir . '/pg_wal/', $start_walfile
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +264,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +279,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -286,40 +299,145 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive, shuffle the file order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+ @files = shuffle @files;
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
+ });
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ SKIP:
+ {
+ skip "tar command is not available", 56
+ if (!defined $tar || $tar eq '') && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
+ }
+}
done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 174e2798443..3e2fc711a3e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -147,6 +147,9 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
+ArchivedWAL_iterator
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3542,6 +3545,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
[application/x-patch] v21-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch (16.7K, 5-v21-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch)
From 4afd782e69bce712df4a673615b7bfd36ae903ed Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Thu, 19 Mar 2026 15:43:53 +0530
Subject: [PATCH v21 4/5] pg_verifybackup: Enable WAL parsing for tar-format
backups
Now that pg_waldump supports reading WAL from tar archives, remove the
restriction that forced --no-parse-wal for tar-format backups.
pg_verifybackup now automatically locates the WAL archive: it looks for
a separate pg_wal.tar first, then falls back to the main base.tar. A
new --wal-path option (replacing the old --wal-directory, which is kept
as a silent alias) accepts either a directory or a tar archive path.
The default WAL directory preparation is deferred until the backup
format is known, since tar-format backups resolve the WAL path
differently from plain-format ones.
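
The lookup order described above can be sketched roughly as follows. This
is only an illustration under stated assumptions: choose_default_wal_path
and its parameters are hypothetical names that do not appear in the patch,
and the real code uses psprintf() and pg_log_error() rather than a
caller-supplied buffer and a NULL return.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): pick the default WAL
 * location when no -w/--wal-path was given.
 *
 * format is 'p' for a plain-format backup and 't' for a tar-format one.
 * wal_archive and base_archive are the paths discovered while scanning
 * the backup directory, or NULL if the corresponding archive was absent.
 * Returns NULL when no WAL source can be determined.
 */
static const char *
choose_default_wal_path(char format, const char *backup_dir,
                        const char *wal_archive, const char *base_archive,
                        char *buf, size_t buflen)
{
    if (format == 'p')
    {
        /* Plain format: WAL lives in <backup_dir>/pg_wal */
        snprintf(buf, buflen, "%s/pg_wal", backup_dir);
        return buf;
    }

    /* Tar format: prefer a separate WAL archive ... */
    if (wal_archive != NULL)
        return wal_archive;

    /* ... and fall back to the main data directory archive */
    if (base_archive != NULL)
        return base_archive;

    return NULL;                /* caller reports "WAL archive not found" */
}
```

With a tar-format backup that contains both archives, the sketch returns
the separate WAL archive; with only the main archive present, it returns
that one instead.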
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_verifybackup.sgml | 14 ++-
src/bin/pg_verifybackup/pg_verifybackup.c | 96 ++++++++++++-------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 20 +++-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 91 insertions(+), 57 deletions(-)
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -261,12 +258,13 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..b60ab8739d5 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -93,7 +97,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +130,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +140,9 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +228,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -285,10 +292,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -331,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -350,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -368,12 +360,35 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
+ if (!no_parse_wal && wal_path == NULL)
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+ "or use -n/--no-parse-wal to skip WAL parsing.");
+ exit(1);
+ }
+ }
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -787,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,17 +892,21 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
pg_compress_algorithm compress_algorithm;
tar_file *tar;
char *suffix = NULL;
+ bool is_base_archive = false;
+ bool is_wal_archive = false;
/* Should be tar format backup */
Assert(context->format == 't');
@@ -918,9 +939,15 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+ is_base_archive = true;
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+ is_wal_archive = true;
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
@@ -953,8 +980,13 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* Ignore WALs, as reading and verification will be handled through
* pg_waldump.
*/
- if (strncmp("pg_wal", relpath, 6) == 0)
+ if (is_wal_archive)
+ {
+ *wal_archive_path = pstrdup(fullpath);
return;
+ }
+ else if (is_base_archive)
+ *base_archive_path = pstrdup(fullpath);
/*
* Append the information to the list for complete verification at a later
@@ -1188,7 +1220,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1230,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1398,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-20 13:26 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-20 13:26 UTC (permalink / raw)
To: Zsolt Parragi <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Mar 20, 2026 at 5:01 PM Amul Sul <[email protected]> wrote:
>
> On Fri, Mar 20, 2026 at 2:18 AM Zsolt Parragi <[email protected]> wrote:
> >
> > Hello!
> >
> > Path is ignored with a positional argument, I think this is a bug?
> >
> > This fails:
> >
> > pg_waldump --path /wal/dir 000000010000000000000001
> >
> > And this works:
> >
> > pg_waldump --path /wal/dir --start 0/01000028 --end 0/010020F8
> >
>
> Good catch! I've fixed this in the attached version and updated a test
> case to cover this scenario.
>
> > +{
> > + int fname_len = strlen(fname);
> > +
> >
> > Shouldn't this use size_t?
> >
>
> Okay, that can be used. I’ve done the same in the attached version.
>
> > + /*
> > + * Setup temporary directory to store WAL segments and set up an exit
> > + * callback to remove it upon completion.
> > + */
> > + setup_tmpwal_dir(waldir);
> >
> > Maybe this could be deferred to be created only on first use? If I
> > understand correctly, in a typical scenario waldump won't use this
> > temporary directory, yet it always creates it.
>
> Yeah, that optimization can be done, but passing the waldir -- which
> is only used once -- to the point where the first temp file is created
> would require quite a bit of code refactoring that doesn't seem to
> offer much gain, IMO.
>
Since Andrew also leans toward creating the directory only when
needed, I have reconsidered the approach. I think we can pass waldir
(the archive directory) via XLogDumpPrivate, and I’ve implemented that
in the attached version.
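To illustrate the idea only (this is not the code in the attached patch), here is a minimal sketch of creating the temporary directory on first use. The DumpState struct, the get_tmp_dir() helper, and the exact "waldump_tmp-XXXXXX" template are hypothetical names chosen for this example; the lookup order (TMPDIR first, then the archive's directory) follows the documentation change in 0003.

```c
#define _POSIX_C_SOURCE 200809L	/* for mkdtemp() */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the relevant fields of XLogDumpPrivate. */
typedef struct DumpState
{
	const char *archive_dir;	/* directory containing the tar archive */
	char	   *tmp_dir;		/* NULL until the first spill is needed */
} DumpState;

/*
 * Return the temporary directory, creating it on the first call only.
 * Returns NULL if the directory cannot be created.
 */
static const char *
get_tmp_dir(DumpState *st)
{
	const char *base;
	char	   *path;
	size_t		len;

	assert(st != NULL);

	/* Already created by an earlier call: nothing to do. */
	if (st->tmp_dir != NULL)
		return st->tmp_dir;

	/* Prefer $TMPDIR; otherwise fall back to the archive's directory. */
	base = getenv("TMPDIR");
	if (base == NULL || base[0] == '\0')
		base = st->archive_dir;

	/* sizeof() of the literal already counts the terminating NUL. */
	len = strlen(base) + sizeof("/waldump_tmp-XXXXXX");
	path = malloc(len);
	if (path == NULL)
		return NULL;
	snprintf(path, len, "%s/waldump_tmp-XXXXXX", base);

	/* mkdtemp() replaces the XXXXXX part and creates the directory. */
	if (mkdtemp(path) == NULL)
	{
		free(path);
		return NULL;
	}

	st->tmp_dir = path;
	return st->tmp_dir;
}
```

With this shape, a run that never meets an out-of-order segment never calls get_tmp_dir() and so never touches the filesystem. Cleanup would still need to check whether tmp_dir is NULL before trying to remove anything.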
Regards,
Amul
Attachments:
[application/octet-stream] v22-0001-Move-tar-detection-and-compression-logic-to-comm.patch (7.1K, 2-v22-0001-Move-tar-detection-and-compression-logic-to-comm.patch)
From f338bb11c4d69ca092de6a85939b93a1bdb34190 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:30 +0530
Subject: [PATCH v22 1/5] Move tar detection and compression logic to common.
Consolidate tar archive identification and compression-type detection
logic into a shared location. Currently used by pg_basebackup and
pg_verifybackup, this functionality is also required for upcoming
pg_waldump enhancements.
This change promotes code reuse and simplifies maintenance across
frontend tools.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_basebackup/pg_basebackup.c | 36 +++++++----------------
src/bin/pg_verifybackup/pg_verifybackup.c | 12 +-------
src/common/compression.c | 30 +++++++++++++++++++
src/include/common/compression.h | 2 ++
4 files changed, 44 insertions(+), 36 deletions(-)
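Reviewer note (not part of the patch): the suffix matching that this patch consolidates can be tried standalone with the table-driven sketch below. The Algo enum and tar_suffix_algorithm() are hypothetical stand-ins for pg_compress_algorithm and parse_tar_compress_algorithm(); the patch itself uses an if/else chain, so this is an equivalent rewrite under that assumption, not the submitted code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm. */
typedef enum
{
	ALGO_NONE,
	ALGO_GZIP,
	ALGO_LZ4,
	ALGO_ZSTD
} Algo;

/*
 * Map a file name (or a bare suffix) to a compression algorithm.
 * Returns true and sets *algo when the name ends in a known tar suffix.
 */
static bool
tar_suffix_algorithm(const char *fname, Algo *algo)
{
	static const struct
	{
		const char *suffix;
		Algo		algo;
	}			known[] = {
		{".tar", ALGO_NONE},
		{".tgz", ALGO_GZIP},
		{".tar.gz", ALGO_GZIP},
		{".tar.lz4", ALGO_LZ4},
		{".tar.zst", ALGO_ZSTD},
	};
	size_t		fname_len = strlen(fname);

	for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++)
	{
		size_t		suffix_len = strlen(known[i].suffix);

		/* ">=" so that a bare suffix such as ".tar" also matches. */
		if (fname_len >= suffix_len &&
			strcmp(fname + fname_len - suffix_len, known[i].suffix) == 0)
		{
			*algo = known[i].algo;
			return true;
		}
	}

	return false;
}
```

The ">=" comparison matters because pg_verifybackup passes only the suffix (for example ".tar") rather than the full file name, whereas the old pg_basebackup checks used ">" against the full archive name.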
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index fa169a8d642..c1a4672aa6f 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1070,12 +1070,9 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
astreamer *manifest_inject_streamer = NULL;
bool inject_manifest;
bool is_tar,
- is_tar_gz,
- is_tar_lz4,
- is_tar_zstd,
is_compressed_tar;
+ pg_compress_algorithm compressed_tar_algorithm;
bool must_parse_archive;
- int archive_name_len = strlen(archive_name);
/*
* Normally, we emit the backup manifest as a separate file, but when
@@ -1084,24 +1081,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
*/
inject_manifest = (format == 't' && strcmp(basedir, "-") == 0 && manifest);
- /* Is this a tar archive? */
- is_tar = (archive_name_len > 4 &&
- strcmp(archive_name + archive_name_len - 4, ".tar") == 0);
-
- /* Is this a .tar.gz archive? */
- is_tar_gz = (archive_name_len > 7 &&
- strcmp(archive_name + archive_name_len - 7, ".tar.gz") == 0);
-
- /* Is this a .tar.lz4 archive? */
- is_tar_lz4 = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.lz4") == 0);
-
- /* Is this a .tar.zst archive? */
- is_tar_zstd = (archive_name_len > 8 &&
- strcmp(archive_name + archive_name_len - 8, ".tar.zst") == 0);
+ /* Check whether it is a tar archive and its compression type */
+ is_tar = parse_tar_compress_algorithm(archive_name,
+ &compressed_tar_algorithm);
/* Is this any kind of compressed tar? */
- is_compressed_tar = is_tar_gz || is_tar_lz4 || is_tar_zstd;
+ is_compressed_tar = (is_tar &&
+ compressed_tar_algorithm != PG_COMPRESSION_NONE);
/*
* Injecting the manifest into a compressed tar file would be possible if
@@ -1128,7 +1114,7 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
(spclocation == NULL && writerecoveryconf));
/* At present, we only know how to parse tar archives. */
- if (must_parse_archive && !is_tar && !is_compressed_tar)
+ if (must_parse_archive && !is_tar)
{
pg_log_error("cannot parse archive \"%s\"", archive_name);
pg_log_error_detail("Only tar archives can be parsed.");
@@ -1263,13 +1249,13 @@ CreateBackupStreamer(char *archive_name, char *spclocation,
* If the user has requested a server compressed archive along with
* archive extraction at client then we need to decompress it.
*/
- if (format == 'p')
+ if (format == 'p' && is_compressed_tar)
{
- if (is_tar_gz)
+ if (compressed_tar_algorithm == PG_COMPRESSION_GZIP)
streamer = astreamer_gzip_decompressor_new(streamer);
- else if (is_tar_lz4)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_LZ4)
streamer = astreamer_lz4_decompressor_new(streamer);
- else if (is_tar_zstd)
+ else if (compressed_tar_algorithm == PG_COMPRESSION_ZSTD)
streamer = astreamer_zstd_decompressor_new(streamer);
}
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index cbc9447384f..31f606c45b1 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -941,17 +941,7 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
}
/* Now, check the compression type of the tar */
- if (strcmp(suffix, ".tar") == 0)
- compress_algorithm = PG_COMPRESSION_NONE;
- else if (strcmp(suffix, ".tgz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.gz") == 0)
- compress_algorithm = PG_COMPRESSION_GZIP;
- else if (strcmp(suffix, ".tar.lz4") == 0)
- compress_algorithm = PG_COMPRESSION_LZ4;
- else if (strcmp(suffix, ".tar.zst") == 0)
- compress_algorithm = PG_COMPRESSION_ZSTD;
- else
+ if (!parse_tar_compress_algorithm(suffix, &compress_algorithm))
{
report_backup_error(context,
"file \"%s\" is not expected in a tar format backup",
diff --git a/src/common/compression.c b/src/common/compression.c
index 92cd4ec7a0d..ae2089d9406 100644
--- a/src/common/compression.c
+++ b/src/common/compression.c
@@ -41,6 +41,36 @@ static int expect_integer_value(char *keyword, char *value,
static bool expect_boolean_value(char *keyword, char *value,
pg_compress_specification *result);
+/*
+ * Look up a compression algorithm by archive file extension. Returns true and
+ * sets *algorithm if the extension is recognized. Otherwise returns false.
+ */
+bool
+parse_tar_compress_algorithm(const char *fname, pg_compress_algorithm *algorithm)
+{
+ size_t fname_len = strlen(fname);
+
+ if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tar") == 0)
+ *algorithm = PG_COMPRESSION_NONE;
+ else if (fname_len >= 4 &&
+ strcmp(fname + fname_len - 4, ".tgz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 7 &&
+ strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+ *algorithm = PG_COMPRESSION_GZIP;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+ *algorithm = PG_COMPRESSION_LZ4;
+ else if (fname_len >= 8 &&
+ strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+ *algorithm = PG_COMPRESSION_ZSTD;
+ else
+ return false;
+
+ return true;
+}
+
/*
* Look up a compression algorithm by name. Returns true and sets *algorithm
* if the name is recognized. Otherwise returns false.
diff --git a/src/include/common/compression.h b/src/include/common/compression.h
index 6c745b90066..f99c747cdd3 100644
--- a/src/include/common/compression.h
+++ b/src/include/common/compression.h
@@ -41,6 +41,8 @@ typedef struct pg_compress_specification
extern void parse_compress_options(const char *option, char **algorithm,
char **detail);
+extern bool parse_tar_compress_algorithm(const char *fname,
+ pg_compress_algorithm *algorithm);
extern bool parse_compress_algorithm(char *name, pg_compress_algorithm *algorithm);
extern const char *get_compress_algorithm_name(pg_compress_algorithm algorithm);
--
2.47.1
[application/octet-stream] v22-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch (8.4K, 3-v22-0002-pg_waldump-Preparatory-refactoring-for-tar-archi.patch)
From 07c53a039d8be2abedc60810ab62eca77d21bfb8 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:39 +0530
Subject: [PATCH v22 2/5] pg_waldump: Preparatory refactoring for tar archive
WAL decoding.
Several refactoring steps in preparation for adding tar archive WAL
decoding support to pg_waldump:
- Move XLogDumpPrivate and related declarations into a new pg_waldump.h
header, allowing a second source file to share them.
- Factor out required_read_len() so the read-size calculation can be
reused for both regular WAL files and tar-archived WAL.
- Move the WAL segment size variable into XLogDumpPrivate and rename it
to segsize, making it accessible to the archive streamer code.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
src/bin/pg_waldump/pg_waldump.c | 78 +++++++++++++++++++--------------
src/bin/pg_waldump/pg_waldump.h | 26 +++++++++++
2 files changed, 70 insertions(+), 34 deletions(-)
create mode 100644 src/bin/pg_waldump/pg_waldump.h
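Reviewer note (not part of the patch): the read-size rule factored out into required_read_len() can be checked with concrete numbers using the standalone sketch below. ReadLimit, read_len(), and the 8192-byte page size are assumptions made for this example; an endptr of 0 stands in for InvalidXLogRecPtr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed WAL page size; the real value comes from XLOG_BLCKSZ. */
#define PAGE_SIZE_ASSUMED 8192

/* Hypothetical stand-in for the fields of XLogDumpPrivate used here. */
typedef struct ReadLimit
{
	uint64_t	endptr;			/* 0 means "no end position given" */
	bool		endptr_reached;
} ReadLimit;

/*
 * Return how many bytes to read for the page starting at page_ptr,
 * or -1 if the requested data lies beyond the end position.
 */
static int
read_len(ReadLimit *lim, uint64_t page_ptr, int req_len)
{
	/* No end position: always read a whole page. */
	if (lim->endptr == 0)
		return PAGE_SIZE_ASSUMED;

	/* The whole page lies before the end position. */
	if (page_ptr + PAGE_SIZE_ASSUMED <= lim->endptr)
		return PAGE_SIZE_ASSUMED;

	/* Only part of the page is allowed, but it covers the request. */
	if (page_ptr + (uint64_t) req_len <= lim->endptr)
		return (int) (lim->endptr - page_ptr);

	/* The request itself crosses the end position. */
	lim->endptr_reached = true;
	return -1;
}
```

For example, with the page at 0x1000000 and an end position of 0x1000100, a request for 0x80 bytes yields a 0x100-byte read, while a request for 0x200 bytes returns -1 and sets endptr_reached.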
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f3446385d6a..5d31b15dbd8 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
#include "common/logging.h"
#include "common/relpath.h"
#include "getopt_long.h"
+#include "pg_waldump.h"
#include "rmgrdesc.h"
#include "storage/bufpage.h"
@@ -43,14 +44,6 @@ static volatile sig_atomic_t time_to_stop = false;
static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
-typedef struct XLogDumpPrivate
-{
- TimeLineID timeline;
- XLogRecPtr startptr;
- XLogRecPtr endptr;
- bool endptr_reached;
-} XLogDumpPrivate;
-
typedef struct XLogDumpConfig
{
/* display options */
@@ -333,6 +326,32 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
return NULL; /* not reached */
}
+/*
+ * Returns the size in bytes of the data to be read. Returns -1 if the end
+ * point has already been reached.
+ */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+ int reqLen)
+{
+ int count = XLOG_BLCKSZ;
+
+ if (XLogRecPtrIsValid(private->endptr))
+ {
+ if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+ count = XLOG_BLCKSZ;
+ else if (targetPagePtr + reqLen <= private->endptr)
+ count = private->endptr - targetPagePtr;
+ else
+ {
+ private->endptr_reached = true;
+ return -1;
+ }
+ }
+
+ return count;
+}
+
/* pg_waldump's XLogReaderRoutine->segment_open callback */
static void
WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -390,21 +409,12 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
XLogRecPtr targetPtr, char *readBuff)
{
XLogDumpPrivate *private = state->private_data;
- int count = XLOG_BLCKSZ;
+ int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- if (XLogRecPtrIsValid(private->endptr))
- {
- if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
- count = XLOG_BLCKSZ;
- else if (targetPagePtr + reqLen <= private->endptr)
- count = private->endptr - targetPagePtr;
- else
- {
- private->endptr_reached = true;
- return -1;
- }
- }
+ /* Bail out if the count to be read is not valid */
+ if (count < 0)
+ return -1;
if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
&errinfo))
@@ -801,7 +811,6 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
- int WalSegSz;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -855,6 +864,7 @@ main(int argc, char **argv)
memset(&stats, 0, sizeof(XLogStats));
private.timeline = 1;
+ private.segsize = 0;
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
@@ -1128,18 +1138,18 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &WalSegSz);
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
{
pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.startptr),
@@ -1149,7 +1159,7 @@ main(int argc, char **argv)
/* no second file specified, set end position */
if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
/* parse ENDSEG if passed */
if (optind + 1 < argc)
@@ -1165,14 +1175,14 @@ main(int argc, char **argv)
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
if (endsegno < segno)
pg_fatal("ENDSEG %s is before STARTSEG %s",
argv[optind + 1], argv[optind]);
if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
private.endptr);
/* set segno to endsegno for check of --end */
@@ -1180,8 +1190,8 @@ main(int argc, char **argv)
}
- if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
- private.endptr != (segno + 1) * WalSegSz)
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
{
pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
LSN_FORMAT_ARGS(private.endptr),
@@ -1190,7 +1200,7 @@ main(int argc, char **argv)
}
}
else
- waldir = identify_target_directory(waldir, NULL, &WalSegSz);
+ waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
if (!XLogRecPtrIsValid(private.startptr))
@@ -1203,7 +1213,7 @@ main(int argc, char **argv)
/* we have everything we need, start reading */
xlogreader_state =
- XLogReaderAllocate(WalSegSz, waldir,
+ XLogReaderAllocate(private.segsize, waldir,
XL_ROUTINE(.page_read = WALDumpReadPage,
.segment_open = WALDumpOpenSegment,
.segment_close = WALDumpCloseSegment),
@@ -1224,7 +1234,7 @@ main(int argc, char **argv)
* a segment (e.g. we were used in file mode).
*/
if (first_record != private.startptr &&
- XLogSegmentOffset(private.startptr, WalSegSz) != 0)
+ XLogSegmentOffset(private.startptr, private.segsize) != 0)
pg_log_info(ngettext("first record is after %X/%08X, at %X/%08X, skipping over %u byte",
"first record is after %X/%08X, at %X/%08X, skipping over %u bytes",
(first_record - private.startptr)),
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..013b051506f
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,26 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+ TimeLineID timeline;
+ int segsize;
+ XLogRecPtr startptr;
+ XLogRecPtr endptr;
+ bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* PG_WALDUMP_H */
--
2.47.1
[application/octet-stream] v22-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch (56.9K, 4-v22-0003-pg_waldump-Add-support-for-reading-WAL-from-tar-.patch)
From 1524a491d93f1abb34530bc9fef4116bbd84d33a Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Thu, 19 Mar 2026 15:43:46 +0530
Subject: [PATCH v22 3/5] pg_waldump: Add support for reading WAL from tar
archives
pg_waldump can now accept the path to a tar archive (optionally
compressed with gzip, lz4, or zstd) containing WAL files and decode
them. This was added primarily for pg_verifybackup, which previously
had to skip WAL parsing for tar-format backups.
The implementation uses the existing archive streamer infrastructure
with a hash table to track WAL segments read from the archive. If WAL
files within the archive are not in sequential order, out-of-order
segments are written to a temporary directory (created via mkdtemp under
$TMPDIR or the archive's directory) and read back when needed. An
atexit callback ensures the temporary directory is cleaned up.
The --follow option is not supported when reading from a tar archive.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Zsolt Parragi <[email protected]>
discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_waldump.sgml | 23 +-
src/bin/pg_waldump/Makefile | 7 +-
src/bin/pg_waldump/archive_waldump.c | 861 +++++++++++++++++++++++++++
src/bin/pg_waldump/meson.build | 4 +-
src/bin/pg_waldump/pg_waldump.c | 288 +++++++--
src/bin/pg_waldump/pg_waldump.h | 51 ++
src/bin/pg_waldump/t/001_basic.pl | 246 ++++++--
src/tools/pgindent/typedefs.list | 4 +
8 files changed, 1356 insertions(+), 128 deletions(-)
create mode 100644 src/bin/pg_waldump/archive_waldump.c
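Reviewer note (not part of the patch): the handling of out-of-order segments described in the commit message can be pictured with the toy model below. It is a simplification under stated assumptions: segments are small integers instead of file names, a bool array replaces the hash table and the temporary files, and segments older than the one currently wanted are ignored. replay_archive() and MAX_SEGS are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 64

/*
 * Walk the archive members in their stored order.  A segment that shows
 * up before it is wanted is marked as spilled; once the wanted segment
 * arrives, it and any spilled successors are consumed in sequence.
 *
 * consumed[] must have room for n entries.  Returns the number of
 * segments that had to be spilled.
 */
static int
replay_archive(const int *archive_order, size_t n, int first_wanted,
			   int *consumed)
{
	bool		spilled[MAX_SEGS] = {false};
	int			wanted = first_wanted;
	int			nspilled = 0;
	size_t		nconsumed = 0;

	for (size_t i = 0; i < n; i++)
	{
		int			seg = archive_order[i];

		/* Ignore out-of-range members and segments already passed. */
		if (seg < 0 || seg >= MAX_SEGS || seg < wanted)
			continue;

		/* Arrived too early: set it aside for later. */
		if (seg != wanted)
		{
			spilled[seg] = true;
			nspilled++;
			continue;
		}

		/* The wanted segment: consume it straight from the stream. */
		consumed[nconsumed++] = seg;
		wanted++;

		/* Then drain any spilled segments that directly follow. */
		while (wanted < MAX_SEGS && spilled[wanted])
		{
			spilled[wanted] = false;
			consumed[nconsumed++] = wanted;
			wanted++;
		}
	}

	return nspilled;
}
```

In this model an archive stored as 3, 1, 2, 4 with segment 1 wanted first spills only segment 3 and still consumes 1, 2, 3, 4 in order, while an archive already in sequential order spills nothing.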
diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d1715ff5124..9bbb4bd5772 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,21 @@ PostgreSQL documentation
<term><option>--path=<replaceable>path</replaceable></option></term>
<listitem>
<para>
- Specifies a directory to search for WAL segment files or a
- directory with a <literal>pg_wal</literal> subdirectory that
+ Specifies a tar archive or a directory to search for WAL segment files
+ or a directory with a <literal>pg_wal</literal> subdirectory that
contains such files. The default is to search in the current
directory, the <literal>pg_wal</literal> subdirectory of the
current directory, and the <literal>pg_wal</literal> subdirectory
of <envar>PGDATA</envar>.
</para>
+ <para>
+ If a tar archive is provided and its WAL segment files are not in
+ sequential order, those files will be written to a temporary directory
+ named starting with <filename>waldump_tmp</filename>. This directory will be
+ created inside the directory specified by the <envar>TMPDIR</envar>
+ environment variable if it is set; otherwise, it will be created within
+ the same directory as the tar archive.
+ </para>
</listitem>
</varlistentry>
@@ -383,6 +391,17 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><envar>TMPDIR</envar></term>
+ <listitem>
+ <para>
+ Directory in which to create temporary files when reading WAL from a
+ tar archive with out-of-order segment files. If not set, the temporary
+ directory is created within the same directory as the tar archive.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</refsect1>
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..aabb87566a2 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
PGFILEDESC = "pg_waldump - decode and display WAL"
PGAPPICON=win32
+# make these available to TAP test scripts
+export TAR
+
subdir = src/bin/pg_waldump
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
@@ -10,13 +13,15 @@ include $(top_builddir)/src/Makefile.global
OBJS = \
$(RMGRDESCOBJS) \
$(WIN32RES) \
+ archive_waldump.o \
compat.o \
pg_waldump.o \
rmgrdesc.o \
xlogreader.o \
xlogstats.o
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
new file mode 100644
index 00000000000..f372777366e
--- /dev/null
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -0,0 +1,861 @@
+/*-------------------------------------------------------------------------
+ *
+ * archive_waldump.c
+ * A generic facility for reading WAL data from tar archives via archive
+ * streamer.
+ *
+ * Portions Copyright (c) 2026, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/bin/pg_waldump/archive_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "common/file_perm.h"
+#include "common/hashfn.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/* Temporary directory for spilled WAL segment files */
+char *TmpWalSegDir = NULL;
+
+/*
+ * Check if the start segment number is zero; this indicates a request to read
+ * any WAL file.
+ */
+#define READ_ANY_WAL(privateInfo) ((privateInfo)->start_segno == 0)
+
+/*
+ * Hash entry representing a WAL segment retrieved from the archive.
+ *
+ * While WAL segments are typically read sequentially, individual entries
+ * maintain their own buffers for the following reasons:
+ *
+ * 1. Boundary Handling: The archive streamer provides a continuous byte
+ * stream. A single streaming chunk may contain the end of one WAL segment
+ * and the start of the next. Separate buffers allow us to easily
+ * partition and track these bytes by their respective segments.
+ *
+ * 2. Out-of-Order Support: Dedicated buffers simplify logic when segments
+ * are archived or retrieved out of sequence.
+ *
+ * To minimize the memory footprint, entries and their associated buffers are
+ * freed immediately once consumed. Since pg_waldump does not request the same
+ * bytes twice, a segment is discarded as soon as pg_waldump moves past it.
+ */
+typedef struct ArchivedWALFile
+{
+ uint32 status; /* hash status */
+ const char *fname; /* hash key: WAL segment name */
+
+ StringInfo buf; /* holds WAL bytes read from archive */
+ bool spilled; /* true if the WAL data was spilled to a
+ * temporary file */
+
+ int read_len; /* total bytes received from archive for this
+ * segment, including already-consumed data */
+} ArchivedWALFile;
+
+static uint32 hash_string_pointer(const char *s);
+#define SH_PREFIX ArchivedWAL
+#define SH_ELEMENT_TYPE ArchivedWALFile
+#define SH_KEY_TYPE const char *
+#define SH_KEY fname
+#define SH_HASH_KEY(tb, key) hash_string_pointer(key)
+#define SH_EQUAL(tb, a, b) (strcmp(a, b) == 0)
+#define SH_SCOPE static inline
+#define SH_RAW_ALLOCATOR pg_malloc0
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+typedef struct astreamer_waldump
+{
+ astreamer base;
+ XLogDumpPrivate *privateInfo;
+} astreamer_waldump;
+
+static ArchivedWALFile *get_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static void setup_tmpwal_dir(const char *waldir);
+static void cleanup_tmpwal_dir_atexit(void);
+
+static FILE *prepare_tmp_write(const char *fname, XLogDumpPrivate *privateInfo);
+static void perform_tmp_write(const char *fname, StringInfo buf, FILE *file);
+
+static astreamer *astreamer_waldump_new(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+ astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_wal_file(astreamer_waldump *mystreamer,
+ astreamer_member *member,
+ char **fname);
+
+static const astreamer_ops astreamer_waldump_ops = {
+ .content = astreamer_waldump_content,
+ .finalize = astreamer_waldump_finalize,
+ .free = astreamer_waldump_free
+};
+
+/*
+ * Initializes the tar archive reader: opens the archive, builds a hash table
+ * for WAL entries, reads ahead until a full WAL page header is available to
+ * determine the WAL segment size, and computes start/end segment numbers for
+ * filtering.
+ */
+void
+init_archive_reader(XLogDumpPrivate *privateInfo,
+ pg_compress_algorithm compression)
+{
+ int fd;
+ astreamer *streamer;
+ ArchivedWALFile *entry = NULL;
+ XLogLongPageHeader longhdr;
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /* Open tar archive and store its file descriptor */
+ fd = open_file_in_directory(privateInfo->archive_dir,
+ privateInfo->archive_name);
+
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
+
+ privateInfo->archive_fd = fd;
+
+ streamer = astreamer_waldump_new(privateInfo);
+
+ /* We must first parse the tar archive. */
+ streamer = astreamer_tar_parser_new(streamer);
+
+ /* If the archive is compressed, decompress before parsing. */
+ if (compression == PG_COMPRESSION_GZIP)
+ streamer = astreamer_gzip_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_LZ4)
+ streamer = astreamer_lz4_decompressor_new(streamer);
+ else if (compression == PG_COMPRESSION_ZSTD)
+ streamer = astreamer_zstd_decompressor_new(streamer);
+
+ privateInfo->archive_streamer = streamer;
+
+ /*
+ * Allocate a buffer for reading the archive file to facilitate content
+ * decoding; read requests must not exceed the allocated buffer size.
+ */
+ privateInfo->archive_read_buf = pg_malloc(READ_CHUNK_SIZE);
+
+#ifdef USE_ASSERT_CHECKING
+ privateInfo->archive_read_buf_size = READ_CHUNK_SIZE;
+#endif
+
+ /*
+ * Create the hash table that stores WAL entries read from the archive;
+ * the initial size is arbitrary.
+ */
+ privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
+
+ /*
+ * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
+ * the first WAL segment in the archive so we can extract the WAL segment
+ * size from the long page header.
+ */
+ while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ {
+ if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+
+ entry = privateInfo->cur_file;
+ }
+
+ /* Extract the WAL segment size from the long page header */
+ longhdr = (XLogLongPageHeader) entry->buf->data;
+
+ if (!IsValidWalSegSize(longhdr->xlp_seg_size))
+ {
+ pg_log_error(ngettext("invalid WAL segment size in WAL file from archive \"%s\" (%d byte)",
+ "invalid WAL segment size in WAL file from archive \"%s\" (%d bytes)",
+ longhdr->xlp_seg_size),
+ privateInfo->archive_name, longhdr->xlp_seg_size);
+ pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+ exit(1);
+ }
+
+ privateInfo->segsize = longhdr->xlp_seg_size;
+
+ /*
+ * With the WAL segment size available, we can now initialize the
+ * dependent start and end segment numbers.
+ */
+ Assert(!XLogRecPtrIsInvalid(privateInfo->startptr));
+ XLByteToSeg(privateInfo->startptr, privateInfo->start_segno,
+ privateInfo->segsize);
+
+ if (!XLogRecPtrIsInvalid(privateInfo->endptr))
+ XLByteToSeg(privateInfo->endptr, privateInfo->end_segno,
+ privateInfo->segsize);
+
+ /*
+ * This WAL segment was fetched before the filtering parameters
+ * (start_segno and end_segno) were fully initialized. Perform the
+ * relevance check against the user-provided range now; if the segment
+ * falls outside this range, remove it from the hash table. Subsequent
+ * segments will be filtered automatically by the archive streamer using
+ * the updated start_segno and end_segno values.
+ */
+ XLogFromFileName(entry->fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ free_archive_wal_entry(entry->fname, privateInfo);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+void
+free_archive_reader(XLogDumpPrivate *privateInfo)
+{
+ /*
+ * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+ * flush any remaining buffered data or to ensure the end of the tar
+ * archive is reached. However, when decoding WAL, once we hit the end
+ * LSN, any remaining buffered data or unread portion of the archive can
+ * be safely ignored.
+ */
+ astreamer_free(privateInfo->archive_streamer);
+
+ /* Free any remaining hash table entries and their buffers. */
+ if (privateInfo->archive_wal_htab != NULL)
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *entry;
+
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->buf != NULL)
+ destroyStringInfo(entry->buf);
+ }
+ ArchivedWAL_destroy(privateInfo->archive_wal_htab);
+ privateInfo->archive_wal_htab = NULL;
+ }
+
+ /* Free the reusable read buffer. */
+ if (privateInfo->archive_read_buf != NULL)
+ {
+ pg_free(privateInfo->archive_read_buf);
+ privateInfo->archive_read_buf = NULL;
+ }
+
+ /* Close the file. */
+ if (close(privateInfo->archive_fd) != 0)
+ pg_log_error("could not close file \"%s\": %m",
+ privateInfo->archive_name);
+}
+
+/*
+ * Copies the requested WAL data from the hash entry's buffer into readBuff.
+ * If the buffer does not yet contain the needed bytes, fetches more data from
+ * the tar archive via the archive streamer.
+ */
+int
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
+{
+ char *p = readBuff;
+ Size nbytes = count;
+ XLogRecPtr recptr = targetPagePtr;
+ int segsize = privateInfo->segsize;
+ XLogSegNo segno;
+ char fname[MAXFNAMELEN];
+ ArchivedWALFile *entry;
+
+ /* Identify the segment and locate its entry in the archive hash */
+ XLByteToSeg(targetPagePtr, segno, segsize);
+ XLogFileName(fname, privateInfo->timeline, segno, segsize);
+ entry = get_archive_wal_entry(fname, privateInfo);
+
+ while (nbytes > 0)
+ {
+ char *buf = entry->buf->data;
+ int bufLen = entry->buf->len;
+ XLogRecPtr endPtr;
+ XLogRecPtr startPtr;
+
+ /*
+ * Calculate the LSN range currently residing in the buffer.
+ *
+ * read_len tracks total bytes received for this segment (including
+ * already-discarded data), so endPtr is the LSN just past the last
+ * buffered byte, and startPtr is the LSN of the first buffered byte.
+ */
+ XLogSegNoOffsetToRecPtr(segno, entry->read_len, segsize, endPtr);
+ startPtr = endPtr - bufLen;
+
+ /*
+ * Copy the requested WAL data if it is present in the buffer.
+ */
+ if (bufLen > 0 && startPtr <= recptr && recptr < endPtr)
+ {
+ int copyBytes;
+ int offset = recptr - startPtr;
+
+ /*
+ * Given startPtr <= recptr < endPtr and a total buffer size
+ * 'bufLen', the offset (recptr - startPtr) will always be less
+ * than 'bufLen'.
+ */
+ Assert(offset < bufLen);
+
+ copyBytes = Min(nbytes, bufLen - offset);
+ memcpy(p, buf + offset, copyBytes);
+
+ /* Update state for read */
+ recptr += copyBytes;
+ nbytes -= copyBytes;
+ p += copyBytes;
+ }
+ else
+ {
+ /*
+ * Before starting the actual decoding loop, pg_waldump tries to
+ * locate the first valid record from the user-specified start
+ * position, which might not be the start of a WAL record and
+ * could fall in the middle of a record that spans multiple pages.
+ * Consequently, the valid start position the decoder is looking
+ * for could be far away from that initial position.
+ *
+ * This may involve reading across multiple pages, and this
+ * pre-reading fetches data in multiple rounds from the archive
+ * streamer; normally, we would throw away existing buffer
+ * contents to fetch the next set of data, but that existing data
+ * might be needed once the main loop starts. Because previously
+ * read data cannot be re-read by the archive streamer, we delay
+ * resetting the buffer until the main decoding loop is entered.
+ *
+ * Once pg_waldump has entered the main loop, it may re-read the
+ * currently active page, but never an older one; therefore, any
+ * fully consumed WAL data preceding the current page can then be
+ * safely discarded.
+ */
+ if (privateInfo->decoding_started)
+ {
+ resetStringInfo(entry->buf);
+
+ /*
+ * Push back the partial page data for the current page to the
+ * buffer, ensuring a full page remains available for
+ * re-reading if requested.
+ */
+ if (p > readBuff)
+ {
+ Assert((count - nbytes) > 0);
+ appendBinaryStringInfo(entry->buf, readBuff, count - nbytes);
+ }
+ }
+
+ /*
+ * Now, fetch more data. Raise an error if the archive streamer
+ * has moved past our segment (meaning the WAL file in the archive
+ * is shorter than expected) or if reading the archive reached
+ * EOF.
+ */
+ if (privateInfo->cur_file != entry)
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
+ privateInfo->archive_name, fname,
+ (long long int) (count - nbytes),
+ (long long int) count);
+ }
+ }
+
+ /*
+ * Should have successfully read all the requested bytes or reported a
+ * failure before this point.
+ */
+ Assert(nbytes == 0);
+
+ /*
+ * NB: We return count unchanged. We could return a boolean since we
+ * either successfully read the WAL page or raise an error, but the caller
+ * expects this value to be returned. The routine that reads WAL pages
+ * from physical WAL files follows the same convention.
+ */
+ return count;
+}
+
+/*
+ * Releases the buffer of a WAL entry that is no longer needed, preventing the
+ * accumulation of irrelevant WAL data. Also removes any associated temporary
+ * file and clears privateInfo->cur_file if it points to this entry, so the
+ * archive streamer skips subsequent data for it.
+ */
+void
+free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry;
+
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry == NULL)
+ return;
+
+ /* Destroy the buffer */
+ destroyStringInfo(entry->buf);
+ entry->buf = NULL;
+
+ /* Remove temporary file if any */
+ if (entry->spilled)
+ {
+ char fpath[MAXPGPATH];
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ if (unlink(fpath) == 0)
+ pg_log_debug("removed file \"%s\"", fpath);
+ }
+
+ /* Clear cur_file if it points to the entry being freed */
+ if (privateInfo->cur_file == entry)
+ privateInfo->cur_file = NULL;
+
+ ArchivedWAL_delete_item(privateInfo->archive_wal_htab, entry);
+}
+
+/*
+ * Returns the archived WAL entry from the hash table if it already exists.
+ * Otherwise, reads more data from the archive until the requested entry is
+ * found. If the archive streamer is reading a WAL file from the archive that
+ * is not currently needed, that data is spilled to a temporary file for later
+ * retrieval.
+ */
+static ArchivedWALFile *
+get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ ArchivedWALFile *entry = NULL;
+ FILE *write_fp = NULL;
+
+ /*
+ * Search the hash table first. If the entry is found, return it.
+ * Otherwise, the requested WAL entry hasn't been read from the archive
+ * yet; invoke the archive streamer to fetch it.
+ */
+ while (1)
+ {
+ /*
+ * Search hash table.
+ *
+ * We perform the search inside the loop because a single iteration of
+ * the archive reader may decompress and extract multiple files into
+ * the hash table. One of these newly added files could be the one we
+ * are seeking.
+ */
+ entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
+
+ if (entry != NULL)
+ return entry;
+
+ /*
+ * Capture the current entry before calling read_archive_file(),
+ * because cur_file may advance to a new segment during streaming. We
+ * hold this reference so we can flush any remaining buffer data and
+ * close the write handle once we detect that cur_file has moved on.
+ */
+ entry = privateInfo->cur_file;
+
+ /*
+ * Fetch more data either when no current file is being tracked or
+ * when its buffer has been fully flushed to the temporary file.
+ */
+ if (entry == NULL || entry->buf->len == 0)
+ {
+ if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ break; /* archive file ended */
+ }
+
+ /*
+ * Archive streamer is reading a non-WAL file or an irrelevant WAL
+ * file.
+ */
+ if (entry == NULL)
+ continue;
+
+ /*
+ * The streamer is producing a WAL segment that isn't the one asked
+ * for; it must be arriving out of order. Spill its data to disk so
+ * it can be read back when needed.
+ */
+ Assert(strcmp(fname, entry->fname) != 0);
+
+ /* Create a temporary file if one does not already exist */
+ if (!entry->spilled)
+ {
+ write_fp = prepare_tmp_write(entry->fname, privateInfo);
+ entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(entry->fname, entry->buf, write_fp);
+ resetStringInfo(entry->buf);
+
+ /*
+ * If cur_file changed since we captured entry above, the archive
+ * streamer has finished this segment and moved on. Close its spill
+ * file handle so data is flushed to disk before the next segment
+ * starts writing to a different handle.
+ */
+ if (entry != privateInfo->cur_file && write_fp != NULL)
+ {
+ fclose(write_fp);
+ write_fp = NULL;
+ }
+ }
+
+ /* Requested WAL segment not found */
+ pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
+ fname, privateInfo->archive_name);
+}
+
+/*
+ * Reads a chunk from the archive file and passes it through the streamer
+ * pipeline for decompression (if needed) and tar member extraction.
+ */
+static int
+read_archive_file(XLogDumpPrivate *privateInfo, Size count)
+{
+ int rc;
+
+ /* The read request must not exceed the allocated buffer size. */
+ Assert(privateInfo->archive_read_buf_size >= count);
+
+ rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
+ if (rc < 0)
+ pg_fatal("could not read file \"%s\": %m",
+ privateInfo->archive_name);
+
+ /*
+ * Decompress (if required), and then parse the previously read contents
+ * of the tar file.
+ */
+ if (rc > 0)
+ astreamer_content(privateInfo->archive_streamer, NULL,
+ privateInfo->archive_read_buf, rc,
+ ASTREAMER_UNKNOWN);
+
+ return rc;
+}
+
+/*
+ * Create a temporary directory for storing spilled WAL segments.
+ */
+static void
+setup_tmpwal_dir(const char *waldir)
+{
+ char *template;
+
+ Assert(TmpWalSegDir == NULL);
+
+ /*
+ * Use the directory specified by the TMPDIR environment variable. If it's
+ * not set, fall back to the provided WAL directory to store WAL files
+ * temporarily.
+ */
+ template = psprintf("%s/waldump_tmp-XXXXXX",
+ getenv("TMPDIR") ? getenv("TMPDIR") : waldir);
+ TmpWalSegDir = mkdtemp(template);
+
+ if (TmpWalSegDir == NULL)
+ pg_fatal("could not create directory \"%s\": %m", template);
+
+ canonicalize_path(TmpWalSegDir);
+
+ pg_log_debug("created directory \"%s\"", TmpWalSegDir);
+}
+
+/*
+ * Remove temporary directory at exit, if any.
+ */
+static void
+cleanup_tmpwal_dir_atexit(void)
+{
+ Assert(TmpWalSegDir != NULL);
+
+ rmtree(TmpWalSegDir, true);
+
+ TmpWalSegDir = NULL;
+}
+
+/*
+ * Open a file in the temporary spill directory for writing an out-of-order
+ * WAL segment, creating the directory and registering the cleanup callback
+ * if not already done. Returns the open file handle.
+ */
+static FILE *
+prepare_tmp_write(const char *fname, XLogDumpPrivate *privateInfo)
+{
+ char fpath[MAXPGPATH];
+ FILE *file;
+
+ /*
+ * Set up the temporary directory for WAL segments and register an exit
+ * callback to remove it, if that has not been done already.
+ */
+ if (unlikely(TmpWalSegDir == NULL))
+ {
+ setup_tmpwal_dir(privateInfo->archive_dir);
+ atexit(cleanup_tmpwal_dir_atexit);
+ }
+
+ snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
+
+ /* Open the spill file for writing */
+ file = fopen(fpath, PG_BINARY_W);
+ if (file == NULL)
+ pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+ if (chmod(fpath, pg_file_create_mode))
+ pg_fatal("could not set permissions on file \"%s\": %m",
+ fpath);
+#endif
+
+ pg_log_debug("spilling to temporary file \"%s\"", fpath);
+
+ return file;
+}
+
+/*
+ * Write buffer data to the given file handle.
+ */
+static void
+perform_tmp_write(const char *fname, StringInfo buf, FILE *file)
+{
+ Assert(file);
+
+ errno = 0;
+ if (buf->len > 0 && fwrite(buf->data, buf->len, 1, file) != 1)
+ {
+ /*
+ * If write didn't set errno, assume problem is no disk space
+ */
+ if (errno == 0)
+ errno = ENOSPC;
+ pg_fatal("could not write to file \"%s/%s\": %m", TmpWalSegDir, fname);
+ }
+}
+
+/*
+ * Create an astreamer that can read WAL from a tar file.
+ */
+static astreamer *
+astreamer_waldump_new(XLogDumpPrivate *privateInfo)
+{
+ astreamer_waldump *streamer;
+
+ streamer = palloc0_object(astreamer_waldump);
+ *((const astreamer_ops **) &streamer->base.bbs_ops) =
+ &astreamer_waldump_ops;
+
+ streamer->privateInfo = privateInfo;
+
+ return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL data from a tar
+ * file. If a member is identified as a valid WAL file, a hash entry is created
+ * for it, and its contents are copied into that entry's buffer, making them
+ * accessible to the decoding routine.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+ const char *data, int len,
+ astreamer_archive_context context)
+{
+ astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+ XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+ Assert(context != ASTREAMER_UNKNOWN);
+
+ switch (context)
+ {
+ case ASTREAMER_MEMBER_HEADER:
+ {
+ char *fname = NULL;
+ ArchivedWALFile *entry;
+ bool found;
+
+ pg_log_debug("reading \"%s\"", member->pathname);
+
+ if (!member_is_wal_file(mystreamer, member, &fname))
+ break;
+
+ /*
+ * Skip range filtering during initial startup, before the WAL
+ * segment size and segment number bounds are known.
+ */
+ if (!READ_ANY_WAL(privateInfo))
+ {
+ XLogSegNo segno;
+ TimeLineID timeline;
+
+ /*
+ * Skip the segment if the timeline does not match or if it
+ * falls outside the caller-specified range.
+ */
+ XLogFromFileName(fname, &timeline, &segno, privateInfo->segsize);
+ if (privateInfo->timeline != timeline ||
+ privateInfo->start_segno > segno ||
+ privateInfo->end_segno < segno)
+ {
+ pfree(fname);
+ break;
+ }
+ }
+
+ entry = ArchivedWAL_insert(privateInfo->archive_wal_htab,
+ fname, &found);
+
+ /*
+ * Shouldn't happen, but if it does, simply ignore the
+ * duplicate WAL file.
+ */
+ if (found)
+ {
+ pg_log_warning("ignoring duplicate WAL \"%s\" found in archive \"%s\"",
+ member->pathname, privateInfo->archive_name);
+ pfree(fname);
+ break;
+ }
+
+ entry->buf = makeStringInfo();
+ entry->spilled = false;
+ entry->read_len = 0;
+ privateInfo->cur_file = entry;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_CONTENTS:
+ if (privateInfo->cur_file)
+ {
+ appendBinaryStringInfo(privateInfo->cur_file->buf, data, len);
+ privateInfo->cur_file->read_len += len;
+ }
+ break;
+
+ case ASTREAMER_MEMBER_TRAILER:
+
+ /*
+ * End of this tar member; mark cur_file NULL so subsequent
+ * content callbacks (if any) know no WAL file is currently
+ * active.
+ */
+ privateInfo->cur_file = NULL;
+ break;
+
+ case ASTREAMER_ARCHIVE_TRAILER:
+ break;
+
+ default:
+ /* Shouldn't happen. */
+ pg_fatal("unexpected state while parsing tar file");
+ }
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream. This is a
+ * terminal streamer so it must have no successor.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+ Assert(streamer->bbs_next == NULL);
+ pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format. If
+ * successful, it also outputs the WAL segment name.
+ */
+static bool
+member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
+ char **fname)
+{
+ int pathlen;
+ char pathname[MAXPGPATH];
+ char *filename;
+
+ /* We are only interested in normal files */
+ if (member->is_directory || member->is_link)
+ return false;
+
+ if (strlen(member->pathname) < XLOG_FNAME_LEN)
+ return false;
+
+ /*
+ * For a correct comparison, we must remove any '.' or '..' components
+ * from the member pathname. Similar to member_verify_header(), we prepend
+ * './' to the path so that canonicalize_path() can properly resolve and
+ * strip these references from the tar member name.
+ */
+ snprintf(pathname, MAXPGPATH, "./%s", member->pathname);
+ canonicalize_path(pathname);
+ pathlen = strlen(pathname);
+
+ /* Skip files in subdirectories other than pg_wal/ */
+ if (pathlen > XLOG_FNAME_LEN &&
+ strncmp(pathname, XLOGDIR, strlen(XLOGDIR)) != 0)
+ return false;
+
+ /* WAL file may appear with a full path (e.g., pg_wal/<name>) */
+ filename = pathname + (pathlen - XLOG_FNAME_LEN);
+ if (!IsXLogFileName(filename))
+ return false;
+
+ *fname = pnstrdup(filename, XLOG_FNAME_LEN);
+
+ return true;
+}
+
+/*
+ * Helper function for WAL file hash table.
+ */
+static uint32
+hash_string_pointer(const char *s)
+{
+ unsigned char *ss = (unsigned char *) s;
+
+ return hash_bytes(ss, strlen(s));
+}
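Note on the buffer arithmetic in read_archive_wal_page() above: the
comments describe how read_len (total bytes received for the segment) and
the current buffer length together define the LSN range held in memory.
The following is a minimal standalone sketch of that calculation, not the
patch's code. SegWindow and window_copy are hypothetical names invented
for illustration, and plain uint64_t stands in for XLogRecPtr and
XLogSegNo; the only assumption taken from PostgreSQL is that an LSN
equals segno * segsize + offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-segment buffer state in the patch. */
typedef struct SegWindow
{
	uint64_t	segno;		/* WAL segment number */
	uint64_t	segsize;	/* WAL segment size in bytes */
	uint64_t	read_len;	/* total bytes received for this segment,
							 * including bytes already discarded */
	const char *data;		/* bytes still held in the buffer */
	size_t		buf_len;	/* number of bytes in data */
} SegWindow;

/*
 * Copy up to nbytes starting at LSN recptr out of the buffered window.
 * Returns the number of bytes copied, or 0 if recptr is not buffered, in
 * which case the caller would have to fetch more data from the archive.
 */
static size_t
window_copy(const SegWindow *w, uint64_t recptr, char *dst, size_t nbytes)
{
	/* LSN just past the last buffered byte */
	uint64_t	end_ptr = w->segno * w->segsize + w->read_len;
	/* LSN of the first buffered byte */
	uint64_t	start_ptr = end_ptr - w->buf_len;
	size_t		offset;
	size_t		n;

	if (w->buf_len == 0 || recptr < start_ptr || recptr >= end_ptr)
		return 0;

	/* start_ptr <= recptr < end_ptr implies offset < buf_len */
	offset = (size_t) (recptr - start_ptr);
	assert(offset < w->buf_len);

	n = w->buf_len - offset;
	if (n > nbytes)
		n = nbytes;
	memcpy(dst, w->data + offset, n);
	return n;
}
```

A short copy (fewer bytes than requested) corresponds to the case where
the real function loops, fetches another chunk from the archive streamer,
and continues from the advanced position.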
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 633a9874bb5..5296f21b82c 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -1,6 +1,7 @@
# Copyright (c) 2022-2026, PostgreSQL Global Development Group
pg_waldump_sources = files(
+ 'archive_waldump.c',
'compat.c',
'pg_waldump.c',
'rmgrdesc.c',
@@ -18,7 +19,7 @@ endif
pg_waldump = executable('pg_waldump',
pg_waldump_sources,
- dependencies: [frontend_code, lz4, zstd],
+ dependencies: [frontend_code, libpq, lz4, zstd],
c_args: ['-DFRONTEND'], # needed for xlogreader et al
kwargs: default_bin_args,
)
@@ -29,6 +30,7 @@ tests += {
'sd': meson.current_source_dir(),
'bd': meson.current_build_dir(),
'tap': {
+ 'env': {'TAR': tar.found() ? tar.full_path() : ''},
'tests': [
't/001_basic.pl',
't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 5d31b15dbd8..f82507ef696 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -176,7 +176,7 @@ split_path(const char *path, char **dir, char **fname)
*
* return a read only fd
*/
-static int
+int
open_file_in_directory(const char *directory, const char *fname)
{
int fd = -1;
@@ -327,8 +327,8 @@ identify_target_directory(char *directory, char *fname, int *WalSegSz)
}
/*
- * Returns the size in bytes of the data to be read. Returns -1 if the end
- * point has already been reached.
+ * Returns the number of bytes to read for the given page. Returns -1 if
+ * the end of the requested range has already been reached or exceeded.
*/
static inline int
required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -412,7 +412,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
int count = required_read_len(private, targetPagePtr, reqLen);
WALReadError errinfo;
- /* Bail out if the count to be read is not valid */
+ /* Bail out if the end of the requested range has already been reached */
if (count < 0)
return -1;
@@ -440,6 +440,109 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
return count;
}
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+ TimeLineID *tli_p)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback to support dumping
+ * WAL files from tar archives. Segment tracking is handled by
+ * TarWALDumpReadPage, so no action is needed here.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+ /* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+ XLogRecPtr targetPtr, char *readBuff)
+{
+ XLogDumpPrivate *private = state->private_data;
+ int count = required_read_len(private, targetPagePtr, reqLen);
+ int segsize = state->segcxt.ws_segsize;
+ XLogSegNo curSegNo;
+
+ /* Bail out if the end of the requested range has already been reached */
+ if (count < 0)
+ return -1;
+
+ /*
+ * If the target page is in a different segment, release the hash entry
+ * buffer and remove any spilled temporary file for the previous segment.
+ * Since pg_waldump never requests the same WAL bytes twice, moving to a
+ * new segment means the previous segment's data will not be needed again.
+ *
+ * Afterward, check whether the next required WAL segment was already
+ * spilled to the temporary directory before invoking the archive
+ * streamer.
+ */
+ curSegNo = state->seg.ws_segno;
+ if (!XLByteInSeg(targetPagePtr, curSegNo, segsize))
+ {
+ char fname[MAXFNAMELEN];
+ XLogSegNo nextSegNo;
+
+ /*
+ * Calculate the next WAL segment to be decoded from the given page
+ * pointer.
+ */
+ XLByteToSeg(targetPagePtr, nextSegNo, segsize);
+ state->seg.ws_tli = private->timeline;
+ state->seg.ws_segno = nextSegNo;
+
+ /* Close the WAL segment file if it is currently open */
+ if (state->seg.ws_file >= 0)
+ {
+ close(state->seg.ws_file);
+ state->seg.ws_file = -1;
+ }
+
+ /*
+ * If in pre-reading mode (prior to actual decoding), do not delete
+ * any entries that might be requested again once the decoding loop
+ * starts. For more details, see the comments in
+ * read_archive_wal_page().
+ */
+ if (private->decoding_started && curSegNo < nextSegNo)
+ {
+ XLogFileName(fname, state->seg.ws_tli, curSegNo, segsize);
+ free_archive_wal_entry(fname, private);
+ }
+
+ /*
+ * If the next segment exists in the temporary spill directory, open
+ * it and continue reading from there.
+ */
+ if (TmpWalSegDir != NULL)
+ {
+ XLogFileName(fname, state->seg.ws_tli, nextSegNo, segsize);
+ state->seg.ws_file = open_file_in_directory(TmpWalSegDir, fname);
+ }
+ }
+
+ /* Continue reading from the open WAL segment, if any */
+ if (state->seg.ws_file >= 0)
+ return WALDumpReadPage(state, targetPagePtr, count, targetPtr,
+ readBuff);
+
+ /* Otherwise, read the WAL page from the archive streamer */
+ return read_archive_wal_page(private, targetPagePtr, count, readBuff);
+}
+
/*
* Boolean to return whether the given WAL record matches a specific relation
* and optionally block.
@@ -777,8 +880,8 @@ usage(void)
printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n"
" valid names are main, fsm, vm, init\n"));
printf(_(" -n, --limit=N number of records to display\n"));
- printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n"
- " directory with a ./pg_wal that contains such files\n"
+ printf(_(" -p, --path=PATH a tar archive or a directory in which to find WAL segment files or\n"
+ " a directory with a pg_wal subdirectory containing such files\n"
" (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n"
@@ -811,6 +914,7 @@ main(int argc, char **argv)
XLogRecPtr first_record;
char *waldir = NULL;
char *errormsg;
+ pg_compress_algorithm compression = PG_COMPRESSION_NONE;
static struct option long_options[] = {
{"bkp-details", no_argument, NULL, 'b'},
@@ -868,6 +972,10 @@ main(int argc, char **argv)
private.startptr = InvalidXLogRecPtr;
private.endptr = InvalidXLogRecPtr;
private.endptr_reached = false;
+ private.decoding_started = false;
+ private.archive_name = NULL;
+ private.start_segno = 0;
+ private.end_segno = UINT64_MAX;
config.quiet = false;
config.bkp_details = false;
@@ -1109,8 +1217,13 @@ main(int argc, char **argv)
if (waldir != NULL)
{
- /* validate path points to directory */
- if (!verify_directory(waldir))
+ /* Check whether the path looks like a tar archive by its extension */
+ if (parse_tar_compress_algorithm(waldir, &compression))
+ {
+ split_path(waldir, &private.archive_dir, &private.archive_name);
+ }
+ /* Otherwise it must be a directory */
+ else if (!verify_directory(waldir))
{
pg_log_error("could not open directory \"%s\": %m", waldir);
goto bad_argument;
@@ -1128,6 +1241,17 @@ main(int argc, char **argv)
int fd;
XLogSegNo segno;
+ /*
+ * If a tar archive is passed using the --path option, all other
+ * arguments become unnecessary.
+ */
+ if (private.archive_name)
+ {
+ pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")",
+ argv[optind]);
+ goto bad_argument;
+ }
+
split_path(argv[optind], &directory, &fname);
if (waldir == NULL && directory != NULL)
@@ -1138,68 +1262,75 @@ main(int argc, char **argv)
pg_fatal("could not open directory \"%s\": %m", waldir);
}
- waldir = identify_target_directory(waldir, fname, &private.segsize);
- fd = open_file_in_directory(waldir, fname);
- if (fd < 0)
- pg_fatal("could not open file \"%s\"", fname);
- close(fd);
-
- /* parse position from file */
- XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
-
- if (!XLogRecPtrIsValid(private.startptr))
- XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
- else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ if (fname != NULL && parse_tar_compress_algorithm(fname, &compression))
{
- pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.startptr),
- fname);
- goto bad_argument;
+ private.archive_dir = waldir;
+ private.archive_name = fname;
}
-
- /* no second file specified, set end position */
- if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
-
- /* parse ENDSEG if passed */
- if (optind + 1 < argc)
+ else
{
- XLogSegNo endsegno;
-
- /* ignore directory, already have that */
- split_path(argv[optind + 1], &directory, &fname);
-
+ waldir = identify_target_directory(waldir, fname, &private.segsize);
fd = open_file_in_directory(waldir, fname);
if (fd < 0)
pg_fatal("could not open file \"%s\"", fname);
close(fd);
/* parse position from file */
- XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+ XLogFromFileName(fname, &private.timeline, &segno, private.segsize);
- if (endsegno < segno)
- pg_fatal("ENDSEG %s is before STARTSEG %s",
- argv[optind + 1], argv[optind]);
+ if (!XLogRecPtrIsValid(private.startptr))
+ XLogSegNoOffsetToRecPtr(segno, 0, private.segsize, private.startptr);
+ else if (!XLByteInSeg(private.startptr, segno, private.segsize))
+ {
+ pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.startptr),
+ fname);
+ goto bad_argument;
+ }
- if (!XLogRecPtrIsValid(private.endptr))
- XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
- private.endptr);
+ /* no second file specified, set end position */
+ if (!(optind + 1 < argc) && !XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(segno + 1, 0, private.segsize, private.endptr);
- /* set segno to endsegno for check of --end */
- segno = endsegno;
- }
+ /* parse ENDSEG if passed */
+ if (optind + 1 < argc)
+ {
+ XLogSegNo endsegno;
+ /* ignore directory, already have that */
+ split_path(argv[optind + 1], &directory, &fname);
- if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
- private.endptr != (segno + 1) * private.segsize)
- {
- pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
- LSN_FORMAT_ARGS(private.endptr),
- argv[argc - 1]);
- goto bad_argument;
+ fd = open_file_in_directory(waldir, fname);
+ if (fd < 0)
+ pg_fatal("could not open file \"%s\"", fname);
+ close(fd);
+
+ /* parse position from file */
+ XLogFromFileName(fname, &private.timeline, &endsegno, private.segsize);
+
+ if (endsegno < segno)
+ pg_fatal("ENDSEG %s is before STARTSEG %s",
+ argv[optind + 1], argv[optind]);
+
+ if (!XLogRecPtrIsValid(private.endptr))
+ XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.segsize,
+ private.endptr);
+
+ /* set segno to endsegno for check of --end */
+ segno = endsegno;
+ }
+
+ if (!XLByteInSeg(private.endptr, segno, private.segsize) &&
+ private.endptr != (segno + 1) * private.segsize)
+ {
+ pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+ LSN_FORMAT_ARGS(private.endptr),
+ argv[argc - 1]);
+ goto bad_argument;
+ }
}
}
- else
+ else if (!private.archive_name)
waldir = identify_target_directory(waldir, NULL, &private.segsize);
/* we don't know what to print */
@@ -1209,15 +1340,46 @@ main(int argc, char **argv)
goto bad_argument;
}
+ /* --follow is not supported with tar archives */
+ if (config.follow && private.archive_name)
+ {
+ pg_log_error("--follow is not supported when reading from a tar archive");
+ goto bad_argument;
+ }
+
/* done with argument parsing, do the actual work */
/* we have everything we need, start reading */
- xlogreader_state =
- XLogReaderAllocate(private.segsize, waldir,
- XL_ROUTINE(.page_read = WALDumpReadPage,
- .segment_open = WALDumpOpenSegment,
- .segment_close = WALDumpCloseSegment),
- &private);
+ if (private.archive_name)
+ {
+ /*
+ * A NULL directory indicates that the archive file is located in the
+ * current working directory.
+ */
+ if (private.archive_dir == NULL)
+ private.archive_dir = pg_strdup(".");
+
+ /* Set up for reading tar file */
+ init_archive_reader(&private, compression);
+
+ /* Routine to decode WAL files in tar archive */
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, private.archive_dir,
+ XL_ROUTINE(.page_read = TarWALDumpReadPage,
+ .segment_open = TarWALDumpOpenSegment,
+ .segment_close = TarWALDumpCloseSegment),
+ &private);
+ }
+ else
+ {
+ xlogreader_state =
+ XLogReaderAllocate(private.segsize, waldir,
+ XL_ROUTINE(.page_read = WALDumpReadPage,
+ .segment_open = WALDumpOpenSegment,
+ .segment_close = WALDumpCloseSegment),
+ &private);
+ }
+
if (!xlogreader_state)
pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1245,6 +1407,9 @@ main(int argc, char **argv)
if (config.stats == true && !config.quiet)
stats.startptr = first_record;
+ /* Flag indicating that the decoding loop has been entered */
+ private.decoding_started = true;
+
for (;;)
{
if (time_to_stop)
@@ -1326,6 +1491,9 @@ main(int argc, char **argv)
XLogReaderFree(xlogreader_state);
+ if (private.archive_name)
+ free_archive_reader(&private);
+
return EXIT_SUCCESS;
bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 013b051506f..36893624f53 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,14 @@
#define PG_WALDUMP_H
#include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+
+/* Forward declarations */
+struct ArchivedWALFile;
+struct ArchivedWAL_hash;
+
+/* Temporary directory for spilling out-of-order WAL segments from archives */
+extern char *TmpWalSegDir;
/* Contains the necessary information to drive WAL decoding */
typedef struct XLogDumpPrivate
@@ -21,6 +29,49 @@ typedef struct XLogDumpPrivate
XLogRecPtr startptr;
XLogRecPtr endptr;
bool endptr_reached;
+ bool decoding_started;
+
+ /* Fields required to read WAL from archive */
+ char *archive_dir;
+ char *archive_name; /* Tar archive filename */
+ int archive_fd; /* File descriptor for the open tar file */
+
+ astreamer *archive_streamer;
+ char *archive_read_buf; /* Reusable read buffer for archive I/O */
+
+#ifdef USE_ASSERT_CHECKING
+ Size archive_read_buf_size;
+#endif
+
+ /* What the archive streamer is currently reading */
+ struct ArchivedWALFile *cur_file;
+
+ /*
+ * Hash table of WAL segments currently buffered from the archive,
+ * including any segment currently being streamed. Entries are removed
+ * once consumed, so this does not accumulate all segments ever read.
+ */
+ struct ArchivedWAL_hash *archive_wal_htab;
+
+ /*
+ * Pre-computed segment numbers derived from startptr and endptr. Caching
+ * them avoids repeated XLByteToSeg() calls when filtering each archive
+ * member against the requested WAL range. end_segno is initialized to
+ * UINT64_MAX when no end limit is requested.
+ */
+ XLogSegNo start_segno;
+ XLogSegNo end_segno;
} XLogDumpPrivate;
+extern int open_file_in_directory(const char *directory, const char *fname);
+
+extern void init_archive_reader(XLogDumpPrivate *privateInfo,
+ pg_compress_algorithm compression);
+extern void free_archive_reader(XLogDumpPrivate *privateInfo);
+extern int read_archive_wal_page(XLogDumpPrivate *privateInfo,
+ XLogRecPtr targetPagePtr,
+ Size count, char *readBuff);
+extern void free_archive_wal_entry(const char *fname,
+ XLogDumpPrivate *privateInfo);
+
#endif /* PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 5db5d20136f..11df7e092bf 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,9 +3,13 @@
use strict;
use warnings FATAL => 'all';
+use Cwd;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
+use List::Util qw(shuffle);
+
+my $tar = $ENV{TAR};
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -162,6 +166,42 @@ CREATE TABLESPACE ts1 LOCATION '$tblspc_path';
DROP TABLESPACE ts1;
});
+# Test: Decode a continuation record (contrecord) that spans multiple WAL
+# segments.
+#
+# Consume the remaining room in the current WAL segment, leaving only
+# enough space for the start of a largish record.
+$node->safe_psql(
+ 'postgres', q{
+DO $$
+DECLARE
+ wal_segsize int := setting::int FROM pg_settings WHERE name = 'wal_segment_size';
+ remain int;
+ iters int := 0;
+BEGIN
+ LOOP
+ INSERT into t1(b)
+ select repeat(encode(sha256(g::text::bytea), 'hex'), (random() * 15 + 1)::int)
+ from generate_series(1, 10) g;
+
+ remain := wal_segsize - (pg_current_wal_insert_lsn() - '0/0') % wal_segsize;
+ IF remain < 2 * setting::int from pg_settings where name = 'block_size' THEN
+ RAISE log 'exiting after % iterations, % bytes to end of WAL segment', iters, remain;
+ EXIT;
+ END IF;
+ iters := iters + 1;
+ END LOOP;
+END
+$$;
+});
+
+my $contrecord_lsn = $node->safe_psql('postgres',
+ 'SELECT pg_current_wal_insert_lsn()');
+# Generate a record large enough to continue into the next WAL segment
+$node->safe_psql('postgres',
+ qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
+);
+
my ($end_lsn, $end_walfile) = split /\|/,
$node->safe_psql('postgres',
q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
@@ -198,51 +238,23 @@ command_like(
],
qr/./,
'runs with start and end segment specified');
-command_fails_like(
- [ 'pg_waldump', '--path' => $node->data_dir ],
- qr/error: no start WAL location given/,
- 'path option requires start location');
command_like(
[
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
- ],
- qr/./,
- 'runs with path option and start and end locations');
-command_fails_like(
- [
- 'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- ],
- qr/error: error in WAL record at/,
- 'falling off the end of the WAL results in an error');
-
-command_like(
- [
- 'pg_waldump', '--quiet',
- $node->data_dir . '/pg_wal/' . $start_walfile
+ 'pg_waldump', '--quiet', '--path',
+ $node->data_dir . '/pg_wal/', $start_walfile
],
qr/^$/,
'no output with --quiet option');
-command_fails_like(
- [
- 'pg_waldump', '--quiet',
- '--path' => $node->data_dir,
- '--start' => $start_lsn
- ],
- qr/error: error in WAL record at/,
- 'errors are shown with --quiet');
-
# Test for: Display a message that we're skipping data if `from`
# wasn't a pointer to the start of a record.
+sub test_pg_waldump_skip_bytes
{
+ my ($path, $startlsn, $endlsn) = @_;
+
# Construct a new LSN that is one byte past the original
# start_lsn.
- my ($part1, $part2) = split qr{/}, $start_lsn;
+ my ($part1, $part2) = split qr{/}, $startlsn;
my $lsn2 = hex $part2;
$lsn2++;
my $new_start = sprintf("%s/%X", $part1, $lsn2);
@@ -252,7 +264,8 @@ command_fails_like(
my $result = IPC::Run::run [
'pg_waldump',
'--start' => $new_start,
- $node->data_dir . '/pg_wal/' . $start_walfile
+ '--end' => $endlsn,
+ '--path' => $path,
],
'>' => \$stdout,
'2>' => \$stderr;
@@ -266,15 +279,15 @@ command_fails_like(
sub test_pg_waldump
{
local $Test::Builder::Level = $Test::Builder::Level + 1;
- my @opts = @_;
+ my ($path, $startlsn, $endlsn, @opts) = @_;
my ($stdout, $stderr);
my $result = IPC::Run::run [
'pg_waldump',
- '--path' => $node->data_dir,
- '--start' => $start_lsn,
- '--end' => $end_lsn,
+ '--start' => $startlsn,
+ '--end' => $endlsn,
+ '--path' => $path,
@opts
],
'>' => \$stdout,
@@ -286,40 +299,145 @@ sub test_pg_waldump
return @lines;
}
-my @lines;
+# Create a tar archive containing the directory's files in shuffled order
+sub generate_archive
+{
+ my ($archive, $directory, $compression_flags) = @_;
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+ my @files;
+ opendir my $dh, $directory or die "opendir: $!";
+ while (my $entry = readdir $dh) {
+ # Skip '.' and '..'
+ next if $entry eq '.' || $entry eq '..';
+ push @files, $entry;
+ }
+ closedir $dh;
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+ @files = shuffle @files;
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+ # move into the WAL directory before archiving files
+ my $cwd = getcwd;
+ chdir($directory) || die "chdir: $!";
+ command_ok([$tar, $compression_flags, $archive, @files]);
+ chdir($cwd) || die "chdir: $!";
+}
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+my @scenarios = (
+ {
+ 'path' => $node->data_dir,
+ 'is_archive' => 0,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar",
+ 'compression_method' => 'none',
+ 'compression_flags' => '-cf',
+ 'is_archive' => 1,
+ 'enabled' => 1
+ },
+ {
+ 'path' => "$tmp_dir/pg_wal.tar.gz",
+ 'compression_method' => 'gzip',
+ 'compression_flags' => '-czf',
+ 'is_archive' => 1,
+ 'enabled' => check_pg_config("#define HAVE_LIBZ 1")
+ });
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+for my $scenario (@scenarios)
+{
+ my $path = $scenario->{'path'};
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+ SKIP:
+ {
+ skip "tar command is not available", 56
+ if !defined $tar && $scenario->{'is_archive'};
+ skip "$scenario->{'compression_method'} compression not supported by this build", 56
+ if !$scenario->{'enabled'} && $scenario->{'is_archive'};
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
- 0, 'only lines for selected relation');
+ # create pg_wal archive
+ if ($scenario->{'is_archive'})
+ {
+ generate_archive($path,
+ $node->data_dir . '/pg_wal',
+ $scenario->{'compression_flags'});
+ }
-@lines = test_pg_waldump(
- '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
- '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+ command_fails_like(
+ [ 'pg_waldump', '--path' => $path ],
+ qr/error: no start WAL location given/,
+ 'path option requires start location');
+ command_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/./,
+ 'runs with path option and start and end locations');
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $path,
+ '--start' => $start_lsn,
+ ],
+ qr/error: error in WAL record at/,
+ 'falling off the end of the WAL results in an error');
+ command_fails_like(
+ [
+ 'pg_waldump', '--quiet',
+ '--path' => $path,
+ '--start' => $start_lsn
+ ],
+ qr/error: error in WAL record at/,
+ 'errors are shown with --quiet');
+
+ test_pg_waldump_skip_bytes($path, $start_lsn, $end_lsn);
+
+ my @lines = test_pg_waldump($path, $start_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ @lines = test_pg_waldump($path, $contrecord_lsn, $end_lsn);
+ is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+
+ test_pg_waldump_skip_bytes($path, $contrecord_lsn, $end_lsn);
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--limit' => 6);
+ is(@lines, 6, 'limit option observed');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fullpage');
+ is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--stats=record');
+ like($lines[0], qr/WAL statistics/, "statistics on stdout");
+ is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--rmgr' => 'Btree');
+ is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn, '--fork' => 'init');
+ is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+ is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+ 0, 'only lines for selected relation');
+
+ @lines = test_pg_waldump($path, $start_lsn, $end_lsn,
+ '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+ '--block' => 1);
+ is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+ # Cleanup.
+ unlink $path if $scenario->{'is_archive'};
+ }
+}
done_testing();
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a4a2ed07816..3f428a64b47 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -147,6 +147,9 @@ ArchiveOpts
ArchiveShutdownCB
ArchiveStartupCB
ArchiveStreamState
+ArchivedWALFile
+ArchivedWAL_hash
+ArchivedWAL_iterator
ArchiverOutput
ArchiverStage
ArrayAnalyzeExtraData
@@ -3544,6 +3547,7 @@ astreamer_recovery_injector
astreamer_tar_archiver
astreamer_tar_parser
astreamer_verify
+astreamer_waldump
astreamer_zstd_frame
auth_password_hook_typ
autovac_table
--
2.47.1
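
Note on the start_segno/end_segno fields added to XLogDumpPrivate above: the
sketch below illustrates, under stated assumptions, how an archive member name
could be mapped to a segment number and tested against the requested range. It
is not code from the patch. All sketch_* names and the SketchRange struct are
hypothetical; the arithmetic is meant to mirror what XLByteToSeg() and
XLogFromFileName() compute, and UINT64_MAX stands for "no end limit" as in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two cached fields in XLogDumpPrivate. */
typedef struct SketchRange
{
	uint64_t	start_segno;
	uint64_t	end_segno;		/* UINT64_MAX means no upper limit */
} SketchRange;

/* Segment number containing an LSN (same arithmetic as XLByteToSeg). */
static uint64_t
sketch_lsn_to_segno(uint64_t lsn, uint64_t segsize)
{
	assert(segsize > 0);
	return lsn / segsize;
}

/*
 * Parse a 24-hex-digit WAL file name into timeline and segment number.
 * Returns false if the name does not look like a WAL segment.
 */
static bool
sketch_parse_wal_name(const char *fname, uint64_t segsize,
					  uint32_t *tli, uint64_t *segno)
{
	unsigned int t;
	unsigned int log;
	unsigned int seg;

	assert(segsize > 0);

	if (strlen(fname) != 24 ||
		strspn(fname, "0123456789ABCDEF") != 24)
		return false;
	if (sscanf(fname, "%08X%08X%08X", &t, &log, &seg) != 3)
		return false;

	*tli = t;
	/* Each "log" unit covers 4 GB, i.e. 0x100000000 / segsize segments. */
	*segno = (uint64_t) log * (UINT64_C(0x100000000) / segsize) + seg;
	return true;
}

/* True if the segment lies inside the requested range (inclusive). */
static bool
sketch_member_in_range(const SketchRange *range, uint64_t segno)
{
	return segno >= range->start_segno && segno <= range->end_segno;
}
```

With a 16 MB segment size, the name 000000010000000000000003 parses to
timeline 1, segment 3, which a range starting at segment 2 with no end limit
would accept.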
[application/octet-stream] v22-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch (16.7K, 5-v22-0004-pg_verifybackup-Enable-WAL-parsing-for-tar-forma.patch)
From f758c8ae6e97140a9ea329f4e7f6ec8bb6271afd Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Thu, 19 Mar 2026 15:43:53 +0530
Subject: [PATCH v22 4/5] pg_verifybackup: Enable WAL parsing for tar-format
backups
Now that pg_waldump supports reading WAL from tar archives, remove the
restriction that forced --no-parse-wal for tar-format backups.
pg_verifybackup now automatically locates the WAL archive: it looks for
a separate pg_wal.tar first, then falls back to the main base.tar. A
new --wal-path option (replacing the old --wal-directory, which is kept
as a silent alias) accepts either a directory or a tar archive path.
The default WAL directory preparation is deferred until the backup
format is known, since tar-format backups resolve the WAL path
differently from plain-format ones.
Author: Amul Sul <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jakub Wartak <[email protected]>
Reviewed-by: Chao Li <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com
---
doc/src/sgml/ref/pg_verifybackup.sgml | 14 ++-
src/bin/pg_verifybackup/pg_verifybackup.c | 96 ++++++++++++-------
src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -
src/bin/pg_verifybackup/t/003_corruption.pl | 4 +-
src/bin/pg_verifybackup/t/007_wal.pl | 20 +++-
src/bin/pg_verifybackup/t/008_untar.pl | 5 +-
src/bin/pg_verifybackup/t/010_client_untar.pl | 5 +-
7 files changed, 91 insertions(+), 57 deletions(-)
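
Reader's note (not part of the commit): the lookup order described in the
commit message can be sketched as a single function. This is a minimal
illustration under stated assumptions, not the patch code; the function
sketch_choose_wal_path and its parameter names are hypothetical. A NULL result
corresponds to the "WAL archive not found" error path.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the path handed to pg_waldump.
 *
 * user_wal_path     value of -w/--wal-path, or NULL if not given
 * format            'p' for plain, 't' for tar
 * plain_pg_wal_dir  "<backup_directory>/pg_wal" for plain backups
 * wal_archive_path  separate pg_wal.tar* if one was seen, else NULL
 * base_archive_path base.tar* if one was seen, else NULL
 */
static const char *
sketch_choose_wal_path(const char *user_wal_path, char format,
					   const char *plain_pg_wal_dir,
					   const char *wal_archive_path,
					   const char *base_archive_path)
{
	assert(format == 'p' || format == 't');

	/* An explicit option always wins. */
	if (user_wal_path != NULL)
		return user_wal_path;

	/* Plain format: WAL lives in the backup's pg_wal directory. */
	if (format == 'p')
		return plain_pg_wal_dir;

	/* Tar format: prefer a separate WAL archive, else the base archive. */
	if (wal_archive_path != NULL)
		return wal_archive_path;

	return base_archive_path;	/* may be NULL: caller reports an error */
}
```

Deferring this decision until the backup format is known is what allows the
tar case to choose between the two archives.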
diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..1695cfe91c8 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
<literal>backup_manifest</literal> generated by the server at the time
of the backup. The backup may be stored either in the "plain" or the "tar"
format; this includes tar-format backups compressed with any algorithm
- supported by <application>pg_basebackup</application>. However, at present,
- <literal>WAL</literal> verification is supported only for plain-format
- backups. Therefore, if the backup is stored in tar-format, the
- <literal>-n, --no-parse-wal</literal> option should be used.
+ supported by <application>pg_basebackup</application>.
</para>
<para>
@@ -261,12 +258,13 @@ PostgreSQL documentation
<varlistentry>
<term><option>-w <replaceable class="parameter">path</replaceable></option></term>
- <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+ <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
<listitem>
<para>
- Try to parse WAL files stored in the specified directory, rather than
- in <literal>pg_wal</literal>. This may be useful if the backup is
- stored in a separate location from the WAL archive.
+ Try to parse WAL files stored in the specified directory or tar
+ archive, rather than in <literal>pg_wal</literal>. This may be
+ useful if the backup is stored in a separate location from the WAL
+ archive.
</para>
</listitem>
</varlistentry>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31f606c45b1..b60ab8739d5 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
const char *fmt,...)
pg_attribute_printf(2, 3);
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_plain_backup_directory(verifier_context *context,
char *relpath, char *fullpath,
DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
static void verify_control_file(const char *controlpath,
uint64 manifest_system_identifier);
static void precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles);
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path,
+ char **wal_archive_path);
static void verify_tar_file(verifier_context *context, char *relpath,
char *fullpath, astreamer *streamer);
static void report_extra_backup_files(verifier_context *context);
@@ -93,7 +97,7 @@ static void verify_file_checksum(verifier_context *context,
uint8 *buffer);
static void parse_required_wal(verifier_context *context,
char *pg_waldump_path,
- char *wal_directory);
+ char *wal_path);
static astreamer *create_archive_verifier(verifier_context *context,
char *archive_name,
Oid tblspc_oid,
@@ -126,7 +130,8 @@ main(int argc, char **argv)
{"progress", no_argument, NULL, 'P'},
{"quiet", no_argument, NULL, 'q'},
{"skip-checksums", no_argument, NULL, 's'},
- {"wal-directory", required_argument, NULL, 'w'},
+ {"wal-path", required_argument, NULL, 'w'},
+ {"wal-directory", required_argument, NULL, 'w'}, /* deprecated */
{NULL, 0, NULL, 0}
};
@@ -135,7 +140,9 @@ main(int argc, char **argv)
char *manifest_path = NULL;
bool no_parse_wal = false;
bool quiet = false;
- char *wal_directory = NULL;
+ char *wal_path = NULL;
+ char *base_archive_path = NULL;
+ char *wal_archive_path = NULL;
char *pg_waldump_path = NULL;
DIR *dir;
@@ -221,8 +228,8 @@ main(int argc, char **argv)
context.skip_checksums = true;
break;
case 'w':
- wal_directory = pstrdup(optarg);
- canonicalize_path(wal_directory);
+ wal_path = pstrdup(optarg);
+ canonicalize_path(wal_path);
break;
default:
/* getopt_long already emitted a complaint */
@@ -285,10 +292,6 @@ main(int argc, char **argv)
manifest_path = psprintf("%s/backup_manifest",
context.backup_directory);
- /* By default, look for the WAL in the backup directory, too. */
- if (wal_directory == NULL)
- wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
/*
* Try to read the manifest. We treat any errors encountered while parsing
* the manifest as fatal; there doesn't seem to be much point in trying to
@@ -331,17 +334,6 @@ main(int argc, char **argv)
pfree(path);
}
- /*
- * XXX: In the future, we should consider enhancing pg_waldump to read WAL
- * files from an archive.
- */
- if (!no_parse_wal && context.format == 't')
- {
- pg_log_error("pg_waldump cannot read tar files");
- pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
- exit(1);
- }
-
/*
* Perform the appropriate type of verification appropriate based on the
* backup format. This will close 'dir'.
@@ -350,7 +342,7 @@ main(int argc, char **argv)
verify_plain_backup_directory(&context, NULL, context.backup_directory,
dir);
else
- verify_tar_backup(&context, dir);
+ verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
/*
* The "matched" flag should now be set on every entry in the hash table.
@@ -368,12 +360,35 @@ main(int argc, char **argv)
if (context.format == 'p' && !context.skip_checksums)
verify_backup_checksums(&context);
+ /*
+ * By default, WAL files are expected to be found in the backup directory
+ * for plain-format backups. In the case of tar-format backups, if a
+ * separate WAL archive is not found, the WAL files are most likely
+ * included within the main data directory archive.
+ */
+ if (wal_path == NULL)
+ {
+ if (context.format == 'p')
+ wal_path = psprintf("%s/pg_wal", context.backup_directory);
+ else if (wal_archive_path)
+ wal_path = wal_archive_path;
+ else if (base_archive_path)
+ wal_path = base_archive_path;
+ else
+ {
+ pg_log_error("WAL archive not found");
+ pg_log_error_hint("Specify the WAL location with -w/--wal-path, "
+ "or use -n/--no-parse-wal to skip WAL parsing.");
+ exit(1);
+ }
+ }
+
/*
* Try to parse the required ranges of WAL records, unless we were told
* not to do so.
*/
if (!no_parse_wal)
- parse_required_wal(&context, pg_waldump_path, wal_directory);
+ parse_required_wal(&context, pg_waldump_path, wal_path);
/*
* If everything looks OK, tell the user this, unless we were asked to
@@ -787,7 +802,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
* close when we're done with it.
*/
static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+ char **wal_archive_path)
{
struct dirent *dirent;
SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +832,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
char *fullpath;
fullpath = psprintf("%s/%s", context->backup_directory, filename);
- precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+ precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+ base_archive_path, wal_archive_path);
pfree(fullpath);
}
}
@@ -875,17 +892,21 @@ verify_tar_backup(verifier_context *context, DIR *dir)
*
* The arguments to this function are mostly the same as the
* verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
*/
static void
precheck_tar_backup_file(verifier_context *context, char *relpath,
- char *fullpath, SimplePtrList *tarfiles)
+ char *fullpath, SimplePtrList *tarfiles,
+ char **base_archive_path, char **wal_archive_path)
{
struct stat sb;
Oid tblspc_oid = InvalidOid;
pg_compress_algorithm compress_algorithm;
tar_file *tar;
char *suffix = NULL;
+ bool is_base_archive = false;
+ bool is_wal_archive = false;
/* Should be tar format backup */
Assert(context->format == 't');
@@ -918,9 +939,15 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* extension such as .gz, .lz4, or .zst.
*/
if (strncmp("base", relpath, 4) == 0)
+ {
suffix = relpath + 4;
+ is_base_archive = true;
+ }
else if (strncmp("pg_wal", relpath, 6) == 0)
+ {
suffix = relpath + 6;
+ is_wal_archive = true;
+ }
else
{
/* Expected a <tablespaceoid>.tar file here. */
@@ -953,8 +980,13 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
* Ignore WALs, as reading and verification will be handled through
* pg_waldump.
*/
- if (strncmp("pg_wal", relpath, 6) == 0)
+ if (is_wal_archive)
+ {
+ *wal_archive_path = pstrdup(fullpath);
return;
+ }
+ else if (is_base_archive)
+ *base_archive_path = pstrdup(fullpath);
/*
* Append the information to the list for complete verification at a later
@@ -1188,7 +1220,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
*/
static void
parse_required_wal(verifier_context *context, char *pg_waldump_path,
- char *wal_directory)
+ char *wal_path)
{
manifest_data *manifest = context->manifest;
manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1198,7 +1230,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
char *pg_waldump_cmd;
pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
- pg_waldump_path, wal_directory, this_wal_range->tli,
+ pg_waldump_path, wal_path, this_wal_range->tli,
LSN_FORMAT_ARGS(this_wal_range->start_lsn),
LSN_FORMAT_ARGS(this_wal_range->end_lsn));
fflush(NULL);
@@ -1366,7 +1398,7 @@ usage(void)
printf(_(" -P, --progress show progress information\n"));
printf(_(" -q, --quiet do not print any output, except for errors\n"));
printf(_(" -s, --skip-checksums skip checksum verification\n"));
- printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n"));
+ printf(_(" -w, --wal-path=PATH use specified path for WAL files\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -?, --help show this help, then exit\n"));
printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index 0556191ec9d..edc515d5904 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
{
# Add switch to get a tar-format backup
push @backup, ('--format' => 'tar');
-
- # Add switch to skip WAL verification, which is not yet supported for
- # tar-format backups
- push @verify, ('--no-parse-wal');
}
# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index b1d65b8aa0f..882d75d9dc2 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
chdir($cwd) || die "chdir: $!";
- # Now check that the backup no longer verifies. We must use -n
- # here, because pg_waldump can't yet read WAL from a tarfile.
command_fails_like(
- [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+ [ 'pg_verifybackup', $tar_backup_path ],
$scenario->{'fails_like'},
"corrupt backup fails verification: $name");
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index 79087a1f6be..0e0377bfacc 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
command_ok(
[
'pg_verifybackup',
- '--wal-directory' => $relocated_pg_wal,
+ '--wal-path' => $relocated_pg_wal,
$backup_path
],
- '--wal-directory can be used to specify WAL directory');
+ '--wal-path can be used to specify WAL directory');
# Move directory back to original location.
rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
@@ -90,4 +90,20 @@ command_ok(
[ 'pg_verifybackup', $backup_path2 ],
'valid base backup with timeline > 1');
+# Test WAL verification for a tar-format backup with a separate pg_wal.tar,
+# as produced by pg_basebackup --format=tar --wal-method=stream.
+my $backup_path3 = $primary->backup_dir . '/test_tar_wal';
+$primary->command_ok(
+ [
+ 'pg_basebackup',
+ '--pgdata' => $backup_path3,
+ '--no-sync',
+ '--format' => 'tar',
+ '--checkpoint' => 'fast'
+ ],
+ "tar backup with separate pg_wal.tar");
+command_ok(
+ [ 'pg_verifybackup', $backup_path3 ],
+ 'WAL verification succeeds with separate pg_wal.tar');
+
done_testing();
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index ae67ae85a31..161c08c190d 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -47,7 +47,6 @@ my $tsoid = $primary->safe_psql(
SELECT oid FROM pg_tablespace WHERE spcname = 'regress_ts1'));
my $backup_path = $primary->backup_dir . '/server-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -123,14 +122,12 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
rmtree($backup_path);
- rmtree($extract_path);
}
}
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index 1ac7b5db75a..9670fbe4fda 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -32,7 +32,6 @@ print $jf $junk_data;
close $jf;
my $backup_path = $primary->backup_dir . '/client-backup';
-my $extract_path = $primary->backup_dir . '/extracted-backup';
my @test_configuration = (
{
@@ -137,13 +136,11 @@ for my $tc (@test_configuration)
# Verify tar backup.
$primary->command_ok(
[
- 'pg_verifybackup', '--no-parse-wal',
- '--exit-on-error', $backup_path,
+ 'pg_verifybackup', '--exit-on-error', $backup_path,
],
"verify backup, compression $method");
# Cleanup.
- rmtree($extract_path);
rmtree($backup_path);
}
}
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-20 19:33 Andrew Dunstan <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-20 19:33 UTC (permalink / raw)
To: Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; +Cc: Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-03-20 Fr 9:26 AM, Amul Sul wrote:
> On Fri, Mar 20, 2026 at 5:01 PM Amul Sul <[email protected]> wrote:
>> On Fri, Mar 20, 2026 at 2:18 AM Zsolt Parragi <[email protected]> wrote:
>>> Hello!
>>>
>>> Path is ignored with a positional argument, I think this is a bug?
>>>
>>> This fails:
>>>
>>> pg_waldump --path /wal/dir 000000010000000000000001
>>>
>>> And this works:
>>>
>>> pg_waldump --path /wal/dir --start 0/01000028 --end 0/010020F8
>>>
>> Good catch! I've fixed this in the attached version and updated a test
>> case to cover this scenario.
>>
>>> +{
>>> + int fname_len = strlen(fname);
>>> +
>>>
>>> Shouldn't this use size_t?
>>>
>> Okay, that can be used. I’ve done the same in the attached version.
>>
>>> + /*
>>> + * Setup temporary directory to store WAL segments and set up an exit
>>> + * callback to remove it upon completion.
>>> + */
>>> + setup_tmpwal_dir(waldir);
>>>
>>> Maybe this could be deferred to be created only on first use? If I
>>> understand correctly, in a typical scenario waldump won't use this
>>> temporary directory, yet it always creates it.
>> Yeah, that optimization can be done, but passing the waldir -- which
>> is only used once -- to the point where the first temp file is created
>> would require quite a bit of code refactoring that doesn't seem to
>> offer much gain, IMO.
>>
> Since Andrew also leans toward creating the directory only when
> needed, I have reconsidered the approach. I think we can pass waldir
> (the archive directory) via XLogDumpPrivate, and I’ve implemented that
> in the attached version.
>
Thanks, committed with very minor tweaks.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-21 06:19 Amul Sul <[email protected]>
parent: Andrew Dunstan <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Amul Sul @ 2026-03-21 06:19 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Mar 21, 2026 at 9:19 AM Tom Lane <[email protected]> wrote:
>
> Andrew Dunstan <[email protected]> writes:
> > Thanks, committed with very minor tweaks.
>
> Buildfarm members batta and hachi don't like this very much.
> They fail the pg_verifybackup tests like so:
>
> # Running: pg_verifybackup --exit-on-error /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup
> pg_waldump: error: could not find WAL in archive "base.tar.zst"
> pg_verifybackup: error: WAL parsing failed for timeline 1
>
> Only the zstd-compression case fails. I've spent several hours trying
> to reproduce this, without any luck, although I can get a similar
> failure in only the gzip case if I build with --with-wal-blocksize=64.
> I do not have an explanation for the seeming cross-platform
> difference. However after adding a lot of debug tracing, I believe
> I see the bug, or at least a related bug. This bit in
> archive_waldump.c's init_archive_reader is where the error comes from:
>
> /*
> * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
> * the first WAL segment in the archive so we can extract the WAL segment
> * size from the long page header.
> */
> while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
> {
> if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
> pg_fatal("could not find WAL in archive \"%s\"",
> privateInfo->archive_name);
>
> entry = privateInfo->cur_file;
> }
>
> That looks plausible but is in fact utterly broken when there's not a
> lot of WAL data in the archive, as there is not in this test case.
> There are at least two problems:
>
Thanks for the detailed debugging. I noticed the failure this morning
and had started investigating the issue, but in the meantime, I got
your helpful reply, which saved me a bunch of time and energy.
> 1. read_archive_file reads some data from the source WAL archive and
> shoves it into the astreamer decompression pipeline. However, once it
> runs out of source data, it just returns zero and we fail immediately.
> This does not account for the possibility --- nay, certainty --- that
> there is data queued inside the decompression pipeline. So this
> doesn't work if the data we need has been compressed into less than
> XLOG_BLCKSZ worth of compressed data. (I suppose that the seeming
> cross-platform differences have to do with the effectiveness of the
> compression algorithm, but I don't really understand why it'd not be
> the same everywhere.) We need to do astreamer_finalize once we run
> out of source data. I think the cleanest place to handle that would
> be inside read_archive_file, but its return convention will need some
> rework if we want to put it there (because rc == 0 shouldn't cause an
> immediate failure if we were able to finalize some more data). As an
> ugly experiment I put an astreamer_finalize call into the rc == 0 path
> of the above loop, but it still didn't work, because:
>
> 2. If the decompression pipeline reaches the end of the WAL file that
> we want, the ASTREAMER_MEMBER_TRAILER case in
> astreamer_waldump_content instantly resets privateInfo->cur_file to
> NULL. Then the loop in init_archive_reader cannot exit successfully,
> and it will just read till the end of the archive and fail.
>
> I see that of the three callers of read_archive_file, only
> get_archive_wal_entry is aware of this possibility; but
> init_archive_reader certainly needs to deal with it and I bet
> read_archive_wal_page does too. Moreover, get_archive_wal_entry's
> solution looks to me like a fragile kluge that probably doesn't work
> reliably either, the reason being that privateInfo->cur_file can
> change multiple times during a single call to read_archive_file,
> if the WAL data has been compressed sufficiently. That whole API
> seems to need some rethinking, not to mention better documentation
> than the zero it has now.
>
I agree; init_archive_reader needs that handling, but
read_archive_wal_page doesn't need any fix. Since it only deals with
the current entry and already holds a reference to it, there is no
need to fetch it from the hash table again.
init_archive_reader has to scan the hash table because it doesn't
already have the specific WAL filename it is looking for, unlike
get_archive_wal_entry. Please have a look at the attached patch, which
tries to fix that.
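The hash-table scan described above can be sketched in isolation. This is
only an illustration under stated assumptions: WalEntry, find_usable_entry
and the plain array are hypothetical stand-ins for ArchivedWALFile and the
ArchivedWAL_* iterator, and min_len plays the role of
sizeof(XLogLongPageHeader). One detail to keep in mind is that an iteration
variable is NULL once a scan runs to completion, so a too-short entry has to
be remembered separately if it is to be reported.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an archived WAL entry (not the real struct). */
typedef struct WalEntry
{
	const char *fname;		/* WAL segment file name */
	size_t		read_len;	/* bytes read from the archive so far */
} WalEntry;

/*
 * Return the first entry holding at least min_len bytes, or NULL if there
 * is none.  When NULL is returned, *short_entry points to an entry that
 * exists but is too short, or is NULL if there were no entries at all, so
 * the caller can tell "truncated WAL" apart from "no WAL found".
 */
const WalEntry *
find_usable_entry(const WalEntry *entries, size_t n, size_t min_len,
				  const WalEntry **short_entry)
{
	assert(short_entry != NULL);
	*short_entry = NULL;

	for (size_t i = 0; i < n; i++)
	{
		if (entries[i].read_len >= min_len)
			return &entries[i];

		/* Remember a short entry in case it needs to be reported. */
		*short_entry = &entries[i];
	}

	return NULL;
}
```

A caller would report a "read X of Y" style error when the function returns
NULL with *short_entry set, and the generic "could not find WAL" error when
both are NULL.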
> While I'm bitching: this error message "could not find WAL in archive
> \"%s\"" seems to me to be completely misleading and off-point.
>
I tried to improve that in the attached version.
regards,
Amul
Attachments:
[application/x-patch] 0001-pg_waldump-buildfarm-fix.patch (3.1K, 2-0001-pg_waldump-buildfarm-fix.patch)
From bde3fb4e3125eed740b5d949a990b4e06d01499a Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 11:22:50 +0530
Subject: [PATCH] pg_waldump: Handle archive exhaustion in
init_archive_reader().
When read_archive_file() returns 0, the archive may have already
buffered a complete WAL file into the hash table before exhausting
the input. Instead of immediately reporting an error, search the
hash table for an entry containing at least sizeof(XLogLongPageHeader)
bytes. Report a specific error if a WAL entry exists but is too
short (truncated/corrupt), or a generic error if no WAL was found
at all.
Also tighten the loop condition to check for sizeof(XLogLongPageHeader)
rather than XLOG_BLCKSZ, since only the long page header is needed
at this stage.
---
src/bin/pg_waldump/archive_waldump.c | 51 +++++++++++++++++++++++++---
1 file changed, 47 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..5bd1faf3d95 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -176,13 +176,56 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
* the first WAL segment in the archive so we can extract the WAL segment
* size from the long page header.
*/
- while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ while (entry == NULL || entry->read_len < sizeof(XLogLongPageHeader))
{
if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
- pg_fatal("could not find WAL in archive \"%s\"",
- privateInfo->archive_name);
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *e = NULL;
- entry = privateInfo->cur_file;
+ entry = NULL;
+
+ /*
+ * read_archive_file() returned 0, meaning the archive is
+ * exhausted. However, a sufficiently compressed archive may have
+ * already read a complete WAL file and inserted it into the hash
+ * table before returning. Search the hash table for any entry
+ * that already has enough buffered data to contain the long page
+ * header; if none is found, the archive contains no usable WAL.
+ */
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((e = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (e->read_len >= sizeof(XLogLongPageHeader))
+ {
+ entry = e;
+ break;
+ }
+ }
+
+ if (entry == NULL)
+ {
+ /*
+ * A WAL file was found in the hash table but it does not
+ * contain enough data to read the long page header,
+ * indicating a truncated or corrupt WAL segment.
+ */
+ if (e != NULL)
+ pg_fatal("could not read file \"%s\" from \"%s\" archive: read %d of %d",
+ e->fname, privateInfo->archive_name, e->read_len,
+ (int) sizeof(XLogLongPageHeader));
+
+ /*
+ * The hash table contains no WAL entries at all, meaning the
+ * archive holds no WAL data.
+ */
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+ }
+ }
+ else
+ entry = privateInfo->cur_file;
}
/* Extract the WAL segment size from the long page header */
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-21 06:23 Michael Paquier <[email protected]>
parent: Andrew Dunstan <[email protected]>
1 sibling, 2 replies; 85+ messages in thread
From: Michael Paquier @ 2026-03-21 06:23 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Mar 20, 2026 at 11:49:02PM -0400, Tom Lane wrote:
> Andrew Dunstan <[email protected]> writes:
> > Thanks, committed with very minor tweaks.
>
> Buildfarm members batta and hachi don't like this very much.
> They fail the pg_verifybackup tests like so:
>
> # Running: pg_verifybackup --exit-on-error /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup
> pg_waldump: error: could not find WAL in archive "base.tar.zst"
> pg_verifybackup: error: WAL parsing failed for timeline 1
I did not look at what's happening on the host, but it seems like a
safe bet to assume that we are not seeing many failures in the
buildfarm because we don't have many animals that have the idea to add
--with-zstd to their build configuration, like these two ones.
--
Michael
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-21 15:35 Amul Sul <[email protected]>
parent: Michael Paquier <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Amul Sul @ 2026-03-21 15:35 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Tom Lane <[email protected]>; Michael Paquier <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Mar 21, 2026 at 5:51 PM Andrew Dunstan <[email protected]> wrote:
>
>
> On 2026-03-21 Sa 2:34 AM, Tom Lane wrote:
>
> Michael Paquier <[email protected]> writes:
>
> On Fri, Mar 20, 2026 at 11:49:02PM -0400, Tom Lane wrote:
>
> Buildfarm members batta and hachi don't like this very much.
>
> I did not look at what's happening on the host, but it seems like a
> safe bet to assume that we are not seeing many failures in the
> buildfarm because we don't have many animals that have the idea to add
> --with-zstd to their build configuration, like these two ones.
>
> That may be part of the story, but only part. I spent a good deal of
> time trying to reproduce batta & hachi's configurations locally, on
> several different platforms, but still couldn't duplicate what they
> are showing.
>
>
>
>
>
> Yeah, I haven't been able to reproduce it either. But while investigating I found a couple of issues. We neglected to add one of the tests to meson.build, and we neglected to close some files, causing errors on windows.
>
While the proposed fix of closing the file pointer before returning is
correct, we also need to ensure the file is reopened in the next call
to spill any remaining buffered data. I’ve made a small update to
Andrew's 0001 patch to handle this. Also, changes to meson.build don't
seem to be needed as we haven't committed that file yet (unless I am
missing something).
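A minimal sketch of the close-and-reopen idea, assuming a hypothetical helper
spill_chunk rather than the actual prepare_tmp_write(): each call opens the
spill file in binary append mode, writes one chunk, and closes the handle
again, so no handle stays open between calls and later data lands after
earlier data. The plain C mode string "ab" stands in for PG_BINARY_A here,
and spill_file_size is likewise only an illustrative helper.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: append len bytes to the spill file at path.
 * Returns 0 on success, -1 on failure.
 */
int
spill_chunk(const char *path, const void *data, size_t len)
{
	FILE	   *fp;

	assert(path != NULL && data != NULL);

	/* Append mode keeps whatever an earlier call already wrote. */
	fp = fopen(path, "ab");
	if (fp == NULL)
		return -1;

	if (fwrite(data, 1, len, fp) != len)
	{
		fclose(fp);
		return -1;
	}

	/* fclose() flushes buffered data and can itself fail. */
	return (fclose(fp) == 0) ? 0 : -1;
}

/* Return the size in bytes of the file at path, or -1 on error. */
long
spill_file_size(const char *path)
{
	FILE	   *fp = fopen(path, "rb");
	long		size;

	if (fp == NULL)
		return -1;
	if (fseek(fp, 0, SEEK_END) != 0)
	{
		fclose(fp);
		return -1;
	}
	size = ftell(fp);
	fclose(fp);
	return size;
}
```

Opening with "wb" on the second call would instead truncate the file and
lose the first chunk, which is why the mode matters once the handle is
closed between calls.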
I’ve also reattached the other patches so they don't get lost: v2-0002
is Andrew's patch for the archive streamer, and v2-0003 is the patch I
posted previously [1].
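For readers skimming the archive streamer patch: as far as the diff shows, it
forwards only the bytes actually produced when flushing the output buffer at
finalize time, rather than the buffer's full capacity. A sketch of that idea
follows; OutBuf, Sink, sink_content and flush_pending are hypothetical names
used for illustration, not the astreamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decompressor output buffer (not the astreamer types). */
typedef struct OutBuf
{
	char		data[64];	/* storage; sizeof(data) is the capacity */
	size_t		used;		/* bytes actually produced so far */
} OutBuf;

/* Hypothetical downstream consumer that accumulates what it is given. */
typedef struct Sink
{
	char		data[256];
	size_t		len;
} Sink;

void
sink_content(Sink *sink, const char *data, size_t len)
{
	assert(sink->len + len <= sizeof(sink->data));
	memcpy(sink->data + sink->len, data, len);
	sink->len += len;
}

/*
 * Flush pending output at end of stream.  Only buf->used bytes are valid;
 * passing sizeof(buf->data) instead would hand stale bytes downstream.
 */
void
flush_pending(OutBuf *buf, Sink *sink)
{
	if (buf->used > 0)
	{
		sink_content(sink, buf->data, buf->used);
		buf->used = 0;
	}
}
```

The guard on buf->used also means that a finalize call with nothing pending
forwards nothing at all.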
Regards,
Amul
[1] http://postgr.es/m/CAAJ_b95L5J7bjRNDjRj6WgqFcQeaBD+JX3sAuxPA4uopqEThxA@mail.gmail.com
Attachments:
[application/x-patch] v2-0001-Fix-pg_waldump-archive-reader-file-handle-leak-an.patch (1.8K, 2-v2-0001-Fix-pg_waldump-archive-reader-file-handle-leak-an.patch)
From 322fd5b96e9739937c587460b2780308705f5a83 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 20:34:15 +0530
Subject: [PATCH v2 1/3] Fix-pg_waldump-archive-reader-file-handle-leak-and-r
---
src/bin/pg_waldump/archive_waldump.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..1e9ae637940 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -474,7 +474,16 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
if (entry != NULL)
+ {
+ /*
+ * Found the target segment. Close any open spill file handle to
+ * avoid a leak; any remaining data for that segment will be
+ * written when the file is reopened in a subsequent call.
+ */
+ if (write_fp != NULL)
+ fclose(write_fp);
return entry;
+ }
/*
* Capture the current entry before calling read_archive_file(),
@@ -508,8 +517,8 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
*/
Assert(strcmp(fname, entry->fname) != 0);
- /* Create a temporary file if one does not already exist */
- if (!entry->spilled)
+ /* Open a spill file for this segment if we haven't already */
+ if (!write_fp)
{
write_fp = prepare_tmp_write(entry->fname, privateInfo);
entry->spilled = true;
@@ -631,7 +640,7 @@ prepare_tmp_write(const char *fname, XLogDumpPrivate *privateInfo)
snprintf(fpath, MAXPGPATH, "%s/%s", TmpWalSegDir, fname);
/* Open the spill file for writing */
- file = fopen(fpath, PG_BINARY_W);
+ file = fopen(fpath, PG_BINARY_A);
if (file == NULL)
pg_fatal("could not create file \"%s\": %m", fpath);
--
2.47.1
[application/x-patch] v2-0002-Fix-astreamer-decompressor-finalize-to-send-corre.patch (2.5K, 3-v2-0002-Fix-astreamer-decompressor-finalize-to-send-corre.patch)
From 40e613592ab819c1b8346afe435babf0b212b9ef Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 19:48:48 +0530
Subject: [PATCH v2 2/3] Fix-astreamer-decompressor-finalize-to-send-correct
---
src/fe_utils/astreamer_gzip.c | 9 +++++----
src/fe_utils/astreamer_lz4.c | 9 +++++----
src/fe_utils/astreamer_zstd.c | 2 +-
3 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index 2e080c37a58..df392f67cab 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -347,10 +347,11 @@ astreamer_gzip_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index 2bc32b42879..605c188007b 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -397,10 +397,11 @@ astreamer_lz4_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_zstd.c b/src/fe_utils/astreamer_zstd.c
index f26abcfd0fa..4b43ab795e3 100644
--- a/src/fe_utils/astreamer_zstd.c
+++ b/src/fe_utils/astreamer_zstd.c
@@ -347,7 +347,7 @@ astreamer_zstd_decompressor_finalize(astreamer *streamer)
if (mystreamer->zstd_outBuf.pos > 0)
astreamer_content(mystreamer->base.bbs_next, NULL,
mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
+ mystreamer->zstd_outBuf.pos,
ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
--
2.47.1
[application/x-patch] v2-0003-pg_waldump-Handle-archive-exhaustion-in-init_arch.patch (3.3K, 4-v2-0003-pg_waldump-Handle-archive-exhaustion-in-init_arch.patch)
From a62de1b7b467a037651a2e1bb3820a390227ce78 Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 20:57:23 +0530
Subject: [PATCH v2 3/3] pg_waldump: Handle archive exhaustion in
init_archive_reader().
When read_archive_file() returns 0, the archive may have already
buffered a complete WAL file into the hash table before exhausting
the input. Instead of immediately reporting an error, search the
hash table for an entry containing at least sizeof(XLogLongPageHeader)
bytes. Report a specific error if a WAL entry exists but is too
short (truncated/corrupt), or a generic error if no WAL was found
at all.
Also tighten the loop condition to check for sizeof(XLogLongPageHeader)
rather than XLOG_BLCKSZ, since only the long page header is needed
at this stage.
---
src/bin/pg_waldump/archive_waldump.c | 55 ++++++++++++++++++++++++++--
1 file changed, 51 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 1e9ae637940..dbc1751fb3c 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -176,13 +176,60 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
* the first WAL segment in the archive so we can extract the WAL segment
* size from the long page header.
*/
- while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ while (entry == NULL || entry->read_len < sizeof(XLogLongPageHeader))
{
if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
- pg_fatal("could not find WAL in archive \"%s\"",
- privateInfo->archive_name);
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *e = NULL;
+ ArchivedWALFile *short_entry = NULL;
- entry = privateInfo->cur_file;
+ entry = NULL;
+
+ /*
+ * read_archive_file() returned 0, meaning the archive is
+ * exhausted. However, a sufficiently compressed archive may have
+ * already read a complete WAL file and inserted it into the hash
+ * table before returning. Search the hash table for any entry
+ * that already has enough buffered data to contain the long page
+ * header; if none is found, the archive contains no usable WAL.
+ */
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((e = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (e->read_len >= sizeof(XLogLongPageHeader))
+ {
+ entry = e;
+ break;
+ }
+ /* Remember a short entry in case we need to report it */
+ short_entry = e;
+ }
+
+ if (entry == NULL)
+ {
+ /*
+ * A WAL file was found in the hash table but it does not
+ * contain enough data to read the long page header,
+ * indicating a truncated or corrupt WAL segment.
+ */
+ if (short_entry != NULL)
+ pg_fatal("could not read file \"%s\" from \"%s\" archive: read %d of %d",
+ short_entry->fname, privateInfo->archive_name,
+ short_entry->read_len,
+ (int) sizeof(XLogLongPageHeader));
+
+ /*
+ * The hash table contains no WAL entries at all, meaning the
+ * archive holds no WAL data.
+ */
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+ }
+ }
+ else
+ entry = privateInfo->cur_file;
}
/* Extract the WAL segment size from the long page header */
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-21 17:26 Amul Sul <[email protected]>
parent: Amul Sul <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Amul Sul @ 2026-03-21 17:26 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Tom Lane <[email protected]>; Michael Paquier <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Mar 21, 2026 at 9:05 PM Amul Sul <[email protected]> wrote:
>
> On Sat, Mar 21, 2026 at 5:51 PM Andrew Dunstan <[email protected]> wrote:
> >
> >
> > On 2026-03-21 Sa 2:34 AM, Tom Lane wrote:
> >
> > Michael Paquier <[email protected]> writes:
> >
> > On Fri, Mar 20, 2026 at 11:49:02PM -0400, Tom Lane wrote:
> >
> > Buildfarm members batta and hachi don't like this very much.
> >
> > I did not look at what's happening on the host, but it seems like a
> > safe bet to assume that we are not seeing many failures in the
> > buildfarm because we don't have many animals that have the idea to add
> > --with-zstd to their build configuration, like these two ones.
> >
> > That may be part of the story, but only part. I spent a good deal of
> > time trying to reproduce batta & hachi's configurations locally, on
> > several different platforms, but still couldn't duplicate what they
> > are showing.
> >
> >
> >
> >
> >
> > Yeah, I haven't been able to reproduce it either. But while investigating I found a couple of issues. We neglected to add one of the tests to meson.build, and we neglected to close some files, causing errors on windows.
> >
>
> While the proposed fix of closing the file pointer before returning is
> correct, we also need to ensure the file is reopened in the next call
> to spill any remaining buffered data. I’ve made a small update to
> Andrew's 0001 patch to handle this. Also, changes to meson.build don't
> seem to be needed as we haven't committed that file yet (unless I am
> missing something).
>
> I’ve also reattached the other patches so they don't get lost: v2-0002
> is Andrew's patch for the archive streamer, and v2-0003 is the patch I
> posted previously [1].
>
>
On further thought, I don't think v2-0001 is the right patch. Consider
the case where a temporary file has been only partially written: if the
next segment required for decoding is that same segment,
TarWALDumpReadPage() will find the physical file present and continue
decoding, potentially triggering an error later because the file is
shorter than expected.
I have attached the v3-0001 patch, which ensures that once we start
writing a temporary file, we finish writing it before performing the
lookup, so that no partial file is left on disk.
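To illustrate the invariant, here is a simplified model rather than the real
get_archive_wal_entry(): the archive is an array of chunks tagged with a
segment number, and skip_to_segment (a hypothetical name, as are Chunk and
Stream) consumes every chunk of a segment it has started spilling before it
checks for the requested segment again. A spilled segment is therefore
always complete by the time any lookup happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 8

/* Hypothetical chunk of decompressed archive data for one segment. */
typedef struct Chunk
{
	int			seg;		/* segment number, 0 .. MAX_SEGS - 1 */
	size_t		len;		/* bytes in this chunk */
} Chunk;

/* Hypothetical stream state: chunks arrive in archive order. */
typedef struct Stream
{
	const Chunk *chunks;
	size_t		nchunks;
	size_t		next;				/* index of the next unread chunk */
	size_t		spilled[MAX_SEGS];	/* bytes spilled per segment */
} Stream;

/*
 * Advance the stream until the first chunk of segment "target" is next.
 * Segments met on the way are spilled in full: the target check happens
 * only at segment boundaries, never in the middle of a spill.
 * Returns true if target was reached, false if the stream ran out.
 */
bool
skip_to_segment(Stream *s, int target)
{
	while (s->next < s->nchunks)
	{
		int			cur = s->chunks[s->next].seg;

		if (cur == target)
			return true;

		assert(cur >= 0 && cur < MAX_SEGS);

		/* Finish spilling the whole segment before looking again. */
		while (s->next < s->nchunks && s->chunks[s->next].seg == cur)
		{
			s->spilled[cur] += s->chunks[s->next].len;
			s->next++;
		}
	}

	return false;
}
```

In this model the inner loop corresponds to the "goto read_more" path, and
the return to the outer loop corresponds to closing the write handle once
the streamer has moved on to a different file.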
Updated patches are attached; 0002 and 0003 remain the same as before.
Regards,
Amul
Attachments:
[application/x-patch] v3-0001-archive_waldump-skip-hash-lookup-and-tighten-writ.patch (1.8K, 2-v3-0001-archive_waldump-skip-hash-lookup-and-tighten-writ.patch)
From b3bfdac9e425f4cb9fd7d7b6c698dd1607b737ee Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 22:27:22 +0530
Subject: [PATCH v3 1/3] archive_waldump: skip hash lookup and tighten write_fp
invariant
In get_archive_wal_entry(), when the streamer is still mid-segment
(entry == cur_file), jump directly to read_more instead of looping back
to the top and performing a hash table lookup that is guaranteed to fail.
---
src/bin/pg_waldump/archive_waldump.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..ee292b6dc8d 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -484,6 +484,7 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
*/
entry = privateInfo->cur_file;
+read_more:
/*
* Fetch more data either when no current file is being tracked or
* when its buffer has been fully flushed to the temporary file.
@@ -525,11 +526,20 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* file handle so data is flushed to disk before the next segment
* starts writing to a different handle.
*/
- if (entry != privateInfo->cur_file && write_fp != NULL)
+ if (entry != privateInfo->cur_file)
{
+ Assert(write_fp);
fclose(write_fp);
write_fp = NULL;
}
+ else
+ /*
+ * The file being written hasn't been completed. We must finish
+ * extracting it before performing the hash lookup; otherwise, the
+ * lookup might return without flushing the current segment buffer,
+ * leaving the file open and incomplete on disk.
+ */
+ goto read_more;
}
/* Requested WAL segment not found */
--
2.47.1
[application/x-patch] v3-0002-Fix-astreamer-decompressor-finalize-to-send-corre.patch (2.5K, 3-v3-0002-Fix-astreamer-decompressor-finalize-to-send-corre.patch)
From 3a52d70947f7c7bc3a0decbd473e95891ad3b6eb Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 19:48:48 +0530
Subject: [PATCH v3 2/3] Fix-astreamer-decompressor-finalize-to-send-correct
---
src/fe_utils/astreamer_gzip.c | 9 +++++----
src/fe_utils/astreamer_lz4.c | 9 +++++----
src/fe_utils/astreamer_zstd.c | 2 +-
3 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index 2e080c37a58..df392f67cab 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -347,10 +347,11 @@ astreamer_gzip_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index 2bc32b42879..605c188007b 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -397,10 +397,11 @@ astreamer_lz4_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_zstd.c b/src/fe_utils/astreamer_zstd.c
index f26abcfd0fa..4b43ab795e3 100644
--- a/src/fe_utils/astreamer_zstd.c
+++ b/src/fe_utils/astreamer_zstd.c
@@ -347,7 +347,7 @@ astreamer_zstd_decompressor_finalize(astreamer *streamer)
if (mystreamer->zstd_outBuf.pos > 0)
astreamer_content(mystreamer->base.bbs_next, NULL,
mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
+ mystreamer->zstd_outBuf.pos,
ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
--
2.47.1
[application/x-patch] v3-0003-pg_waldump-Handle-archive-exhaustion-in-init_arch.patch (3.3K, 4-v3-0003-pg_waldump-Handle-archive-exhaustion-in-init_arch.patch)
From 242b8904682cec326d059e3d11355ca2315c869c Mon Sep 17 00:00:00 2001
From: Amul Sul <[email protected]>
Date: Sat, 21 Mar 2026 20:57:23 +0530
Subject: [PATCH v3 3/3] pg_waldump: Handle archive exhaustion in
init_archive_reader().
When read_archive_file() returns 0, the archive may have already
buffered a complete WAL file into the hash table before exhausting
the input. Instead of immediately reporting an error, search the
hash table for an entry containing at least sizeof(XLogLongPageHeader)
bytes. Report a specific error if a WAL entry exists but is too
short (truncated/corrupt), or a generic error if no WAL was found
at all.
Also tighten the loop condition to check for sizeof(XLogLongPageHeader)
rather than XLOG_BLCKSZ, since only the long page header is needed
at this stage.
---
src/bin/pg_waldump/archive_waldump.c | 55 ++++++++++++++++++++++++++--
1 file changed, 51 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index ee292b6dc8d..943c843e05b 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -176,13 +176,60 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
* the first WAL segment in the archive so we can extract the WAL segment
* size from the long page header.
*/
- while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ while (entry == NULL || entry->read_len < sizeof(XLogLongPageHeader))
{
if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
- pg_fatal("could not find WAL in archive \"%s\"",
- privateInfo->archive_name);
+ {
+ ArchivedWAL_iterator iter;
+ ArchivedWALFile *e = NULL;
+ ArchivedWALFile *short_entry = NULL;
- entry = privateInfo->cur_file;
+ entry = NULL;
+
+ /*
+ * read_archive_file() returned 0, meaning the archive is
+ * exhausted. However, a sufficiently compressed archive may have
+ * already read a complete WAL file and inserted it into the hash
+ * table before returning. Search the hash table for any entry
+ * that already has enough buffered data to contain the long page
+ * header; if none is found, the archive contains no usable WAL.
+ */
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((e = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (e->read_len >= sizeof(XLogLongPageHeader))
+ {
+ entry = e;
+ break;
+ }
+ /* Remember a short entry in case we need to report it */
+ short_entry = e;
+ }
+
+ if (entry == NULL)
+ {
+ /*
+ * A WAL file was found in the hash table but it does not
+ * contain enough data to read the long page header,
+ * indicating a truncated or corrupt WAL segment.
+ */
+ if (short_entry != NULL)
+ pg_fatal("could not read file \"%s\" from \"%s\" archive: read %d of %d",
+ short_entry->fname, privateInfo->archive_name,
+ short_entry->read_len,
+ (int) sizeof(XLogLongPageHeader));
+
+ /*
+ * The hash table contains no WAL entries at all, meaning the
+ * archive holds no WAL data.
+ */
+ pg_fatal("could not find WAL in archive \"%s\"",
+ privateInfo->archive_name);
+ }
+ }
+ else
+ entry = privateInfo->cur_file;
}
/* Extract the WAL segment size from the long page header */
--
2.47.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-22 11:24 Andrew Dunstan <[email protected]>
parent: Michael Paquier <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-22 11:24 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 22, 2026 at 12:24 AM Tom Lane <[email protected]> wrote:
> I wrote:
> > Unsurprisingly, applying this change to unmodified master results
> > in the pg_waldump and pg_verifybackup tests falling over. More
> > surprisingly, they still fall over after applying your fix to the
> > decompressors, so there's some other source of garbage trailing
> > data. I haven't figured out what.
>
> In the learn-something-new-every-day dept.: good ol' GNU tar itself
> does that. By default, it zero-pads its output to a multiple of 10kB
> after it's written the required terminator. Moreover, this behavior
> is actually specified by POSIX:
>
> -x format
> Specify the output archive format. The pax utility shall support
> the following formats:
> ...
> ustar
> The tar interchange format; see the EXTENDED DESCRIPTION
> section. The default blocksize for this format for character
> special archive files shall be 10240. Implementations shall
> support all blocksize values less than or equal to 32256 that
> are multiples of 512.
>
> So, astreamer_tar_parser_content's idea that it should disallow more
> than 1024 bytes of trailer is completely wrong, which we would have
> figured out long ago if the code attempting to enforce that weren't
> completely broken.
>
> You could argue that this means the tar files our existing utilities
> create aren't POSIX-compliant. I think it's all right though: we
> can just say that we write these files with blocksize 1024 not
> blocksize 10240, and tar-file readers are required to accept that
> per the above spec text.
>
> However, this discourages me from editorializing on the file trailer
> emitted by whatever wrote the tar file we are reading. I think
> emitting it as-is is the most appropriate thing. So we should just
> get rid of astreamer_tar_parser_content's nonfunctional error check
> and not change its behavior otherwise.
>
>
>
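For reference, the padding arithmetic described in the quoted text can be sketched as below. This is only an illustration under stated assumptions: the writer emits two zero 512-byte records as the end-of-archive marker and then pads to a whole physical block. TAR_BLOCK_SIZE is redefined locally here; round_up(), tar_archive_size() and tar_trailer_size() are hypothetical helper names, not part of any astreamer API.

```c
#include <assert.h>
#include <stddef.h>

#define TAR_BLOCK_SIZE 512		/* one tar logical record */

/* Round n up to the next multiple of m. */
static size_t
round_up(size_t n, size_t m)
{
	assert(m > 0);
	return (n + m - 1) / m * m;
}

/*
 * Size of a finished archive whose member headers and (already padded)
 * member data occupy member_bytes.  blocking_factor is the number of
 * 512-byte records per physical block: 20 gives the 10240-byte default
 * quoted from POSIX, 2 gives a 1024-byte block.
 */
static size_t
tar_archive_size(size_t member_bytes, size_t blocking_factor)
{
	/* two zero records mark end-of-archive */
	size_t		with_terminator = member_bytes + 2 * TAR_BLOCK_SIZE;

	return round_up(with_terminator, blocking_factor * TAR_BLOCK_SIZE);
}

/* Everything after the last member: terminator plus zero padding. */
static size_t
tar_trailer_size(size_t member_bytes, size_t blocking_factor)
{
	return tar_archive_size(member_bytes, blocking_factor) - member_bytes;
}
```

With one header and one data record (1024 bytes), a blocking factor of 20 yields a trailer far larger than two records, while a blocking factor of 2 yields exactly two records. A reader that caps the trailer at 1024 bytes would therefore reject the first case.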
OK, patch 5 of this set does that. I reworked your previous patches 2 and 3
slightly: mostly additional comments, plus a fix for a bug in the use
of sizeof(XLogLongPageHeader). Patch 4 here tries to fix the wrong use of
cur_file in get_archive_wal_entry().
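To illustrate the sizeof issue: in PostgreSQL, XLogLongPageHeader is a pointer typedef for XLogLongPageHeaderData, so taking sizeof of the former yields the size of a pointer rather than the size of the header. The sketch below reproduces the pattern with stand-in types; DemoLongHeaderData, DemoLongHeader and demo_has_long_header() are hypothetical names, and the field layout is invented purely for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a long page header struct (layout is illustrative only). */
typedef struct DemoLongHeaderData
{
	uint64_t	sysid;
	uint32_t	seg_size;
	uint32_t	blcksz;
	char		reserved[24];
} DemoLongHeaderData;

/* Pointer typedef, mirroring the Data / non-Data naming pattern. */
typedef DemoLongHeaderData *DemoLongHeader;

/* The pointer type is smaller than the struct it points to. */
static_assert(sizeof(DemoLongHeaderData) > sizeof(DemoLongHeader),
			  "struct must be larger than a pointer for this demo");

/* Correct check: compare against the struct size, not the pointer size. */
static bool
demo_has_long_header(size_t read_len)
{
	return read_len >= sizeof(DemoLongHeaderData);
}
```

A check written against sizeof(DemoLongHeader) would accept a buffer holding only a pointer's worth of bytes, which is too short to read the segment size from.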
cheers
andrew
Attachments:
[text/x-patch] v5-0003-Fix-init_archive_reader-to-not-depend-on-cur_file.patch (3.6K, 3-v5-0003-Fix-init_archive_reader-to-not-depend-on-cur_file.patch)
From 51d53b166df7c8eaebe49756e24088c16764807b Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Sun, 22 Mar 2026 06:53:25 -0400
Subject: [PATCH v5 3/5] Fix init_archive_reader to not depend on cur_file.
init_archive_reader() relied on privateInfo->cur_file to track which
WAL segment was being read, but cur_file can become NULL if a member
trailer is processed during a read_archive_file() call. This could
cause unreproducible "could not find WAL in archive" failures,
particularly with compressed archives where all the WAL data fits
in a small number of compressed bytes.
Fix by scanning the hash table after each read to find any cached
WAL segment with sufficient data, instead of depending on cur_file.
Also reduce the minimum data requirement from XLOG_BLCKSZ to
sizeof(XLogLongPageHeaderData), since we only need the long page
header to extract the segment size.
Add a safety comment on cur_file in pg_waldump.h to document that
it can change during a single read_archive_file() call.
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
src/bin/pg_waldump/archive_waldump.c | 22 +++++++++++++++++-----
src/bin/pg_waldump/pg_waldump.h | 9 ++++++++-
2 files changed, 25 insertions(+), 6 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index cd092a057ef..3fce2183099 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -173,17 +173,29 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
privateInfo->archive_wal_htab = ArchivedWAL_create(8, NULL);
/*
- * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
- * the first WAL segment in the archive so we can extract the WAL segment
- * size from the long page header.
+ * Read until we have at least one WAL segment with enough data to extract
+ * the WAL segment size from the long page header.
+ *
+ * We must not rely on cur_file here, because it can become NULL if a
+ * member trailer is processed during a read_archive_file() call. Instead,
+ * scan the hash table after each read to find any entry with sufficient
+ * data.
*/
- while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
+ while (entry == NULL)
{
+ ArchivedWAL_iterator iter;
+
if (!read_archive_file(privateInfo, XLOG_BLCKSZ))
pg_fatal("could not find WAL in archive \"%s\"",
privateInfo->archive_name);
- entry = privateInfo->cur_file;
+ ArchivedWAL_start_iterate(privateInfo->archive_wal_htab, &iter);
+ while ((entry = ArchivedWAL_iterate(privateInfo->archive_wal_htab,
+ &iter)) != NULL)
+ {
+ if (entry->read_len >= sizeof(XLogLongPageHeaderData))
+ break;
+ }
}
/* Extract the WAL segment size from the long page header */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index cde7c6ca3f2..ca0dfd97168 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -44,7 +44,14 @@ typedef struct XLogDumpPrivate
Size archive_read_buf_size;
#endif
- /* What the archive streamer is currently reading */
+ /*
+ * The buffer for the WAL file the archive streamer is currently reading,
+ * or NULL if none. It is quite risky to examine this anywhere except in
+ * astreamer_waldump_content(), since it can change multiple times during
+ * a single read_archive_file() call. However, it is safe to assume that
+ * if cur_file is different from a particular ArchivedWALFile of interest,
+ * then the archive streamer has finished reading that file.
+ */
struct ArchivedWALFile *cur_file;
/*
--
2.43.0
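The idea in the patch above (scan all cached entries for one with enough data, rather than trusting a "current" pointer that may have been reset) can be sketched in isolation. This is a simplified model, not the real code: a plain array stands in for the ArchivedWAL hash table, and demo_wal_entry and demo_find_usable() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a cached WAL segment entry. */
typedef struct demo_wal_entry
{
	const char *fname;			/* WAL segment file name */
	size_t		read_len;		/* bytes received from the archive so far */
} demo_wal_entry;

/*
 * Return the first entry holding at least 'needed' bytes, or NULL if
 * none does.  The result depends only on the table contents, so it is
 * unaffected by whatever the streamer happens to be reading right now.
 */
static const demo_wal_entry *
demo_find_usable(const demo_wal_entry *entries, size_t n, size_t needed)
{
	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		if (entries[i].read_len >= needed)
			return &entries[i];
	}
	return NULL;
}
```

A caller would repeat "read more, then scan" until the scan returns non-NULL or the archive is exhausted.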
[text/x-patch] v5-0001-Fix-finalization-of-decompressor-astreamers.patch (2.8K, 4-v5-0001-Fix-finalization-of-decompressor-astreamers.patch)
From 9978359771c14b411704d808abc0f602119f0a9f Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Sun, 22 Mar 2026 05:31:14 -0400
Subject: [PATCH v5 1/5] Fix finalization of decompressor astreamers.
Send the correct amount of data to the next astreamer, not the
whole allocated buffer size. It's unclear how we missed this bug;
perhaps the use-cases so far are insensitive to trailing garbage.
Author: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
src/fe_utils/astreamer_gzip.c | 9 +++++----
src/fe_utils/astreamer_lz4.c | 9 +++++----
src/fe_utils/astreamer_zstd.c | 2 +-
3 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/src/fe_utils/astreamer_gzip.c b/src/fe_utils/astreamer_gzip.c
index 2e080c37a58..df392f67cab 100644
--- a/src/fe_utils/astreamer_gzip.c
+++ b/src/fe_utils/astreamer_gzip.c
@@ -347,10 +347,11 @@ astreamer_gzip_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_lz4.c b/src/fe_utils/astreamer_lz4.c
index 2bc32b42879..605c188007b 100644
--- a/src/fe_utils/astreamer_lz4.c
+++ b/src/fe_utils/astreamer_lz4.c
@@ -397,10 +397,11 @@ astreamer_lz4_decompressor_finalize(astreamer *streamer)
* End of the stream, if there is some pending data in output buffers then
* we must forward it to next streamer.
*/
- astreamer_content(mystreamer->base.bbs_next, NULL,
- mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
- ASTREAMER_UNKNOWN);
+ if (mystreamer->bytes_written > 0)
+ astreamer_content(mystreamer->base.bbs_next, NULL,
+ mystreamer->base.bbs_buffer.data,
+ mystreamer->bytes_written,
+ ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
}
diff --git a/src/fe_utils/astreamer_zstd.c b/src/fe_utils/astreamer_zstd.c
index f26abcfd0fa..4b43ab795e3 100644
--- a/src/fe_utils/astreamer_zstd.c
+++ b/src/fe_utils/astreamer_zstd.c
@@ -347,7 +347,7 @@ astreamer_zstd_decompressor_finalize(astreamer *streamer)
if (mystreamer->zstd_outBuf.pos > 0)
astreamer_content(mystreamer->base.bbs_next, NULL,
mystreamer->base.bbs_buffer.data,
- mystreamer->base.bbs_buffer.maxlen,
+ mystreamer->zstd_outBuf.pos,
ASTREAMER_UNKNOWN);
astreamer_finalize(mystreamer->base.bbs_next);
--
2.43.0
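The difference between forwarding the allocated buffer size and forwarding only the bytes actually produced can be shown with a small model. The types and functions below (demo_sink, demo_decompressor, demo_finalize_buggy, demo_finalize_fixed) are hypothetical stand-ins, not the astreamer API; the sink merely counts what it receives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Downstream consumer: records how many bytes and calls it received. */
typedef struct demo_sink
{
	size_t		total;
	size_t		calls;
} demo_sink;

static void
demo_sink_content(demo_sink *sink, const char *data, size_t len)
{
	(void) data;				/* contents are irrelevant to the demo */
	sink->total += len;
	sink->calls++;
}

/* Decompressor output buffer: maxlen is capacity, bytes_written is valid data. */
typedef struct demo_decompressor
{
	char		buf[64];
	size_t		maxlen;
	size_t		bytes_written;
} demo_decompressor;

/* Old behavior: always forwards the whole buffer, valid or not. */
static void
demo_finalize_buggy(demo_decompressor *d, demo_sink *sink)
{
	demo_sink_content(sink, d->buf, d->maxlen);
}

/* New behavior: forwards only pending valid bytes, and only if there are any. */
static void
demo_finalize_fixed(demo_decompressor *d, demo_sink *sink)
{
	assert(d->bytes_written <= d->maxlen);
	if (d->bytes_written > 0)
		demo_sink_content(sink, d->buf, d->bytes_written);
}
```

With 5 pending bytes in a 64-byte buffer, the old variant hands 59 bytes of stale data to the next stage; the new variant hands over exactly 5 and stays silent when nothing is pending.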
[text/x-patch] v5-0004-Fix-get_archive_wal_entry-to-handle-cur_file-tran.patch (6.5K, 5-v5-0004-Fix-get_archive_wal_entry-to-handle-cur_file-tran.patch)
From 91ed0e69a7df00be147548973a8172322998b528 Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Sun, 22 Mar 2026 06:53:34 -0400
Subject: [PATCH v5 4/5] Fix get_archive_wal_entry to handle cur_file
transitions reliably.
As noted by Tom Lane, get_archive_wal_entry() uses cur_file in an
unsafe way: a single read_archive_file() call can trigger multiple
astreamer callbacks when compression is effective, causing cur_file
to change several times (entry -> NULL -> new entry) within one
call. The old code captured cur_file before the read and checked
for changes after, but this missed intermediate transitions. This
could cause spill-file handles to leak or data to not be flushed
when the streamer finishes one segment and starts another within
the same read.
Restructure the spill logic to explicitly track the entry being
spilled (spill_entry) separately from cur_file, and detect
transitions at the top of each loop iteration. Also ensure spill
file handles are closed on both success and error paths.
Discussion: https://postgr.es/m/[email protected]
---
src/bin/pg_waldump/archive_waldump.c | 117 ++++++++++++++++-----------
1 file changed, 69 insertions(+), 48 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index 3fce2183099..93ed856c674 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -463,11 +463,18 @@ free_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
* found. If the archive streamer is reading a WAL file from the archive that
* is not currently needed, that data is spilled to a temporary file for later
* retrieval.
+ *
+ * Because a single read_archive_file() call may trigger multiple astreamer
+ * callbacks (especially when data compresses well), cur_file can change
+ * several times within one call: from one entry to NULL (member trailer),
+ * and then to a new entry (next member header). The spill logic below
+ * handles this by flushing and closing per-entry state whenever we detect
+ * that the streamer has moved on.
*/
static ArchivedWALFile *
get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
{
- ArchivedWALFile *entry = NULL;
+ ArchivedWALFile *spill_entry = NULL;
FILE *write_fp = NULL;
/*
@@ -477,6 +484,8 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
*/
while (1)
{
+ ArchivedWALFile *entry;
+
/*
* Search hash table.
*
@@ -488,64 +497,76 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
entry = ArchivedWAL_lookup(privateInfo->archive_wal_htab, fname);
if (entry != NULL)
+ {
+ /* Close any open spill file before returning. */
+ if (write_fp != NULL)
+ fclose(write_fp);
return entry;
-
- /*
- * Capture the current entry before calling read_archive_file(),
- * because cur_file may advance to a new segment during streaming. We
- * hold this reference so we can flush any remaining buffer data and
- * close the write handle once we detect that cur_file has moved on.
- */
- entry = privateInfo->cur_file;
-
- /*
- * Fetch more data either when no current file is being tracked or
- * when its buffer has been fully flushed to the temporary file.
- */
- if (entry == NULL || entry->buf->len == 0)
- {
- if (!read_archive_file(privateInfo, READ_CHUNK_SIZE))
- break; /* archive file ended */
}
/*
- * Archive streamer is reading a non-WAL file or an irrelevant WAL
- * file.
- */
- if (entry == NULL)
- continue;
-
- /*
- * The streamer is producing a WAL segment that isn't the one asked
- * for; it must be arriving out of order. Spill its data to disk so
- * it can be read back when needed.
- */
- Assert(strcmp(fname, entry->fname) != 0);
-
- /* Create a temporary file if one does not already exist */
- if (!entry->spilled)
- {
- write_fp = prepare_tmp_write(entry->fname, privateInfo);
- entry->spilled = true;
- }
-
- /* Flush data from the buffer to the file */
- perform_tmp_write(entry->fname, entry->buf, write_fp);
- resetStringInfo(entry->buf);
-
- /*
- * If cur_file changed since we captured entry above, the archive
- * streamer has finished this segment and moved on. Close its spill
- * file handle so data is flushed to disk before the next segment
- * starts writing to a different handle.
+ * If the streamer has moved on to a different entry than the one we
+ * were spilling, flush any remaining data for the old entry and close
+ * its spill file.
*/
- if (entry != privateInfo->cur_file && write_fp != NULL)
+ if (spill_entry != NULL && spill_entry != privateInfo->cur_file)
{
+ if (spill_entry->buf->len > 0)
+ {
+ perform_tmp_write(spill_entry->fname, spill_entry->buf,
+ write_fp);
+ resetStringInfo(spill_entry->buf);
+ }
fclose(write_fp);
write_fp = NULL;
+ spill_entry = NULL;
}
+
+ /*
+ * If no WAL file is currently being streamed (cur_file is NULL), or
+ * the current spill entry's buffer has been fully flushed, we need
+ * more data from the archive.
+ */
+ if (privateInfo->cur_file == NULL ||
+ (spill_entry != NULL && spill_entry->buf->len == 0))
+ {
+ if (!read_archive_file(privateInfo, READ_CHUNK_SIZE))
+ break; /* archive fully exhausted */
+ continue; /* re-check hash table and cur_file */
+ }
+
+ /*
+ * cur_file points to a WAL segment that isn't the one asked for; it
+ * must be arriving out of order. Spill its data to disk so it can be
+ * read back when needed.
+ */
+ spill_entry = privateInfo->cur_file;
+ Assert(strcmp(fname, spill_entry->fname) != 0);
+
+ /* Create a temporary file if one does not already exist */
+ if (!spill_entry->spilled)
+ {
+ write_fp = prepare_tmp_write(spill_entry->fname, privateInfo);
+ spill_entry->spilled = true;
+ }
+
+ /* Flush data from the buffer to the file */
+ perform_tmp_write(spill_entry->fname, spill_entry->buf, write_fp);
+ resetStringInfo(spill_entry->buf);
+
+ /*
+ * Read more data from the archive. This may add data to the current
+ * spill_entry's buffer, advance cur_file to a new entry, or set
+ * cur_file to NULL (member trailer).
+ */
+ if (!read_archive_file(privateInfo, READ_CHUNK_SIZE))
+ break; /* archive fully exhausted */
}
+ /* Close any open spill file before erroring out. */
+ if (write_fp != NULL)
+ fclose(write_fp);
+
/* Requested WAL segment not found */
pg_fatal("could not find WAL \"%s\" in archive \"%s\"",
fname, privateInfo->archive_name);
--
2.43.0
[text/x-patch] v5-0002-Fix-failure-to-finalize-the-decompression-pipelin.patch (6.6K, 6-v5-0002-Fix-failure-to-finalize-the-decompression-pipelin.patch)
From 0de5fd971602ae00fd3bd62cf5da0d8f0a3cce5a Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Sun, 22 Mar 2026 05:32:45 -0400
Subject: [PATCH v5 2/5] Fix failure to finalize the decompression pipeline at
archive EOF.
archive_waldump.c never called astreamer_finalize(). This meant
that any data retained in decompression buffers at the moment we
detect archive EOF would never reach astreamer_waldump_content(),
resulting in surprising failures if we actually need the last few
bytes of the archive file.
To fix, make read_archive_file() do the finalize once it detects
EOF. Change its API to return a boolean "yes there's more data"
rather than the entirely-misleading raw count of bytes read.
Also document the contract that cur_file can change (or become NULL)
during a single read_archive_file() call, since the decompression
pipeline may produce enough output to trigger multiple astreamer
callbacks.
Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/[email protected]
---
src/bin/pg_waldump/archive_waldump.c | 50 +++++++++++++++++++++++-----
src/bin/pg_waldump/pg_waldump.h | 1 +
2 files changed, 42 insertions(+), 9 deletions(-)
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index b078c2d6960..cd092a057ef 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -89,7 +89,7 @@ typedef struct astreamer_waldump
static ArchivedWALFile *get_archive_wal_entry(const char *fname,
XLogDumpPrivate *privateInfo);
-static int read_archive_file(XLogDumpPrivate *privateInfo, Size count);
+static bool read_archive_file(XLogDumpPrivate *privateInfo, Size count);
static void setup_tmpwal_dir(const char *waldir);
static void cleanup_tmpwal_dir_atexit(void);
@@ -139,6 +139,7 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
pg_fatal("could not open file \"%s\"", privateInfo->archive_name);
privateInfo->archive_fd = fd;
+ privateInfo->archive_fd_eof = false;
streamer = astreamer_waldump_new(privateInfo);
@@ -178,7 +179,7 @@ init_archive_reader(XLogDumpPrivate *privateInfo,
*/
while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
{
- if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
+ if (!read_archive_file(privateInfo, XLOG_BLCKSZ))
pg_fatal("could not find WAL in archive \"%s\"",
privateInfo->archive_name);
@@ -236,9 +237,10 @@ free_archive_reader(XLogDumpPrivate *privateInfo)
/*
* NB: Normally, astreamer_finalize() is called before astreamer_free() to
* flush any remaining buffered data or to ensure the end of the tar
- * archive is reached. However, when decoding WAL, once we hit the end
- * LSN, any remaining buffered data or unread portion of the archive can
- * be safely ignored.
+ * archive is reached. read_archive_file() may have done so. However,
+ * when decoding WAL we can stop once we hit the end LSN, so we may never
+ * have read all of the input file. In that case any remaining buffered
+ * data or unread portion of the archive can be safely ignored.
*/
astreamer_free(privateInfo->archive_streamer);
@@ -384,7 +386,7 @@ read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
fname, privateInfo->archive_name,
(long long int) (count - nbytes),
(long long int) count);
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ if (!read_archive_file(privateInfo, READ_CHUNK_SIZE))
pg_fatal("unexpected end of archive \"%s\" while reading \"%s\": read %lld of %lld bytes",
privateInfo->archive_name, fname,
(long long int) (count - nbytes),
@@ -490,7 +492,7 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
*/
if (entry == NULL || entry->buf->len == 0)
{
- if (read_archive_file(privateInfo, READ_CHUNK_SIZE) == 0)
+ if (!read_archive_file(privateInfo, READ_CHUNK_SIZE))
break; /* archive file ended */
}
@@ -540,8 +542,22 @@ get_archive_wal_entry(const char *fname, XLogDumpPrivate *privateInfo)
/*
* Reads a chunk from the archive file and passes it through the streamer
* pipeline for decompression (if needed) and tar member extraction.
+ *
+ * count is the maximum amount to try to read this time. Note that it's
+ * measured in raw file bytes, and may have little to do with how much
+ * comes out of decompression/extraction.
+ *
+ * Returns true if successful, false if there is no more data.
+ *
+ * Callers must be aware that a single call may trigger multiple callbacks
+ * in astreamer_waldump_content, so privateInfo->cur_file can change value
+ * (or become NULL) during a call. In particular, cur_file is set to NULL
+ * when the ASTREAMER_MEMBER_TRAILER callback fires at the end of a tar
+ * member; it is then set to a new entry when the next WAL member's
+ * ASTREAMER_MEMBER_HEADER callback fires, which may or may not happen
+ * within the same call.
*/
-static int
+static bool
read_archive_file(XLogDumpPrivate *privateInfo, Size count)
{
int rc;
@@ -549,6 +565,11 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
/* The read request must not exceed the allocated buffer size. */
Assert(privateInfo->archive_read_buf_size >= count);
+ /* Fail if we already reached EOF in a prior call. */
+ if (privateInfo->archive_fd_eof)
+ return false;
+
+ /* Try to read some more data. */
rc = read(privateInfo->archive_fd, privateInfo->archive_read_buf, count);
if (rc < 0)
pg_fatal("could not read file \"%s\": %m",
@@ -562,8 +583,19 @@ read_archive_file(XLogDumpPrivate *privateInfo, Size count)
astreamer_content(privateInfo->archive_streamer, NULL,
privateInfo->archive_read_buf, rc,
ASTREAMER_UNKNOWN);
+ else
+ {
+ /*
+ * We reached EOF, but there is probably still data queued in the
+ * astreamer pipeline's buffers. Flush it out to ensure that we
+ * process everything.
+ */
+ astreamer_finalize(privateInfo->archive_streamer);
+ /* Set flag to ensure we don't finalize more than once. */
+ privateInfo->archive_fd_eof = true;
+ }
- return rc;
+ return true;
}
/*
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index 36893624f53..cde7c6ca3f2 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -35,6 +35,7 @@ typedef struct XLogDumpPrivate
char *archive_dir;
char *archive_name; /* Tar archive filename */
int archive_fd; /* File descriptor for the open tar file */
+ bool archive_fd_eof; /* Have we reached EOF on archive_fd? */
astreamer *archive_streamer;
char *archive_read_buf; /* Reusable read buffer for archive I/O */
--
2.43.0
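The revised read_archive_file() contract (return true while there may be more to process, finalize exactly once at EOF, return false on every later call) can be modeled without any file I/O. In this sketch an in-memory length replaces read(), and demo_reader and demo_read_chunk() are hypothetical names; the counters only exist so the behavior can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct demo_reader
{
	size_t		src_len;		/* total bytes available in the "file" */
	size_t		pos;			/* bytes consumed so far */
	bool		eof;			/* has EOF already been seen? */
	int			finalize_calls; /* how often the pipeline was finalized */
	size_t		forwarded;		/* bytes pushed into the pipeline */
} demo_reader;

/*
 * Returns true if the call did something useful (forwarded data or
 * flushed the pipeline at EOF), false once EOF was handled earlier.
 */
static bool
demo_read_chunk(demo_reader *r, size_t count)
{
	size_t		n;

	assert(r->pos <= r->src_len);

	/* Fail if EOF was already reached in a prior call. */
	if (r->eof)
		return false;

	n = r->src_len - r->pos;
	if (n > count)
		n = count;

	if (n > 0)
	{
		/* Stand-in for passing the chunk to the streamer pipeline. */
		r->forwarded += n;
		r->pos += n;
	}
	else
	{
		/* EOF: flush buffered data once, then remember not to repeat it. */
		r->finalize_calls++;
		r->eof = true;
	}
	return true;
}
```

Note that the call which detects EOF still returns true, because finalization may deliver buffered data that the caller has yet to examine; only the following call reports that nothing more is coming.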
[text/x-patch] v5-0005-Remove-nonfunctional-tar-file-trailer-size-check.patch (2.2K, 7-v5-0005-Remove-nonfunctional-tar-file-trailer-size-check.patch)
From cab3e116c1064afcc16e638555c74f7f460be55b Mon Sep 17 00:00:00 2001
From: Andrew Dunstan <[email protected]>
Date: Sun, 22 Mar 2026 06:53:58 -0400
Subject: [PATCH v5 5/5] Remove nonfunctional tar file trailer size check.
The ASTREAMER_ARCHIVE_TRAILER case in astreamer_tar_parser_content()
intended to reject tar files whose trailer exceeded 2 blocks. However,
the check compared 'len' after astreamer_buffer_bytes() had already
consumed all the data and set len to 0, so the pg_fatal() could never
fire.
Moreover, per the POSIX specification for the ustar format, the last
physical block of a tar archive is always full-sized, and "logical
records after the two zero logical records may contain undefined data."
GNU tar, for example, zero-pads its output to a 10kB boundary by
default. So rejecting extra data after the two zero blocks would be
wrong even if the check worked.
Remove the dead check and update the comment to explain why trailing
data is expected and harmless.
Per report from Tom Lane.
Discussion: https://postgr.es/m/[email protected]
---
src/fe_utils/astreamer_tar.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/src/fe_utils/astreamer_tar.c b/src/fe_utils/astreamer_tar.c
index 4b187f0a8c4..f8be5e4ff8a 100644
--- a/src/fe_utils/astreamer_tar.c
+++ b/src/fe_utils/astreamer_tar.c
@@ -236,12 +236,16 @@ astreamer_tar_parser_content(astreamer *streamer, astreamer_member *member,
/*
* We've seen an end-of-archive indicator, so anything more is
- * buffered and sent as part of the archive trailer. But we
- * don't expect more than 2 blocks.
+ * buffered and sent as part of the archive trailer.
+ *
+ * Per POSIX, the last physical block of a tar archive is
+ * always full-sized, so there may be undefined data after the
+ * two zero blocks that mark end-of-archive. GNU tar, for
+ * example, zero-pads to a 10kB boundary by default. We just
+ * buffer whatever we receive and pass it along at finalize
+ * time.
*/
astreamer_buffer_bytes(streamer, &data, &len, len);
- if (len > 2 * TAR_BLOCK_SIZE)
- pg_fatal("tar file trailer exceeds 2 blocks");
return;
default:
--
2.43.0
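Why the removed check could never fire can be seen in a reduced model: a helper that consumes bytes also decrements the caller's length, so a size test placed after a consume-everything call always sees zero. demo_buf, demo_buffer_bytes() and demo_trailer_check_fires() are hypothetical stand-ins written for this example; they imitate the shape of the code in the patch but are not the astreamer functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_TAR_BLOCK_SIZE 512

typedef struct demo_buf
{
	char		data[32768];
	size_t		len;
} demo_buf;

/* Append nbytes from *data to buf, advancing *data and shrinking *len. */
static void
demo_buffer_bytes(demo_buf *buf, const char **data, size_t *len,
				  size_t nbytes)
{
	assert(nbytes <= *len);
	assert(buf->len + nbytes <= sizeof(buf->data));

	memcpy(buf->data + buf->len, *data, nbytes);
	buf->len += nbytes;
	*data += nbytes;
	*len -= nbytes;
}

/* Mirrors the old ordering: consume everything, then test the length. */
static bool
demo_trailer_check_fires(demo_buf *buf, const char *data, size_t len)
{
	demo_buffer_bytes(buf, &data, &len, len);

	/* len is always 0 here, so this comparison is always false. */
	return len > 2 * DEMO_TAR_BLOCK_SIZE;
}
```

Even a 10240-byte trailer passes through untouched, which matches the observation that the old error path was dead code.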
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-22 21:19 Andrew Dunstan <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-22 21:19 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 22, 2026 at 2:17 PM Tom Lane <[email protected]> wrote:
> I wrote:
> > ... We can make this function far simpler
> > and more obviously correct if we just accept that we'll read a
> > WAL file completely before spilling it. See my proposed
> > alternative to 0004, attached.
>
> Actually, we can make that better yet by not expecting
> get_archive_wal_entry to clean up after init_archive's
> failure to free all irrelevant hashtable entries.
> Better version attached.
>
>
>
Yeah, this looks good. I know we also still need to do something about
rmtree trying to remove files we haven't closed. But what we have so far in
this set LGTM. If you want to push this I'm good, otherwise I'll look at it
tomorrow or Tuesday.
cheers
andrew
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-24 03:11 Michael Paquier <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Michael Paquier @ 2026-03-24 03:11 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 22, 2026 at 11:02:20PM -0400, Tom Lane wrote:
> Proposed patch attached. There might be an argument for using some
> other size than 256K for the other two decompressors, but my
> inclination is to try to make all three use roughly the same block
> size. (See also 66ec01dc4.)
The buildfarm has switched mostly to green, except on this one:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoatzin&dt=2026-03-23%2006%3A00%3A42
--
Michael
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-25 17:28 Andres Freund <[email protected]>
parent: Michael Paquier <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andres Freund @ 2026-03-25 17:28 UTC (permalink / raw)
To: Michael Paquier <[email protected]>; +Cc: Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On 2026-03-24 12:11:44 +0900, Michael Paquier wrote:
> On Sun, Mar 22, 2026 at 11:02:20PM -0400, Tom Lane wrote:
> > Proposed patch attached. There might be an argument for using some
> > other size than 256K for the other two decompressors, but my
> > inclination is to try to make all three use roughly the same block
> > size. (See also 66ec01dc4.)
>
> The buildfarm has switched mostly to green, except on this one:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoatzin&dt=2026-03-23%2006%3A00%3A42
I think there are a few more failures. Fairywren regularly fails, including in a
run from today.
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fairywren&dt=2026-03-25%2003%3A48%3A06
There also are a lot of CI failures. E.g.
https://cirrus-ci.com/task/6153854431002624
https://api.cirrus-ci.com/v1/artifact/task/6153854431002624/testrun/build/testrun/pg_waldump/001_bas...
# Running: pg_waldump --path C:\msys64\tmp\tNfU5IfQ4a/pg_wal.tar.gz --start 0/01806F48 --end 0/03093BD8
[22:46:25.358](3.991s) not ok 160 - runs with path option and start and end locations: exit code 0
[22:46:25.363](0.005s) # Failed test 'runs with path option and start and end locations: exit code 0'
# at C:/cirrus/src/bin/pg_waldump/t/001_basic.pl line 399.
[22:46:25.364](0.001s) ok 161 - runs with path option and start and end locations: no stderr
[22:46:25.365](0.001s) not ok 162 - runs with path option and start and end locations: matches
[22:46:25.365](0.000s) # Failed test 'runs with path option and start and end locations: matches'
# at C:/cirrus/src/bin/pg_waldump/t/001_basic.pl line 399.
[22:46:25.366](0.000s) # ''
# doesn't match '(?^:.)'
I was first suspecting that this is due to
commit 1c162c965a1
Author: Fujii Masao <[email protected]>
Date: 2026-03-24 22:33:09 +0900
Report detailed errors from XLogFindNextRecord() failures.
but there are afaict failures from before that:
https://cirrus-ci.com/task/5501609960013824
which is for 4019f725f5d, preceding 1c162c965a1
and
https://cirrus-ci.com/task/5317196043255808
It does feel, however, like the failure frequency has increased substantially:
https://cirrus-ci.com/github/postgres/postgres/master
Greetings,
Andres Freund
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-28 21:36 Thomas Munro <[email protected]>
parent: Andres Freund <[email protected]>
0 siblings, 3 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-28 21:36 UTC (permalink / raw)
To: Andres Freund <[email protected]>; +Cc: Michael Paquier <[email protected]>; Tom Lane <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Thu, Mar 26, 2026 at 6:28 AM Andres Freund <[email protected]> wrote:
> On 2026-03-24 12:11:44 +0900, Michael Paquier wrote:
> > On Sun, Mar 22, 2026 at 11:02:20PM -0400, Tom Lane wrote:
> > > Proposed patch attached. There might be an argument for using some
> > > other size than 256K for the other two decompressors, but my
> > > inclination is to try to make all three use roughly the same block
> > > size. (See also 66ec01dc4.)
> >
> > The buildfarm has switched mostly to green, except on this one:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoatzin&dt=2026-03-23%2006%3A00%3A42
>
> I think there's a few more failues. Fairywren regularly fails, including in a
> run from today.
This fails 100% of the time on my machine, even after e9d72348 and ff84efe4, eg:
# Running: pg_waldump --path /tmp/D8WG1Sv2HE/pg_wal.tar --start
0/017A2610 --end 0/02093848
[09:43:29.288](0.148s) not ok 104 - runs with path option and start
and end locations: exit code 0
[09:43:29.289](0.001s) # Failed test 'runs with path option and
start and end locations: exit code 0'
# at /home/tmunro/projects/postgresql/src/bin/pg_waldump/t/001_basic.pl
line 402.
[09:43:29.290](0.001s) not ok 105 - runs with path option and start
and end locations: no stderr
[09:43:29.291](0.001s) # Failed test 'runs with path option and
start and end locations: no stderr'
# at /home/tmunro/projects/postgresql/src/bin/pg_waldump/t/001_basic.pl
line 402.
[09:43:29.291](0.000s) # got: 'pg_waldump: error: could not
find WAL "000000010000000000000002" in archive "pg_wal.tar"
# '
I can see that it is wrong about the contents of the tar file:
$ pg_waldump --path _tmp_H_1gv81G1L_pg_wal.tar --start 0/017A2610
--end 0/020934F8 2>&1 | tail -3
rmgr: Hash len (rec/tot): 72/ 72, tx: 720, lsn:
0/01FFC1B8, prev 0/01FFC178, desc: INSERT off 40, blkref #0: rel
1663/5/16397 blk 2, blkref #1: rel
1663/5/16397 blk 0
rmgr: Transaction len (rec/tot): 46/ 46, tx: 720, lsn:
0/01FFC200, prev 0/01FFC1B8, desc: COMMIT 2026-03-29 10:15:24.112967
NZDT
pg_waldump: error: could not find WAL "000000010000000000000002" in
archive "_tmp_H_1gv81G1L_pg_wal.tar"
$ tar tvf _tmp_H_1gv81G1L_pg_wal.tar
drwx------ 0 tmunro tmunro 0 Mar 29 10:15 archive_status/
-rw------- 0 tmunro tmunro 0 Mar 29 10:15
archive_status/000000010000000000000002.ready
-rw------- 0 tmunro tmunro 0 Mar 29 10:15
archive_status/000000010000000000000001.ready
drwx------ 0 tmunro tmunro 0 Mar 29 10:08 summaries/
-rw------- 0 tmunro tmunro 16777216 Mar 29 10:15 000000010000000000000002
-rw------- 0 tmunro tmunro 16777216 Mar 29 10:15 000000010000000000000001
-rw------- 0 tmunro tmunro 16777216 Mar 29 10:15 000000010000000000000003
It seems like the place we'd be looking for the file is in
astreamer_tar_header(), so I added in some caveman debugging:
/*
* Parse key fields out of the header.
*/
fprintf(stderr, "XXXX [%s] XXXX\n", &buffer[TAR_OFFSET_NAME]);
strlcpy(member->pathname, &buffer[TAR_OFFSET_NAME], MAXPGPATH);
if (member->pathname[0] == '\0')
pg_fatal("tar member has empty name");
Now I see:
XXXX [archive_status/] XXXX
XXXX [archive_status/000000010000000000000002.ready] XXXX
XXXX [archive_status/000000010000000000000001.ready] XXXX
XXXX [summaries/] XXXX
XXXX [PaxHeader/000000010000000000000002] XXXX
XXXX [GNUSparseFile.0/000000010000000000000002] XXXX
XXXX [000000010000000000000001] XXXX
rmgr: XLOG len (rec/tot): 30/ 30, tx: 0, lsn:
0/017A2610, prev 0/017A25F0, desc: NEXTOID 24576
rmgr: Standby len (rec/tot): 42/ 42, tx: 692, lsn:
0/017A2630, prev 0/017A2610, desc: LOCK xid 692 db 5 rel 16384
rmgr: Storage len (rec/tot): 42/ 42, tx: 692, lsn:
0/017A2660, prev 0/017A2630, desc: CREATE base/5/16384
... lots more normal output ...
rmgr: Hash len (rec/tot): 72/ 72, tx: 720, lsn:
0/01FFBED8, prev 0/01FFBE98, desc: INSERT off 97, blkref #0: rel
1663/5/16397 blk 2, blkref #1: rel
1663/5/16397 blk 0
rmgr: Heap len (rec/tot): 575/ 575, tx: 720, lsn:
0/01FFBF20, prev 0/01FFBED8, desc: INSERT off: 12, flags: 0x08, blkref
#0: rel 1663/5/16393 blk 52
rmgr: Btree len (rec/tot): 64/ 64, tx: 720,
lsn:XXXX [PaxHeader/000000010000000000000003] XXXX
XXXX [GNUSparseFile.0/000000010000000000000003] XXXX
0/01FFC178, prev 0/01FFBF20, desc: INSERT_LEAF off: 344, blkref #0:
rel 1663/5/16396 blk 2
rmgr: Hash len (rec/tot): 72/ 72, tx: 720, lsn:
0/01FFC1B8, prev 0/01FFC178, desc: INSERT off 40, blkref #0: rel
1663/5/16397 blk 2, blkref #1: rel
1663/5/16397 blk 0
rmgr: Transaction len (rec/tot): 46/ 46, tx: 720, lsn:
0/01FFC200, prev 0/01FFC1B8, desc: COMMIT 2026-03-29 10:15:24.112967
NZDT
pg_waldump: error: could not find WAL "000000010000000000000002" in
archive "_tmp_H_1gv81G1L_pg_wal.tar"
Seems like it already stepped over 000000010000000000000002 earlier?
Could it be a table-of-contents order dependency bug or something like
that?
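The header dump above can be repeated without patching astreamer_tar_header(). Below is a minimal Python sketch that walks the 512-byte header blocks of an uncompressed tar image and reports each member's name, type flag, and size. The function name raw_tar_headers is made up for illustration; the field offsets (name at 0, size at 124, typeflag at 156) are the standard ustar ones. It ignores base-256 size encoding and the ustar prefix field, so treat it as a debugging aid, not a full parser.

```python
BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def raw_tar_headers(data):
    """Yield (name, typeflag, size) for each header block in a tar image.

    Hypothetical helper: assumes an uncompressed archive held in memory and
    octal size fields (no base-256 encoding for very large members).
    """
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if not any(header):
            break  # an all-zero block marks the end of the archive
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        size = int(header[124:136].strip(b"\0 ") or b"0", 8)
        typeflag = chr(header[156])
        yield name, typeflag, size
        # Skip the header plus the member data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
```

On an archive like the one above, the PaxHeader/... entries would be expected to show type 'x' (PAX extended header) instead of '0' (regular file), a distinction the debug fprintf does not print.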
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-28 22:08 Thomas Munro <[email protected]>
parent: Thomas Munro <[email protected]>
2 siblings, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-28 22:08 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 29, 2026 at 10:49 AM Tom Lane <[email protected]> wrote:
> Huh. What is your platform exactly? Maybe more to the point,
> what is the tar program you're using?
FreeBSD tar 3.8.2.
> > Seems like it already stepped over 000000010000000000000002 earlier?
> > Could it be a table-of-contents order dependency bug or something like
> > that?
>
> If you look at the TAP script, you'll see that it tries to randomize
> the order of the entries in the tar file (see sub generate_archive).
> So if that's the problem, it shouldn't reproduce 100%, and also we
> should be seeing lots of freckles on the buildfarm. We're not, so
> there must be something off-the-beaten-track about your test
> environment.
Right, I see now. There is something different about
000000010000000000000002 though: it doesn't seem to have a normal
(non-PAX, non-GNU) TOC entry, unlike 000000010000000000000001. Trying
to figure out why...
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-28 22:57 Tomas Vondra <[email protected]>
parent: Thomas Munro <[email protected]>
2 siblings, 1 reply; 85+ messages in thread
From: Tomas Vondra @ 2026-03-28 22:57 UTC (permalink / raw)
To: Tom Lane <[email protected]>; Thomas Munro <[email protected]>; +Cc: Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 3/28/26 23:36, Tom Lane wrote:
> I wrote:
>> However ... I do not find any indication in the GNU tar docs
>> that it produces sparse files by default. It looks like you
>> need to say -S/--sparse to make that happen. Maybe you have
>> a version that's been hacked to make that the default?
>
> Bleah. Digging in the man pages at freebsd.org, I read
>
> --read-sparse
> (c, r, u modes only) Read sparse file information from disk.
> This is the reverse of --no-read-sparse and the default behav-
> ior.
>
> It's apparently been there and been default since FreeBSD 13.1.
> This leads one to wonder how come BF member dikkop is managing
> to run this test successfully. I speculate that it's using a
> filesystem type that doesn't do sparse files (cc'ing Vondra
> for confirmation on that).
>
It's running on ufs. But I think the explanation is very simple. We had
a short power outage on Thursday, and the FreeBSD machine failed to boot
properly after the power was restored. IIUC this test is new, right?
I fixed the machine, it'll start running the tests in a couple minutes.
regards
--
Tomas Vondra
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-28 23:15 Thomas Munro <[email protected]>
parent: Thomas Munro <[email protected]>
2 siblings, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-28 23:15 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 29, 2026 at 11:37 AM Tom Lane <[email protected]> wrote:
> I wrote:
> > However ... I do not find any indication in the GNU tar docs
> > that it produces sparse files by default. It looks like you
> > need to say -S/--sparse to make that happen. Maybe you have
> > a version that's been hacked to make that the default?
>
> Bleah. Digging in the man pages at freebsd.org, I read
>
> --read-sparse
> (c, r, u modes only) Read sparse file information from disk.
> This is the reverse of --no-read-sparse and the default behav-
> ior.
>
> It's apparently been there and been default since FreeBSD 13.1.
> This leads one to wonder how come BF member dikkop is managing
> to run this test successfully. I speculate that it's using a
> filesystem type that doesn't do sparse files (cc'ing Vondra
> for confirmation on that).
>
> It looks like to make this test stable on modern FreeBSD,
> we need to see if tar accepts --no-read-sparse and use that
> switch if so.
Yeah. Here's my attempt in Perl.
I think your Mac probably has a similar tar program BTW... but apfs
probably doesn't go around making holes visible to lseek()
automatically or at least as eagerly as my ZFS system.
Attachments:
[text/x-patch] 0001-Fix-pg_waldump-test-for-libarchive-tar.patch (1.8K, 2-0001-Fix-pg_waldump-test-for-libarchive-tar.patch)
From b1a9131d072dadfe9c5666e75a932ba70b8d2f99 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Sun, 29 Mar 2026 12:03:42 +1300
Subject: [PATCH] Fix pg_waldump test for libarchive tar.
libarchive tar (the one shipped on macOS and *BSD systems) might decide by
default to archive sparse files using a non-standard encoding based on GNU
extensions (unlike GNU tar itself). pg_waldump can't read them. Suppress
that, if $TAR understands --no-read-sparse.
Discussion: https://postgr.es/m/1624716.1774736283%40sss.pgh.pa.us
---
src/bin/pg_waldump/t/001_basic.pl | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 8bb8fa225f6..d911296bb66 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -11,6 +11,14 @@ use Test::More;
use List::Util qw(shuffle);
my $tar = $ENV{TAR};
+my @TAR_C_FLAGS;
+
+# libarchive tar (as found on *BSD and macOS) might create sparse files by
+# default, and we can't read them
+if (system("$tar --no-read-sparse -c - /dev/null > /dev/null") == 0)
+{
+ push(@TAR_C_FLAGS, "--no-read-sparse");
+}
program_help_ok('pg_waldump');
program_version_ok('pg_waldump');
@@ -331,7 +339,6 @@ sub test_pg_waldump
sub generate_archive
{
my ($archive, $directory, $compression_flags) = @_;
-
my @files;
opendir my $dh, $directory or die "opendir: $!";
while (my $entry = readdir $dh) {
@@ -346,7 +353,7 @@ sub generate_archive
# move into the WAL directory before archiving files
my $cwd = getcwd;
chdir($directory) || die "chdir: $!";
- command_ok([$tar, $compression_flags, $archive, @files]);
+ command_ok([$tar, @TAR_C_FLAGS, $compression_flags, $archive, @files]);
chdir($cwd) || die "chdir: $!";
}
--
2.52.0
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-29 01:59 Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-03-29 01:59 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Thomas Munro <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Masao Fujii <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
> On Mar 28, 2026, at 7:34 PM, Tom Lane <[email protected]> wrote:
>
> Thomas Munro <[email protected]> writes:
>>> On Sun, Mar 29, 2026 at 11:37 AM Tom Lane <[email protected]> wrote:
>>> It looks like to make this test stable on modern FreeBSD,
>>> we need to see if tar accepts --no-read-sparse and use that
>>> switch if so.
>
>> Yeah. Here's my attempt at perl.
>
> Andrew might have some stylistic suggestions, but this looks
> plausible to me.
Looks basically ok to me, although I wouldn’t make the variable name all caps.
Cheers
Andrew
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-29 02:01 Thomas Munro <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-29 02:01 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Masao Fujii <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sun, Mar 29, 2026 at 2:59 PM Andrew Dunstan <[email protected]> wrote:
> Looks basically ok to me, although I wouldn’t make the variable name all caps.
Will fix and push. Thanks!
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-29 13:33 Tomas Vondra <[email protected]>
parent: Tomas Vondra <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Tomas Vondra @ 2026-03-29 13:33 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 3/29/26 00:12, Tom Lane wrote:
> Tomas Vondra <[email protected]> writes:
>> On 3/28/26 23:36, Tom Lane wrote:
>>> It's apparently been there and been default since FreeBSD 13.1.
>>> This leads one to wonder how come BF member dikkop is managing
>>> to run this test successfully. I speculate that it's using a
>>> filesystem type that doesn't do sparse files (cc'ing Vondra
>>> for confirmation on that).
>
>> It's running on ufs. But I think the explanation is very simple. We had
>> a short power outage on Thursday, and the FreeBSD machine failed to boot
>> properly after the power was restored. IIUC this test is new, right?
>
> Not that new, it dates to b15c15139, about a week ago.
>
> I've reproduced Thomas' failure on a local FreeBSD 15.0 image
> using zfs, and confirmed that this cowboy hack fixes it:
>
Interesting. Then I guess it has to be due to some difference in ufs vs.
zfs, when handling sparse files. It might be useful to add a bit more
variation here, and switch some of the animals to non-default
filesystems (not just the FreeBSD ones, which we seem to have only two
that run reasonably often). I'd bet most of the linux systems run on
ext4/xfs, few on btrfs/zfs.
regards
--
Tomas Vondra
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-29 22:11 Thomas Munro <[email protected]>
parent: Tomas Vondra <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-29 22:11 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Tom Lane <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, Mar 30, 2026 at 2:33 AM Tomas Vondra <[email protected]> wrote:
> On 3/29/26 00:12, Tom Lane wrote:
> > I've reproduced Thomas' failure on a local FreeBSD 15.0 image
> > using zfs, and confirmed that this cowboy hack fixes it:
> >
>
> Interesting. Then I guess it has to be due to some difference in ufs vs.
> zfs, when handling sparse files. It might be useful to add a bit more
> variation here, and switch some of the animals to non-default
> filesystems (not just the FreeBSD ones, which we seem to have only two
> that run reasonably often). I'd bet most of the linux systems run on
> ext4/xfs, few on btrfs/zfs.
UFS does have sparse files (its ancestor invented them some time
around (time_t) 0), it just doesn't make them unless you tell it to.
PostgreSQL only does that if you set wal_init_zero=false.
ZFS is different because it creates holes automagically when you write
zeroes, at least if compression is enabled so it has to scan all your
bytes anyway.
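One way to check whether a given WAL segment ended up sparse on a particular file system is to compare its allocated blocks with its apparent size. The sketch below assumes a POSIX system where st_blocks counts 512-byte units; the function names are hypothetical. Transparent compression can also shrink the block count, so a True result is only a hint that holes may be present.

```python
import os


def looks_sparse(apparent_size, allocated_blocks, block_unit=512):
    """True if fewer bytes are allocated than the file's apparent size."""
    return allocated_blocks * block_unit < apparent_size


def file_looks_sparse(path):
    """Apply the heuristic to a real file (st_blocks is not available on Windows)."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)
```

For example, a 16MB segment with only a few kilobytes allocated would be flagged, while a fully zero-filled segment on UFS would not.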
I was curious to know if BTRFS does that too, or hides
zero-compression at some lower invisible level:
$ echo "hello" > 1MB-sparse.dat
$ truncate -s 512KB 1MB-sparse.dat
$ echo "world" >> 1MB-sparse.dat
$ truncate -s 1MB 1MB-sparse.dat
$ ls -l 1MB-sparse.dat
-rw-rw-r-- 1 tmunro tmunro 1000000 Mar 30 10:11 1MB-sparse.dat
$ du -hs 1MB-sparse.dat
8.0K 1MB-sparse.dat
$ strace tar -S -cf foo.tar 1MB-sparse.dat 2>&1 | grep seek
lseek(4, 0, SEEK_DATA) = 0
lseek(4, 0, SEEK_HOLE) = 4096
lseek(4, 4096, SEEK_DATA) = 512000
lseek(4, 512000, SEEK_HOLE) = 516096
lseek(4, 516096, SEEK_DATA) = -1 ENXIO (No such device or address)
... so that's a yes, lseek sees holes that we didn't ask it to make,
just like on ZFS, but the rest of this trace of GNU tar -S -cf is
interesting:
lseek(5, 0, SEEK_SET) = 0
lseek(5, 0, SEEK_SET) = 0
lseek(4, 0, SEEK_SET) = 0
lseek(4, 512000, SEEK_SET) = 512000
lseek(4, 1000000, SEEK_SET) = 1000000
It didn't write out PAX format! Instead it replicated the holes into
the tar file itself with SEEK_SET.
$ strings foo.tar | grep Sparse
You have to add --format=posix to enable the GNU behaviour that BSD
tar is emulating by default:
$ tar --format=posix -S -cf foo.tar 1MB-sparse.dat
$ strings foo.tar | grep Sparse
./GNUSparseFile.4190/1MB-sparse.dat
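The SEEK_DATA/SEEK_HOLE probing visible in the strace output can be reproduced directly. This is a rough Python sketch, assuming a platform where os.SEEK_DATA and os.SEEK_HOLE exist (Linux and the BSDs do; the function name data_extents is made up here). It returns the byte ranges that the file system reports as data, which is the information tar uses to decide whether a file is sparse.

```python
import errno
import os


def data_extents(path):
    """Return [(start, end), ...] byte ranges reported as data by lseek()."""
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole between offset and EOF
                raise
            # End of file counts as an implicit hole, so this always succeeds.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

A file system that does not track holes will simply report one extent covering the whole file.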
I expected GNU tar to be forced to do that if writing to non-seekable
output, eg "tar -S -c 1MB-sparse.dat | cat > foo.tar", but somehow it
manages to write out only ~10KB of plain ustar format that it is able
to restore to the full 1MB apparent size using some other trick, but
... ENOTIME, I dunno how it's doing that. Might be interesting to see
if pg_waldump can read it though, 'cause the bytes aren't all there.
BTW I confirmed that Apple tar does have -S by default too, it's just
that APFS doesn't make holes magically, so this test would presumably
have broken on a Mac if wal_init_zero had been forced to zero (not
tested).
Anyway, given the defaults, GNU tar + ZFS/BTRFS users must be pretty
unlikely to hit this in the wild, and the symptom is a confusing error
in a maintenance tool, not corruption, so I don't think this is a big
deal. I might still try teaching the astreamer code to understand PAX
1.0 when it sees it in the next cycle though, for the benefit of
FreeBSD users. A quick and dirty version could probably just unmangle
the name and skip the first block of data, since any valid WAL file
will not begin with a hole and valid WAL data will end at the first
hole and fail our verification, but of course a real implementation
should read the map properly[1]...
[1] https://www.gnu.org/software/tar/manual/html_node/PAX-1.html
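To make the "read the map properly" part concrete, here is a small Python sketch of the PAX 1.0 sparse layout as described in the GNU tar manual: the member data starts with newline-delimited decimal numbers (an entry count, then offset/size pairs), padded with NULs to a 512-byte boundary, followed by the data extents back to back. All function names are hypothetical, and real_size stands in for the GNU.sparse.realsize value from the extended header. This is not astreamer code, just an illustration of the format.

```python
BLOCK = 512


def parse_pax_1_0_sparse_map(member_data):
    """Return (extents, data_start) from the front of a sparse member's data."""
    count = None
    numbers = []
    pos = 0
    while count is None or len(numbers) < 2 * count:
        newline = member_data.index(b"\n", pos)
        value = int(member_data[pos:newline])
        pos = newline + 1
        if count is None:
            count = value  # first number: how many (offset, size) pairs follow
        else:
            numbers.append(value)
    extents = list(zip(numbers[0::2], numbers[1::2]))
    data_start = (pos + BLOCK - 1) // BLOCK * BLOCK  # map is NUL-padded
    return extents, data_start


def expand_sparse_member(member_data, real_size):
    """Rebuild the full file contents, with holes filled with zeroes."""
    extents, cursor = parse_pax_1_0_sparse_map(member_data)
    out = bytearray(real_size)
    for offset, length in extents:
        out[offset:offset + length] = member_data[cursor:cursor + length]
        cursor += length
    return bytes(out)


def unmangle_sparse_name(name):
    """Drop the GNUSparseFile.<pid> component that tar inserts into the name."""
    parts = name.split("/")
    return "/".join(p for p in parts if not p.startswith("GNUSparseFile."))
```

A proper implementation would take the name from GNU.sparse.name in the extended header instead of unmangling the path, but the path trick matches the quick and dirty approach described above.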
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-03-29 22:20 Thomas Munro <[email protected]>
parent: Thomas Munro <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-03-29 22:20 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Tom Lane <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, Mar 30, 2026 at 11:11 AM Thomas Munro <[email protected]> wrote:
> ... so that's a yes, lseek sees holes that we didn't ask it to make,
Oops, sorry, I wrote that email too fast and got my examples mixed up.
BTRFS actually *doesn't* do that automatically; that was of course a
trace showing a file with explicitly made holes. So this is probably
a ZFS-only issue unless you're using wal_init_zero=0, and then any
file system could result in PAX-sparse-format tarballs, but even then
only if you use non-default switches that in practice no one will use
with GNU tar, or if you use BSD tar. So in practice this is a
FreeBSD-only issue.
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-01 02:05 Thomas Munro <[email protected]>
parent: Thomas Munro <[email protected]>
1 sibling, 2 replies; 85+ messages in thread
From: Thomas Munro @ 2026-04-01 02:05 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <[email protected]> wrote:
> Thomas Munro <[email protected]> writes:
> > Anyway, given the defaults, GNU tar + ZFS/BTRFS users must be pretty
> > unlikely to hit this in the wild, and the symptom is a confusing error
> > in a maintenance tool, not corruption, so I don't think this is a big
> > deal. I might still try teaching the astreamer code to understand PAX
> > 1.0 when it sees it in the next cycle though, for the benefit of
> > FreeBSD users.
>
> I agree that this isn't too critical if the effects are confined to
> pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
> astreamer_tar.c, but it's not clear to me if they'd ever be asked to
> parse files made by tar(1) and not by our own sparseness-ignorant
> tar-writing code. If they can be, that'd be a higher-priority reason
> to fill in this gap.
I pushed the workaround for the test.
Yeah I can't see any reason why pg_verifybackup --wal-path=foo.tar
won't suffer the same problem in the wild. Again, it's not the end of
the world because it'll just fail and you'll probably eventually
figure out why. So perhaps we should just improve our detection of
archives that we can't handle? Straw man algorithm:
If you can't find $NAME in the archive, then check if PaxHeaders/$NAME
exists, and if so, fail with 'unsupported TAR format for WAL file "%s"
in archive "%s"' instead. That'd probably work well enough in
practice, because astreamer_tar.c treats PAX extended header
pseudo-files as regular files (they're not, they have type 'x'), and
both GNU and BSD tar happen to use that.
POSIX doesn't require that naming, so it would in theory be more
correct to teach astreamer_tar.c to recognise PAX extended headers and
fish out enough information and link it to the following archive
member, but a simple test to improve error messaging seems like the
right level of effort here.
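The straw man algorithm can be sketched in a few lines. This Python version works on a plain list of member names as they appear in the archive's headers. The exception classes and the function name are invented for illustration, and matching any path component that starts with "PaxHeader" is an assumption meant to cover both the PaxHeader/ prefix seen in the FreeBSD debug output and the PaxHeaders/ spelling mentioned above.

```python
class WalNotFound(Exception):
    pass


class UnsupportedTarFormat(Exception):
    pass


def find_wal_member(member_names, wal_name, archive_name):
    """Return wal_name if present; otherwise raise the most helpful error."""
    if wal_name in member_names:
        return wal_name
    for name in member_names:
        parts = name.split("/")
        # A PAX extended header pseudo-file for the segment suggests that the
        # real member exists but is stored in a format we cannot decode.
        if parts[-1] == wal_name and any(
            p.startswith("PaxHeader") for p in parts[:-1]
        ):
            raise UnsupportedTarFormat(
                'unsupported TAR format for WAL file "%s" in archive "%s"'
                % (wal_name, archive_name)
            )
    raise WalNotFound(
        'could not find WAL "%s" in archive "%s"' % (wal_name, archive_name)
    )
```

With the member list from the failing FreeBSD archive, segment 000000010000000000000002 would produce the "unsupported TAR format" error instead of the misleading "could not find WAL" one.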
Here's a test patch that shows the problem on any system with GNU tar
or BSD tar and a file system that supports sparse files. The test
succeeds because it looks for "error: could not find WAL" but the idea
would be to change it to look for a new error message like that. My
motivation was to make this reproducible on any system, in case that's
helpful for Amul and Andrew if they're interested in trying to improve
this edge case in time for the release. Otherwise I'll come back to
it, but probably not in time...
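The attached test makes a hole by truncating the final WAL file at the end LSN's offset within its segment and then re-extending it. The patch gets that offset by stripping the slash from the LSN and hex-parsing the result, which appears to rely on the low half being zero-padded to eight hex digits. A sketch of the same arithmetic that does not depend on padding, with a hypothetical function name:

```python
def lsn_to_segment_offset(lsn, wal_segment_size):
    """Byte offset of an LSN such as '0/03093BD8' within its WAL segment."""
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)
    return position % wal_segment_size
```

For the default 16MB segment size, an LSN of 0/03093BD8 falls 0x093BD8 bytes into segment 000000010000000000000003.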
Attachments:
[text/x-patch] 0001-Add-a-pg_waldump-test-with-GNU-tar-PAX-format.patch (4.2K, 2-0001-Add-a-pg_waldump-test-with-GNU-tar-PAX-format.patch)
From 084d71f81143f0462caf03569722b5f0b2a147e6 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Mon, 30 Mar 2026 18:20:09 +1300
Subject: [PATCH] Add a pg_waldump test with GNU tar PAX format.
XXX Update this to test for a new improved error message!
XXX Should this test run for all the scenarios? Doesn't seem like
compression is relevant to this problem so I just added it as a
standalone test...
XXX No doubt the perl isn't the greatest...
---
src/bin/pg_waldump/t/001_basic.pl | 73 +++++++++++++++++++++++++++++--
1 file changed, 69 insertions(+), 4 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index ce1f6aa30c0..7f8a319c85d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -6,6 +6,7 @@ use warnings FATAL => 'all';
use Cwd;
use File::Copy;
use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::RecursiveCopy;
use PostgreSQL::Test::Utils;
use Test::More;
use List::Util qw(shuffle);
@@ -212,9 +213,13 @@ $node->safe_psql('postgres',
qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))}
);
-my ($end_lsn, $end_walfile) = split /\|/,
+my ($end_lsn, $end_walfile, $wal_segsize) = split /\|/,
$node->safe_psql('postgres',
- q{SELECT pg_current_wal_insert_lsn(), pg_walfile_name(pg_current_wal_insert_lsn())}
+ q{SELECT pg_current_wal_insert_lsn(),
+ pg_walfile_name(pg_current_wal_insert_lsn()),
+ setting
+ FROM pg_settings
+ WHERE name = 'wal_segment_size'}
);
my $default_ts_oid = $node->safe_psql('postgres',
@@ -339,7 +344,7 @@ sub test_pg_waldump
# Create a tar archive, shuffle the file order
sub generate_archive
{
- my ($archive, $directory, $compression_flags) = @_;
+ my ($archive, $directory, $compression_flags, @extra_flags) = @_;
my @files;
opendir my $dh, $directory or die "opendir: $!";
@@ -350,12 +355,17 @@ sub generate_archive
}
closedir $dh;
+ if (!@extra_flags)
+ {
+ @extra_flags = @tar_c_flags;
+ }
+
@files = shuffle @files;
# move into the WAL directory before archiving files
my $cwd = getcwd;
chdir($directory) || die "chdir: $!";
- command_ok([$tar, @tar_c_flags, $compression_flags, $archive, @files]);
+ command_ok([$tar, @extra_flags, $compression_flags, $archive, @files]);
chdir($cwd) || die "chdir: $!";
}
@@ -477,4 +487,59 @@ for my $scenario (@scenarios)
}
}
+SKIP:
+ skip "tar command is not available", 1
+ if !defined $tar;
+
+ my @sparse_flags;
+
+ # Tell $TAR to use GNU tar's PAX sparse file archive format, so we can test
+ # our handling of that.
+
+ # GNU tar
+ @sparse_flags = ("--sparse", "--format=pax")
+ if system("$tar --sparse --format=pax -c " .
+ $node->data_dir . "/pg_wal/* /dev/null > /dev/null") == 0;
+ # BSD tar (this is the default, but we still need to detect BSD tar)
+ @sparse_flags = ("--read-sparse", "--format=pax")
+ if system("$tar --read-sparse --format=pax -c " .
+ $node->data_dir . "/pg_wal/* /dev/null > /dev/null") == 0;
+
+ skip "tar command doesn't support GNU PAX format for sparse files", 1
+ if !@sparse_flags;
+
+ PostgreSQL::Test::RecursiveCopy::copypath($node->data_dir . '/pg_wal',
+ $tmp_dir . '/pg_wal_sparse');
+
+ # truncate the unused part of final WAL file
+ my $end_byte = $end_lsn;
+ $end_byte =~ s/\///;
+ $end_byte = hex($end_byte);
+ $end_byte %= $wal_segsize;
+ truncate $tmp_dir . '/pg_wal_sparse/' . $end_walfile, $end_byte;
+
+ # now re-extend it to create a hole
+ truncate $tmp_dir . '/pg_wal_sparse/' . $end_walfile, $wal_segsize;
+
+ # XXX maybe we should detect sparse files with stat (size > blocks * block
+ # size?), and skip the test if truncate failed to make one... that
+ # might happen on eg windows I think? otherwise we'd have to tolerate
+ # the pg_waldump command succeeding OR failing with a certain message
+
+ generate_archive($tmp_dir . '/pg_wal_sparse.tar',
+ $tmp_dir . '/pg_wal_sparse',
+ '-cf',
+ @sparse_flags);
+
+ # XXX change this to check for new improved error message
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $tmp_dir . '/pg_wal_sparse.tar',
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/error: could not find WAL/,
+ 'fails with GNU tar PAX-format sparse files');
+
done_testing();
--
2.53.0
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-01 10:39 Andrew Dunstan <[email protected]>
parent: Thomas Munro <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Andrew Dunstan @ 2026-04-01 10:39 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-03-31 Tu 10:05 PM, Thomas Munro wrote:
> On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <[email protected]> wrote:
>> Thomas Munro <[email protected]> writes:
>>> Anyway, given the defaults, GNU tar + ZFS/BTRFS users must be pretty
>>> unlikely to hit this in the wild, and the symptom is a confusing error
>>> in a maintenance tool, not corruption, so I don't think this is a big
>>> deal. I might still try teaching the astreamer code to understand PAX
>>> 1.0 when it sees it in the next cycle though, for the benefit of
>>> FreeBSD users.
>> I agree that this isn't too critical if the effects are confined to
>> pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
>> astreamer_tar.c, but it's not clear to me if they'd ever be asked to
>> parse files made by tar(1) and not by our own sparseness-ignorant
>> tar-writing code. If they can be, that'd be a higher-priority reason
>> to fill in this gap.
> I pushed the workaround for the test.
It occurred to me this morning that we probably shouldn't run this test
on Windows, and if we do we shouldn't be using /dev/null (the Windows
equivalent of which is just "nul"). The simplest fix would just be to
add a "!$windows_os" to the if test.
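
For illustration only, the platform difference mentioned above can be sketched in C: pick the null device name per platform and build a probe command of the kind the test runs. Both helper names are hypothetical, and this is a sketch under those assumptions rather than code from the tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: name of the platform's null device. */
static const char *
null_device_path(void)
{
#ifdef _WIN32
    return "nul";
#else
    return "/dev/null";
#endif
}

/*
 * Hypothetical helper: format
 *   <tar> --no-read-sparse -c - <nulldev> > <nulldev>
 * into buf.  Returns 0 on success, -1 if the buffer is too small.
 */
static int
build_probe_command(char *buf, size_t buflen, const char *tar)
{
    const char *devnull = null_device_path();
    int         n;

    assert(buf != NULL && tar != NULL);
    n = snprintf(buf, buflen, "%s --no-read-sparse -c - %s > %s",
                 tar, devnull, devnull);
    return (n >= 0 && (size_t) n < buflen) ? 0 : -1;
}
```

The resulting string would be handed to system(), and only its exit status matters: zero means the tar binary accepts the flag.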
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-01 13:26 Andres Freund <[email protected]>
parent: Andrew Dunstan <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Andres Freund @ 2026-04-01 13:26 UTC (permalink / raw)
To: Andrew Dunstan <[email protected]>; +Cc: Thomas Munro <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On 2026-04-01 06:39:05 -0400, Andrew Dunstan wrote:
>
> On 2026-03-31 Tu 10:05 PM, Thomas Munro wrote:
> > On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <[email protected]> wrote:
> > > Thomas Munro <[email protected]> writes:
> > > > Anyway, given the defaults, GNU tar + ZFS/BTRFS users must be pretty
> > > > unlikely to hit this in the wild, and the symptom is a confusing error
> > > > in a maintenance tool, not corruption, so I don't think this is a big
> > > > deal. I might still try teaching the astreamer code to understand PAX
> > > > 1.0 when it sees it in the next cycle though, for the benefit of
> > > > FreeBSD users.
> > > I agree that this isn't too critical if the effects are confined to
> > > pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
> > > astreamer_tar.c, but it's not clear to me if they'd ever be asked to
> > > parse files made by tar(1) and not by our own sparseness-ignorant
> > > tar-writing code. If they can be, that'd be a higher-priority reason
> > > to fill in this gap.
> > I pushed the workaround for the test.
>
>
> It occurred to me this morning that we probably shouldn't run this test on
> Windows, and if we do we shouldn't be using /dev/null (the Windows
> equivalent of which is just "nul"). The simplest fix would just be to add a
> "!$windows_os" to the if test.
Why should we skip this test on windows?
I think we have historically been way too liberal about sprinkling
!$windows_os test disablements around. More than once there were actual bugs
that we just swept under the rug by disabling the tests that detected them.
Either we support windows or we don't.
Greetings,
Andres Freund
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-01 14:19 Andrew Dunstan <[email protected]>
parent: Andres Freund <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Andrew Dunstan @ 2026-04-01 14:19 UTC (permalink / raw)
To: Andres Freund <[email protected]>; +Cc: Thomas Munro <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-04-01 We 9:26 AM, Andres Freund wrote:
> Hi,
>
> On 2026-04-01 06:39:05 -0400, Andrew Dunstan wrote:
>> On 2026-03-31 Tu 10:05 PM, Thomas Munro wrote:
>>> On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <[email protected]> wrote:
>>>> Thomas Munro <[email protected]> writes:
>>>>> Anyway, given the defaults, GNU tar + ZFS/BTRFS users must be pretty
>>>>> unlikely to hit this in the wild, and the symptom is a confusing error
>>>>> in a maintenance tool, not corruption, so I don't think this is a big
>>>>> deal. I might still try teaching the astreamer code to understand PAX
>>>>> 1.0 when it sees it in the next cycle though, for the benefit of
>>>>> FreeBSD users.
>>>> I agree that this isn't too critical if the effects are confined to
>>>> pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
>>>> astreamer_tar.c, but it's not clear to me if they'd ever be asked to
>>>> parse files made by tar(1) and not by our own sparseness-ignorant
>>>> tar-writing code. If they can be, that'd be a higher-priority reason
>>>> to fill in this gap.
>>> I pushed the workaround for the test.
>>
>> It occurred to me this morning that we probably shouldn't run this test on
>> Windows, and if we do we shouldn't be using /dev/null (the Windows
>> equivalent of which is just "nul"). The simplest fix would just be to add a
>> "!$windows_os" to the if test.
> Why should we skip this test on windows?
>
> I think we have historically been way too liberal about sprinkling
> !$windows_os test disablements around. More than once there were actual bugs
> that we just swept under the rug by disabling the tests that detected them.
> Either we support windows or we don't.
>
Maybe I misunderstood, but I didn't think this was going to be an issue
on NTFS.
In general I agree with you, though. I try to avoid skipping things on
Windows.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-01 18:25 Tom Lane <[email protected]>
parent: Thomas Munro <[email protected]>
1 sibling, 2 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-01 18:25 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> On Mon, Mar 30, 2026 at 11:23 AM Tom Lane <[email protected]> wrote:
>> I agree that this isn't too critical if the effects are confined to
>> pg_waldump. I believe that pg_basebackup and pg_verifybackup also use
>> astreamer_tar.c, but it's not clear to me if they'd ever be asked to
>> parse files made by tar(1) and not by our own sparseness-ignorant
>> tar-writing code. If they can be, that'd be a higher-priority reason
>> to fill in this gap.
> Yeah I can't see any reason why pg_verifybackup --wal-path=foo.tar
> won't suffer the same problem in the wild. Again, it's not the end of
> the world because it'll just fail and you'll probably eventually
> figure out why. So perhaps we should just improve our detection of
> archives that we can't handle?
After reading the POSIX spec for pax format (in the pax(1) man page),
I think it's absolutely essential that we reject files that contain
pax extension headers. Those can change the interpretation of the
following file header(s) in nearly arbitrary ways, so we have plenty
of problems besides this sparse-file issue if we just ignore them.
(Of course, later we can consider improving the code to handle them
correctly, but that ain't happening in time for v19.)
Also, if we are admitting the possibility that what we are reading
was made by a platform-supplied tar and not our own code, I think
it verges on lunacy to behave as though unsupported typeflags are
regular files.
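
As a self-contained sketch of the policy described above (not the patch itself), the typeflag byte of a 512-byte ustar header can be classified so that only '0' and NUL count as regular files, pax extension headers are singled out for rejection, and everything else is treated as "special" rather than silently read as file data. The enum and function names are hypothetical. The offset 156 is the typeflag position in the ustar header layout.

```c
#include <assert.h>
#include <stddef.h>

#define TAR_BLOCK_SIZE      512
#define TAR_OFFSET_TYPEFLAG 156     /* typeflag byte in a ustar header */

/* Hypothetical classification of a tar member by its typeflag. */
typedef enum
{
    MEMBER_REGULAR,
    MEMBER_DIRECTORY,
    MEMBER_SYMLINK,
    MEMBER_PAX_EXTENSION,   /* caller should reject the archive */
    MEMBER_SPECIAL          /* anything else: never treat as regular */
} member_kind;

static member_kind
classify_typeflag(const char *header)
{
    assert(header != NULL);

    switch (header[TAR_OFFSET_TYPEFLAG])
    {
        case '0':
        case '\0':          /* old-style archives use NUL for plain files */
            return MEMBER_REGULAR;
        case '5':
            return MEMBER_DIRECTORY;
        case '2':
            return MEMBER_SYMLINK;
        case 'x':           /* pax extended header for the next member */
        case 'g':           /* pax global extended header */
            return MEMBER_PAX_EXTENSION;
        default:
            return MEMBER_SPECIAL;
    }
}
```

A caller would raise a fatal error on MEMBER_PAX_EXTENSION and skip MEMBER_SPECIAL members, so a hard link or device node is never mistaken for WAL data.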
So I think we need something more or less like the attached.
regards, tom lane
Attachments:
[text/x-diff] v1-tighten-tar-typeflag-handling.patch (5.3K, 2-v1-tighten-tar-typeflag-handling.patch)
diff --git a/src/bin/pg_basebackup/astreamer_inject.c b/src/bin/pg_basebackup/astreamer_inject.c
index 14a96733bad..a5dff0ac1c6 100644
--- a/src/bin/pg_basebackup/astreamer_inject.c
+++ b/src/bin/pg_basebackup/astreamer_inject.c
@@ -224,8 +224,9 @@ astreamer_inject_file(astreamer *streamer, char *pathname, char *data,
strlcpy(member.pathname, pathname, MAXPGPATH);
member.size = len;
member.mode = pg_file_create_mode;
+ member.is_regular = true;
member.is_directory = false;
- member.is_link = false;
+ member.is_symlink = false;
member.linktarget[0] = '\0';
/*
diff --git a/src/bin/pg_verifybackup/astreamer_verify.c b/src/bin/pg_verifybackup/astreamer_verify.c
index 26c98186530..99bd69ce7c9 100644
--- a/src/bin/pg_verifybackup/astreamer_verify.c
+++ b/src/bin/pg_verifybackup/astreamer_verify.c
@@ -165,7 +165,7 @@ member_verify_header(astreamer *streamer, astreamer_member *member)
char pathname[MAXPGPATH];
/* We are only interested in normal files. */
- if (member->is_directory || member->is_link)
+ if (!member->is_regular)
return;
/*
diff --git a/src/bin/pg_waldump/archive_waldump.c b/src/bin/pg_waldump/archive_waldump.c
index ff1cc3fa5f0..e4a4bf44a7e 100644
--- a/src/bin/pg_waldump/archive_waldump.c
+++ b/src/bin/pg_waldump/archive_waldump.c
@@ -815,7 +815,7 @@ member_is_wal_file(astreamer_waldump *mystreamer, astreamer_member *member,
char *filename;
/* We are only interested in normal files */
- if (member->is_directory || member->is_link)
+ if (!member->is_regular)
return false;
if (strlen(member->pathname) < XLOG_FNAME_LEN)
diff --git a/src/fe_utils/astreamer_file.c b/src/fe_utils/astreamer_file.c
index 158e9a14f2c..0fca70a4f86 100644
--- a/src/fe_utils/astreamer_file.c
+++ b/src/fe_utils/astreamer_file.c
@@ -228,9 +228,13 @@ astreamer_extractor_content(astreamer *streamer, astreamer_member *member,
mystreamer->filename[fnamelen - 1] = '\0';
/* Dispatch based on file type. */
- if (member->is_directory)
+ if (member->is_regular)
+ mystreamer->file =
+ create_file_for_extract(mystreamer->filename,
+ member->mode);
+ else if (member->is_directory)
extract_directory(mystreamer->filename, member->mode);
- else if (member->is_link)
+ else if (member->is_symlink)
{
const char *linktarget = member->linktarget;
@@ -238,10 +242,6 @@ astreamer_extractor_content(astreamer *streamer, astreamer_member *member,
linktarget = mystreamer->link_map(linktarget);
extract_link(mystreamer->filename, linktarget);
}
- else
- mystreamer->file =
- create_file_for_extract(mystreamer->filename,
- member->mode);
/* Report output file change. */
if (mystreamer->report_output_file)
diff --git a/src/fe_utils/astreamer_tar.c b/src/fe_utils/astreamer_tar.c
index 3b094fc0328..3de6f8d6fb2 100644
--- a/src/fe_utils/astreamer_tar.c
+++ b/src/fe_utils/astreamer_tar.c
@@ -272,6 +272,9 @@ astreamer_tar_header(astreamer_tar_parser *mystreamer)
Assert(mystreamer->base.bbs_buffer.len == TAR_BLOCK_SIZE);
+ /* Zero out fields of *member, just for consistency. */
+ memset(member, 0, sizeof(astreamer_member));
+
/* Check whether we've got a block of all zero bytes. */
for (i = 0; i < TAR_BLOCK_SIZE; ++i)
{
@@ -299,12 +302,28 @@ astreamer_tar_header(astreamer_tar_parser *mystreamer)
member->mode = read_tar_number(&buffer[TAR_OFFSET_MODE], 8);
member->uid = read_tar_number(&buffer[TAR_OFFSET_UID], 8);
member->gid = read_tar_number(&buffer[TAR_OFFSET_GID], 8);
- member->is_directory =
- (buffer[TAR_OFFSET_TYPEFLAG] == TAR_FILETYPE_DIRECTORY);
- member->is_link =
- (buffer[TAR_OFFSET_TYPEFLAG] == TAR_FILETYPE_SYMLINK);
- if (member->is_link)
- strlcpy(member->linktarget, &buffer[TAR_OFFSET_LINKNAME], 100);
+
+ switch (buffer[TAR_OFFSET_TYPEFLAG])
+ {
+ case TAR_FILETYPE_PLAIN:
+ case '\0': /* backwards compatibility hack, per POSIX */
+ member->is_regular = true;
+ break;
+ case TAR_FILETYPE_DIRECTORY:
+ member->is_directory = true;
+ break;
+ case TAR_FILETYPE_SYMLINK:
+ member->is_symlink = true;
+ strlcpy(member->linktarget, &buffer[TAR_OFFSET_LINKNAME], 100);
+ break;
+ case TAR_FILETYPE_PAX_EXTENDED:
+ case TAR_FILETYPE_PAX_EXTENDED_GLOBAL:
+ pg_fatal("pax extensions to tar format are not supported");
+ break;
+ default:
+ /* For special files, set none of the three is_xxx flags */
+ break;
+ }
/* Compute number of padding bytes. */
mystreamer->pad_bytes_expected = tarPaddingBytesRequired(member->size);
diff --git a/src/include/fe_utils/astreamer.h b/src/include/fe_utils/astreamer.h
index f370ef62720..8329e4efbc5 100644
--- a/src/include/fe_utils/astreamer.h
+++ b/src/include/fe_utils/astreamer.h
@@ -83,8 +83,10 @@ typedef struct
mode_t mode;
uid_t uid;
gid_t gid;
+ /* note: special filetypes will set none of these flags */
+ bool is_regular;
bool is_directory;
- bool is_link;
+ bool is_symlink;
char linktarget[MAXPGPATH];
} astreamer_member;
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index eb93bdef5c4..d6c434e09b0 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -60,6 +60,8 @@ enum tarFileType
TAR_FILETYPE_PLAIN = '0',
TAR_FILETYPE_SYMLINK = '2',
TAR_FILETYPE_DIRECTORY = '5',
+ TAR_FILETYPE_PAX_EXTENDED = 'x',
+ TAR_FILETYPE_PAX_EXTENDED_GLOBAL = 'g',
};
extern enum tarError tarCreateHeader(char *h, const char *filename,
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 00:16 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-02 00:16 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Thu, Apr 2, 2026 at 7:25 AM Tom Lane <[email protected]> wrote:
> Also, if we are admitting the possibility that what we are reading
> was made by a platform-supplied tar and not our own code, I think
> it verges on lunacy to behave as though unsupported typeflags are
> regular files.
Yeah, if this is the first time we parse files we didn't make then
that makes total sense. I was a bit unsure of that question when I
suggested we reject pax only after we've failed to find a file, in
case there are scenarios that work today with harmless ignorable pax
headers that don't change the file name.
> So I think we need something more or less like the attached.
LGTM. Tested with both tars here. I updated that little test patch
for this. Not sure if you think it's worth a test though, now that
it's so simple.
@Andrew: I tried using File::Spec->devnull() this time. Are you able
to check if this works OK on Windows, applied on top of Tom's patch?
AFAIK should be able to run this new test and pass, not skip it. But
it could be that the shell invocation needs tweaking. It's hard to
tell from CI. (Huh, apparently Windows ships a copy of BSD tar as
C:\Windows\System32\tar.exe these days.)
Attachments:
[text/x-patch] 0001-Test-rejection-of-pax-extended-tar-files.patch (3.0K, 2-0001-Test-rejection-of-pax-extended-tar-files.patch)
From 57b9d1a58a9eb30feec72b0ccce6bdfd5e1ef92e Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Mon, 30 Mar 2026 18:20:09 +1300
Subject: [PATCH] Test rejection of pax extended tar files.
Also change 852de579's shell invocation to use NUL on Windows instead of
/dev/null, per feedback from Andrew Dunstan. That command wasn't
expected to succeed on Windows, but the similar usage added here might.
---
src/bin/pg_waldump/t/001_basic.pl | 42 +++++++++++++++++++++++++------
1 file changed, 35 insertions(+), 7 deletions(-)
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index ce1f6aa30c0..3b99a8f6d4f 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -5,6 +5,7 @@ use strict;
use warnings FATAL => 'all';
use Cwd;
use File::Copy;
+use File::Spec;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;
@@ -13,12 +14,14 @@ use List::Util qw(shuffle);
my $tar = $ENV{TAR};
my @tar_c_flags;
-# By default, bsdtar archives sparse files in GNU tar's --format=posix --sparse
-# format, so pg_waldump can't find files that ZFS has decided to store with
-# holes. Turn that off.
-if (system("$tar --no-read-sparse -c - /dev/null > /dev/null") == 0)
+my $devnull = File::Spec->devnull();
+
+# By default, bsdtar archives sparse files in GNU tar's --format=pax --sparse
+# format, so pg_waldump rejects WAL that ZFS has decided to store with holes.
+# Turn that off.
+if (system("$tar --no-read-sparse -c $devnull > $devnull") == 0)
{
- push(@tar_c_flags, "--no-read-sparse");
+ push(@tar_c_flags, "--no-read-sparse");
}
program_help_ok('pg_waldump');
@@ -339,7 +342,7 @@ sub test_pg_waldump
# Create a tar archive, shuffle the file order
sub generate_archive
{
- my ($archive, $directory, $compression_flags) = @_;
+ my ($archive, $directory, $compression_flags, @extra_flags) = @_;
my @files;
opendir my $dh, $directory or die "opendir: $!";
@@ -355,7 +358,7 @@ sub generate_archive
# move into the WAL directory before archiving files
my $cwd = getcwd;
chdir($directory) || die "chdir: $!";
- command_ok([$tar, @tar_c_flags, $compression_flags, $archive, @files]);
+ command_ok([$tar, @extra_flags, @tar_c_flags, $compression_flags, $archive, @files]);
chdir($cwd) || die "chdir: $!";
}
@@ -477,4 +480,29 @@ for my $scenario (@scenarios)
}
}
+SKIP:
+{
+ skip "tar command is not available", 1
+ if !defined $tar;
+
+ skip "tar command doesn't understand --format=pax", 1
+ if system("$tar --format=pax -c " .
+ $node->data_dir . "/pg_wal/* > $devnull") != 0;
+
+ generate_archive($tmp_dir . '/pg_wal_pax.tar',
+ $node->data_dir . '/pg_wal',
+ '-cf',
+ ("--format=pax"));
+
+ command_fails_like(
+ [
+ 'pg_waldump',
+ '--path' => $tmp_dir . '/pg_wal_pax.tar',
+ '--start' => $start_lsn,
+ '--end' => $end_lsn,
+ ],
+ qr/error: pax extensions to tar format are not supported/,
+ 'fails if pax extended header is detected');
+}
+
done_testing();
--
2.52.0
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 01:22 Tom Lane <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Tom Lane @ 2026-04-02 01:22 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> On Thu, Apr 2, 2026 at 7:25 AM Tom Lane <[email protected]> wrote:
>> Also, if we are admitting the possibility that what we are reading
>> was made by a platform-supplied tar and not our own code, I think
>> it verges on lunacy to behave as though unsupported typeflags are
>> regular files.
> Yeah, if this is the first time we parse files we didn't make then
> that makes total sense. I was a bit unsure of that question when I
> suggested we reject pax only after we've failed to find a file, in
> case there are scenarios that work today with harmless ignorable pax
> headers that don't change the file name.
Of course this is all new so far as pg_waldump is concerned.
I'm a bit unclear about whether pg_verifybackup's exposure
is large enough to warrant back-patching any of this.
Looking again at astreamer_tar.c, I suddenly realized that it doesn't
do any meaningful input validation. So if you feed it junk input,
you get garbage errors that aren't even predictable:
$ head -c 32768 /dev/urandom >junk.tar
$ pg_waldump --path junk.tar --start 1/0x10000000
pg_waldump: error: COPY stream ended before last file was finished
$ head -c 32768 /dev/urandom >junk.tar
$ pg_waldump --path junk.tar --start 1/0x10000000
pg_waldump: error: tar member has empty name
$ head -c 32768 /dev/urandom >junk.tar
$ pg_waldump --path junk.tar --start 1/0x10000000
pg_waldump: error: COPY stream ended before last file was finished
$ head -c 32768 /dev/urandom >junk.tar
$ pg_waldump --path junk.tar --start 1/0x10000000
pg_waldump: error: could not find WAL in archive "junk.tar"
tar itself is considerably saner:
$ tar tf junk.tar
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
So I think we need something like the attached, in addition
to what I sent before. This just makes astreamer_tar.c use
the isValidTarHeader function that pg_dump already had.
(I decided to const-ify isValidTarHeader's argument while
moving it to a shared location, which in turn requires
const-ifying tarChecksum.)
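
To make the kind of check involved concrete, here is a standalone sketch of tar header validation. It sums all 512 header bytes with the checksum field counted as eight spaces, compares that against the octal value stored in the field, and then looks at the magic. The helper names are hypothetical and this is deliberately simpler than the real function: it accepts any magic starting with "ustar" and parses only plain octal numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TAR_BLOCK_SIZE      512
#define TAR_OFFSET_CHECKSUM 148
#define TAR_CHECKSUM_LEN    8
#define TAR_OFFSET_MAGIC    257

/* Sum of all header bytes, treating the checksum field as eight spaces. */
static int
header_checksum(const char *header)
{
    int         sum = TAR_CHECKSUM_LEN * ' ';

    for (int i = 0; i < TAR_BLOCK_SIZE; i++)
    {
        if (i < TAR_OFFSET_CHECKSUM ||
            i >= TAR_OFFSET_CHECKSUM + TAR_CHECKSUM_LEN)
            sum += (unsigned char) header[i];
    }
    return sum;
}

/* Parse an octal field: skip leading spaces, stop at the first non-digit. */
static int
parse_octal(const char *s, int len)
{
    int         val = 0;
    int         i = 0;

    while (i < len && s[i] == ' ')
        i++;
    for (; i < len && s[i] >= '0' && s[i] <= '7'; i++)
        val = val * 8 + (s[i] - '0');
    return val;
}

/* Simplified validity test: checksum must match and magic must be ustar. */
static bool
header_looks_valid(const char *header)
{
    assert(header != NULL);

    if (parse_octal(&header[TAR_OFFSET_CHECKSUM], TAR_CHECKSUM_LEN) !=
        header_checksum(header))
        return false;
    return memcmp(&header[TAR_OFFSET_MAGIC], "ustar", 5) == 0;
}

/* Demo helper: write the checksum as six octal digits, NUL, space. */
static void
store_checksum(char *header)
{
    unsigned int sum = (unsigned int) header_checksum(header);

    for (int i = 5; i >= 0; i--)
    {
        header[TAR_OFFSET_CHECKSUM + i] = (char) ('0' + (sum & 7));
        sum >>= 3;
    }
    header[TAR_OFFSET_CHECKSUM + 6] = '\0';
    header[TAR_OFFSET_CHECKSUM + 7] = ' ';
}
```

Random bytes fail this test with very high probability, since both the checksum and the magic must line up, which is what turns unpredictable downstream errors into one consistent message.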
regards, tom lane
Attachments:
[text/x-diff] v1-validate-tar-headers.patch (5.1K, 2-v1-validate-tar-headers.patch)
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 271a2c3e481..fecf6f2d1ce 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -44,6 +44,7 @@
#include "pg_backup_archiver.h"
#include "pg_backup_db.h"
#include "pg_backup_utils.h"
+#include "pgtar.h"
#define TEXT_DUMP_HEADER "--\n-- PostgreSQL database dump\n--\n\n"
#define TEXT_DUMPALL_HEADER "--\n-- PostgreSQL database cluster dump\n--\n\n"
@@ -2372,7 +2373,7 @@ _discoverArchiveFormat(ArchiveHandle *AH)
}
if (!isValidTarHeader(AH->lookahead))
- pg_fatal("input file does not appear to be a valid archive");
+ pg_fatal("input file does not appear to be a valid tar archive");
AH->format = archTar;
}
diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h
index 365073b3eae..9c3aca6543a 100644
--- a/src/bin/pg_dump/pg_backup_archiver.h
+++ b/src/bin/pg_dump/pg_backup_archiver.h
@@ -465,8 +465,6 @@ extern void InitArchiveFmt_Null(ArchiveHandle *AH);
extern void InitArchiveFmt_Directory(ArchiveHandle *AH);
extern void InitArchiveFmt_Tar(ArchiveHandle *AH);
-extern bool isValidTarHeader(char *header);
-
extern void ReconnectToServer(ArchiveHandle *AH, const char *dbname);
extern void IssueCommandPerBlob(ArchiveHandle *AH, TocEntry *te,
const char *cmdBegin, const char *cmdEnd);
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index d94d0de2a5d..a3879410c94 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -984,31 +984,6 @@ tarPrintf(TAR_MEMBER *th, const char *fmt,...)
return (int) cnt;
}
-bool
-isValidTarHeader(char *header)
-{
- int sum;
- int chk = tarChecksum(header);
-
- sum = read_tar_number(&header[TAR_OFFSET_CHECKSUM], 8);
-
- if (sum != chk)
- return false;
-
- /* POSIX tar format */
- if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar\0", 6) == 0 &&
- memcmp(&header[TAR_OFFSET_VERSION], "00", 2) == 0)
- return true;
- /* GNU tar format */
- if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar \0", 8) == 0)
- return true;
- /* not-quite-POSIX format written by pre-9.3 pg_dump */
- if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar00\0", 8) == 0)
- return true;
-
- return false;
-}
-
/* Given the member, write the TAR header & copy the file */
static void
_tarAddFile(ArchiveHandle *AH, TAR_MEMBER *th)
diff --git a/src/fe_utils/astreamer_tar.c b/src/fe_utils/astreamer_tar.c
index 3b094fc0328..8fa3329237d 100644
--- a/src/fe_utils/astreamer_tar.c
+++ b/src/fe_utils/astreamer_tar.c
@@ -260,7 +260,8 @@ astreamer_tar_parser_content(astreamer *streamer, astreamer_member *member,
* Parse a file header within a tar stream.
*
* The return value is true if we found a file header and passed it on to the
- * next astreamer; it is false if we have reached the archive trailer.
+ * next astreamer; it is false if we have found the archive trailer.
+ * We throw error if we see invalid data.
*/
static bool
astreamer_tar_header(astreamer_tar_parser *mystreamer)
@@ -289,6 +290,12 @@ astreamer_tar_header(astreamer_tar_parser *mystreamer)
if (!has_nonzero_byte)
return false;
+ /*
+ * Verify that we have a reasonable-looking header.
+ */
+ if (!isValidTarHeader(buffer))
+ pg_fatal("input file does not appear to be a valid tar archive");
+
/*
* Parse key fields out of the header.
*/
diff --git a/src/include/pgtar.h b/src/include/pgtar.h
index eb93bdef5c4..55b8c7c77a4 100644
--- a/src/include/pgtar.h
+++ b/src/include/pgtar.h
@@ -68,7 +68,8 @@ extern enum tarError tarCreateHeader(char *h, const char *filename,
time_t mtime);
extern uint64 read_tar_number(const char *s, int len);
extern void print_tar_number(char *s, int len, uint64 val);
-extern int tarChecksum(char *header);
+extern int tarChecksum(const char *header);
+extern bool isValidTarHeader(const char *header);
/*
* Compute the number of padding bytes required for an entry in a tar
diff --git a/src/port/tar.c b/src/port/tar.c
index 592b4fb7b0f..db462b90292 100644
--- a/src/port/tar.c
+++ b/src/port/tar.c
@@ -87,7 +87,7 @@ read_tar_number(const char *s, int len)
* be 512 bytes, per the tar standard.
*/
int
-tarChecksum(char *header)
+tarChecksum(const char *header)
{
int i,
sum;
@@ -104,6 +104,35 @@ tarChecksum(char *header)
return sum;
}
+/*
+ * Check validity of a tar header (assumed to be 512 bytes long).
+ * We verify the checksum and the magic number / version.
+ */
+bool
+isValidTarHeader(const char *header)
+{
+ int sum;
+ int chk = tarChecksum(header);
+
+ sum = read_tar_number(&header[TAR_OFFSET_CHECKSUM], 8);
+
+ if (sum != chk)
+ return false;
+
+ /* POSIX tar format */
+ if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar\0", 6) == 0 &&
+ memcmp(&header[TAR_OFFSET_VERSION], "00", 2) == 0)
+ return true;
+ /* GNU tar format */
+ if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar \0", 8) == 0)
+ return true;
+ /* not-quite-POSIX format written by pre-9.3 pg_dump */
+ if (memcmp(&header[TAR_OFFSET_MAGIC], "ustar00\0", 8) == 0)
+ return true;
+
+ return false;
+}
+
/*
* Fill in the buffer pointed to by h with a tar format header. This buffer
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 02:14 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-02 02:14 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Thu, Apr 2, 2026 at 2:22 PM Tom Lane <[email protected]> wrote:
> Looking again at astreamer_tar.c, I suddenly realized that it doesn't
> do any meaningful input validation. So if you feed it junk input,
> you get garbage errors that aren't even predictable:
Wow.
> So I think we need something like the attached, in addition
> to what I sent before. This just makes astreamer_tar.c use
> the isValidTarHeader function that pg_dump already had.
> (I decided to const-ify isValidTarHeader's argument while
> moving it to a shared location, which in turn requires
> const-ifying tarChecksum.)
LGTM.
$ echo -n x | dd of=foo.tar bs=1 seek=257 count=1 conv=notrunc
$ strings foo.tar | grep tar | head -1
xstar
$ pg_waldump --path=foo.tar -s 0/1 -e 0/100
pg_waldump: error: input file does not appear to be a valid tar archive
$ echo -n u | dd of=foo.tar bs=1 seek=257 count=1 conv=notrunc
$ strings foo.tar | grep tar | head -1
ustar
$ pg_waldump --path=foo.tar -s 0/1 -e 0/100
... other output...
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 11:36 Andrew Dunstan <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Andrew Dunstan @ 2026-04-02 11:36 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-04-01 We 8:16 PM, Thomas Munro wrote:
> On Thu, Apr 2, 2026 at 7:25 AM Tom Lane <[email protected]> wrote:
>> Also, if we are admitting the possibility that what we are reading
>> was made by a platform-supplied tar and not our own code, I think
>> it verges on lunacy to behave as though unsupported typeflags are
>> regular files.
> Yeah, if this is the first time we parse files we didn't make then
> that makes total sense. I was a bit unsure of that question when I
> suggested we reject pax only after we've failed to find a file, in
> case there are scenarios that work today with harmless ignorable pax
> headers that don't change the file name.
>
>> So I think we need something more or less like the attached.
> LGTM. Tested with both tars here. I updated that little test patch
> for this. Not sure if you think it's worth a test though, now that
> it's so simple.
>
> @Andrew: I tried using File::Spec->devnull() this time. Are you able
> to check if this works OK on Windows, applied on top of Tom's patch?
> AFAIK should be able to run this new test and pass, not skip it. But
> it could be that the shell invocation needs tweaking. It's hard to
> tell from CI. (Huh, apparently Windows ships a copy of BSD tar as
> C:\Windows\System32\tar.exe these days.)
Yes, that appears to work. I would put a "2>&1" at the end - we don't
care about the output, just whether or not it succeeds:
C:\Windows\system32>perl -MFile::Spec -e "print File::Spec->devnull();"
nul
C:\Windows\system32>tar --no-read-sparse -c - nul > nul 2>&1 && echo hello
C:\Windows\system32>tar --no-read-sparse -c - nul > nul 2>&1 || echo hello
hello
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 16:29 Tom Lane <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 3 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-02 16:29 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> On Thu, Apr 2, 2026 at 2:22 PM Tom Lane <[email protected]> wrote:
>> So I think we need something like the attached, in addition
>> to what I sent before. This just makes astreamer_tar.c use
>> the isValidTarHeader function that pg_dump already had.
> LGTM.
Pushed, thanks for reviewing! In the event I decided to back-patch to
v18, where these fixes could protect pg_verifybackup against tar files
it can't handle. In older branches the astreamer (nee bbstreamer)
logic only exists inside pg_basebackup, and IIUC that's not really
exposed to tar files that weren't made by our own code. So I did
not bother with back-patching further, though it'd be possible to
do that if someone knows a scenario where it'd matter.
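For readers unfamiliar with what tar header validation involves, the following is a minimal Python sketch of the usual checksum-and-magic test on a 512-byte header block. It is an illustration only and is not PostgreSQL's isValidTarHeader; the function name looks_like_tar_header is hypothetical, and the sample header is generated with Python's tarfile module rather than with any PostgreSQL code.

```python
import tarfile

BLOCK = 512

def looks_like_tar_header(block: bytes) -> bool:
    """Return True if a 512-byte block passes basic tar header checks."""
    if len(block) != BLOCK:
        return False
    # The stored checksum is an octal number in bytes 148..155,
    # padded with NUL and/or space characters.
    stored = block[148:156].strip(b" \0")
    try:
        expected = int(stored, 8)
    except ValueError:
        return False
    # The checksum is the byte sum of the whole header, with the checksum
    # field itself counted as eight ASCII spaces.
    actual = sum(block[:148]) + 8 * ord(" ") + sum(block[156:])
    # POSIX (ustar) and GNU headers both start their magic field with "ustar".
    return actual == expected and block[257:262] == b"ustar"

# Build a valid ustar header, then damage one byte of the name field.
info = tarfile.TarInfo("000000010000000000000001")
good = info.tobuf(format=tarfile.USTAR_FORMAT)
bad = b"X" + good[1:]
```

A block that fails this kind of test is either not a header at all or comes from a stream the reader has lost track of, which is the situation the validation is meant to catch early.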
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 17:13 Tomas Vondra <[email protected]>
parent: Tom Lane <[email protected]>
2 siblings, 0 replies; 85+ messages in thread
From: Tomas Vondra @ 2026-04-02 17:13 UTC (permalink / raw)
To: Tom Lane <[email protected]>; Thomas Munro <[email protected]>; +Cc: Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 4/2/26 18:29, Tom Lane wrote:
> Thomas Munro <[email protected]> writes:
>> On Thu, Apr 2, 2026 at 2:22 PM Tom Lane <[email protected]> wrote:
>>> So I think we need something like the attached, in addition
>>> to what I sent before. This just makes astreamer_tar.c use
>>> the isValidTarHeader function that pg_dump already had.
>
>> LGTM.
>
> Pushed, thanks for reviewing! In the event I decided to back-patch to
> v18, where these fixes could protect pg_verifybackup against tar files
> it can't handle. In older branches the astreamer (nee bbstreamer)
> logic only exists inside pg_basebackup, and IIUC that's not really
> exposed to tar files that weren't made by our own code. So I did
> not bother with back-patching further, though it'd be possible to
> do that if someone knows a scenario where it'd matter.
>
It seems jay/hippopotamus failed on this, for some reason. Those two
animals are on the same host, just using different compilers. I did
update+reboot the machine this afternoon (before the test), but I don't
see why that would cause the failure.
Maybe there's something special about OpenSUSE?
--
Tomas Vondra
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 17:43 Tom Lane <[email protected]>
parent: Tom Lane <[email protected]>
2 siblings, 1 reply; 85+ messages in thread
From: Tom Lane @ 2026-04-02 17:43 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Tomas Vondra <[email protected]> writes:
> On 4/2/26 18:29, Tom Lane wrote:
>> Pushed, thanks for reviewing! In the event I decided to back-patch to
>> v18, where these fixes could protect pg_verifybackup against tar files
>> it can't handle.
> It seems jay/hippopotamus failed on this, for some reason.
Hmm:
# Failed test 'corrupt backup fails verification: extra_file: matches'
# at t/003_corruption.pl line 198.
# 'pg_verifybackup: error: pax extensions to tar format are not supported
# '
# doesn't match '(?^:extra_file.*present (on disk|in archive "[^"]+") but not in the manifest)'
> Maybe there's something special about OpenSUSE?
Apparently its version of "tar" will produce pax-extended files
at the drop of a hat. I have an OpenSUSE image around here
somewhere, will see if I can reproduce this. But while I'm
asking, what filesystem are those animals running on top of?
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 18:08 Tomas Vondra <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Tomas Vondra @ 2026-04-02 18:08 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 4/2/26 19:43, Tom Lane wrote:
> Tomas Vondra <[email protected]> writes:
>> On 4/2/26 18:29, Tom Lane wrote:
>>> Pushed, thanks for reviewing! In the event I decided to back-patch to
>>> v18, where these fixes could protect pg_verifybackup against tar files
>>> it can't handle.
>
>> It seems jay/hippopotamus failed on this, for some reason.
>
> Hmm:
>
> # Failed test 'corrupt backup fails verification: extra_file: matches'
> # at t/003_corruption.pl line 198.
> # 'pg_verifybackup: error: pax extensions to tar format are not supported
> # '
> # doesn't match '(?^:extra_file.*present (on disk|in archive "[^"]+") but not in the manifest)'
>
>> Maybe there's something special about OpenSUSE?
>
> Apparently its version of "tar" will produce pax-extended files
> at the drop of a hat. I have an OpenSUSE image around here
> somewhere, will see if I can reproduce this. But while I'm
> asking, what filesystem are those animals running on top of?
>
btrfs
--
Tomas Vondra
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 18:47 Tom Lane <[email protected]>
parent: Tomas Vondra <[email protected]>
0 siblings, 4 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-02 18:47 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Tomas Vondra <[email protected]> writes:
> On 4/2/26 19:43, Tom Lane wrote:
>> Tomas Vondra <[email protected]> writes:
>>> Maybe there's something special about OpenSUSE?
>> Apparently its version of "tar" will produce pax-extended files
>> at the drop of a hat. I have an OpenSUSE image around here
>> somewhere, will see if I can reproduce this. But while I'm
>> asking, what filesystem are those animals running on top of?
> btrfs
Yup, so capable of making sparse WAL files. I can reproduce the
problem here, and what I see is
> tar --version
tar (GNU tar) 1.34
Copyright (C) 2021 Free Software Foundation, Inc.
...
> tar -?
...
-H, --format=FORMAT create archive of the given format
FORMAT is one of the following:
gnu GNU tar 1.13.x format
oldgnu GNU format as per tar <= 1.12
pax POSIX 1003.1-2001 (pax) format
posix same as pax
ustar POSIX 1003.1-1988 (ustar) format
v7 old V7 tar format
...
*This* tar defaults to:
--format=posix -f- -b20 --quoting-style=escape --rmt-command=/usr/bin/rmt
--rsh-command=/usr/bin/ssh
So there you have it: pax format by default. This is unlike what
I see on RHEL or Fedora:
...
*This* tar defaults to:
--format=gnu -f- -b20 --quoting-style=escape --rmt-command=/etc/rmt
--rsh-command=/usr/bin/ssh
So it looks like we need a switch hack similar to what we did for
BSD tar, but injecting "--format=gnu" (or perhaps "--format=ustar"?)
if the tar program will take that.
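A probe of this kind can be sketched as follows. This is a Python illustration under stated assumptions, not the Perl code used in PostgreSQL's test suite: the helper names tar_accepts and pick_format_flag are hypothetical, and the probe simply archives the null device to standard output and checks the exit status.

```python
import os
import subprocess

def tar_accepts(tar: str, flags: list) -> bool:
    """Return True if `tar` exits successfully when given `flags`."""
    try:
        result = subprocess.run(
            [tar, *flags, "-c", os.devnull],   # archive the null device
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,          # discard the archive itself
            stderr=subprocess.DEVNULL,          # discard warnings
        )
    except OSError:
        # The tar program does not exist or cannot be executed.
        return False
    return result.returncode == 0

def pick_format_flag(tar: str) -> list:
    """Return the first format switch this tar accepts, or an empty list."""
    for flag in ("--format=gnu", "--format=ustar"):
        if tar_accepts(tar, [flag]):
            return [flag]
    return []
```

Calling pick_format_flag("tar") returns a one-element list to prepend to later tar invocations, or an empty list if the program rejects both switches or is missing.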
Interestingly, pg_verifybackup's t/003_corruption.pl test also fails
with the same issue, so apparently this platform is even more
aggressive about sparse-ifying files than Thomas' FreeBSD box.
I wonder how come we managed to pass that test case before on
these machines.
I'm inclined to push the logic for selecting these tar options
into some common subroutine in Test::Utils, rather than having
two copies (and maybe more later).
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 19:15 Tom Lane <[email protected]>
parent: Tom Lane <[email protected]>
3 siblings, 0 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-02 19:15 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
I wrote:
> Interestingly, pg_verifybackup's t/003_corruption.pl test also fails
> with the same issue, so apparently this platform is even more
> aggressive about sparse-ifying files than Thomas' FreeBSD box.
> I wonder how come we managed to pass that test case before on
> these machines.
The answer to that seems to be that the test scripts for
pg_verifybackup simply fail to detect when it's mishandling
sparse tar entries. We only check failing cases not successful
cases, and in each case the error checked for is independent of
whether we would have extracted WAL data correctly. Grumble.
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 22:41 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
3 siblings, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-04-02 22:41 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Apr 3, 2026 at 7:48 AM Tom Lane <[email protected]> wrote:
> *This* tar defaults to:
> --format=posix -f- -b20 --quoting-style=escape --rmt-command=/usr/bin/rmt
> --rsh-command=/usr/bin/ssh
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 22:50 Tom Lane <[email protected]>
parent: Tom Lane <[email protected]>
3 siblings, 1 reply; 85+ messages in thread
From: Tom Lane @ 2026-04-02 22:50 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> On Fri, Apr 3, 2026 at 7:48 AM Tom Lane <[email protected]> wrote:
>> Interestingly, pg_verifybackup's t/003_corruption.pl test also fails
>> with the same issue, so apparently this platform is even more
>> aggressive about sparse-ifying files than Thomas' FreeBSD box.
> Looks like sparse files and BTRFS might be a red herring then, if it's
> simply been told to put a pax header on every file?
It looked to me that it was only putting a pax header on sparse-ified
WAL files.
> How about using --format=ustar, instead of that sparse control stuff?
I did it that way for GNU tar, but did not research whether bsdtar
will take that option. Feel free to hack on ebba64c08 some more.
(It seems though that the two tars' locutions for "write to stdout"
are different, so we might have to have separate tests even if they
end up pushing the same option.)
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 23:43 Sami Imseih <[email protected]>
parent: Tom Lane <[email protected]>
2 siblings, 1 reply; 85+ messages in thread
From: Sami Imseih @ 2026-04-02 23:43 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Thomas Munro <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
> >> So I think we need something like the attached, in addition
> >> to what I sent before. This just makes astreamer_tar.c use
> >> the isValidTarHeader function that pg_dump already had.
>
> > LGTM.
>
> Pushed, thanks for reviewing! In the event I decided to back-patch to
> v18, where these fixes could protect pg_verifybackup against tar files
> it can't handle.
Hi,
I just encountered a regression test failure for pg_waldump due to ebba64c08d9.
```
――――――――――――――――――――――――――――――――――――――――――――――――――――― ✀
――――――――――――――――――――――――――――――――――――――――――――――――――――――
Listing only the last 100 lines from a long log.
# at /local/home/simseih/pgdev/installations/worktrees/dev/src/bin/pg_waldump/t/001_basic.pl
line 432.
# got: 'pg_waldump: error: could not find WAL in archive
"pg_wal.tar.gz"
# '
# expected: ''
```
and regress_log_001_basic shows this:
```
# Running: /usr/bin/tar --format=ustar -cf /tmp/ja26rXZOnb/pg_wal.tar
archive_status 000000010000000000000002 000000010000000000000001
summaries 000000010000000000000003
[22:25:00.525](0.008s) not ok 101
[22:25:00.525](0.000s) # Failed test at
/local/home/simseih/pgdev/installations/worktrees/dev/src/bin/pg_waldump/t/001_basic.pl
line 350.
[22:25:00.525](0.000s) # ---------- command failed ----------
[22:25:00.526](0.000s) # /usr/bin/tar --format=ustar -cf
/tmp/ja26rXZOnb/pg_wal.tar archive_status 000000010000000000000002
000000010000000000000001 summaries 000000010000000000000003
[22:25:00.526](0.000s) # -------------- stderr --------------
[22:25:00.526](0.000s) # /usr/bin/tar: value 10012663 out of uid_t
range 0..2097151
```
The ustar format limits UID/GID values to 2^21 - 1 (2097151) [1],
and on my machine the UID is 10012663.
So I found that one way to deal with this is to run the tar command with
--owner=0 --group=0. As far as I can tell, the owner and group IDs don't
matter for these tests, so maybe that is OK.
@@ -1333,6 +1333,10 @@ sub tar_portability_options
== 0)
{
push(@tar_p_flags, "--format=ustar");
+ # ustar format supports UIDs only up to 2^21 (2097151).
+ # Override owner/group to avoid failures on systems where
+ # the running user's UID/GID exceeds that limit.
+ push(@tar_p_flags, "--owner=0", "--group=0");
}
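The limit comes from the header layout: the uid and gid fields are eight bytes each, holding seven octal digits plus a terminator, so the largest representable value is 8^7 - 1 = 2097151. The sketch below reproduces the same field-width failure with Python's tarfile module. It is not GNU tar, so the error text differs, and the helper names are hypothetical.

```python
import tarfile

# Seven octal digits fit in the 8-byte uid/gid fields of a ustar header.
USTAR_ID_MAX = 8 ** 7 - 1

def ustar_header_for_uid(uid: int) -> bytes:
    """Build a ustar header for a dummy WAL segment owned by `uid`."""
    info = tarfile.TarInfo("000000010000000000000001")
    info.uid = uid
    info.gid = uid
    return info.tobuf(format=tarfile.USTAR_FORMAT)

def fits_in_ustar(uid: int) -> bool:
    """Return True if `uid` can be stored in a plain ustar header."""
    try:
        ustar_header_for_uid(uid)
        return True
    except ValueError:
        # tarfile raises ValueError when a numeric field overflows.
        return False
```

With these definitions, fits_in_ustar(0) and fits_in_ustar(2097151) succeed while fits_in_ustar(10012663) fails, which is why forcing the owner and group to 0 sidesteps the problem.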
While this fixes the test, I am now not sure what the broader implications are
for --format=ustar for pg_waldump in the broader discussion?
[1] https://www.gnu.org/software/tar/manual/html_section/Formats.html
--
Sami Imseih
Amazon Web Services (AWS)
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-02 23:49 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-02 23:49 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <[email protected]> wrote:
> > How about using --format=ustar, instead of that sparse control stuff?
>
> I did it that way for GNU tar, but did not research whether bsdtar
> will take that option. Feel free to hack on ebba64c08 some more.
>
> (It seems though that the two tars' locutions for "write to stdout"
> are different, so we might have to have separate tests even if they
> end up pushing the same option.)
I have:
$ tar --version
bsdtar 3.8.2 - libarchive 3.8.2 zlib/1.3.1 liblzma/5.8.1 libzstd/1.5.2
openssl/3.5.4 libb2/bundled
$ gtar --version
tar (GNU tar) 1.35
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by John Gilmore and Jay Fenlason.
This seems to work for both:
$ tar --format=ustar -c /dev/null > /dev/null
tar: Removing leading '/' from member names
$ gtar --format=ustar -c /dev/null > /dev/null
gtar: Removing leading `/' from member names
The attached passes with both, and regress_log_001_basic looks like:
# Running: /usr/bin/tar --format=ustar -cf /tmp/J_ifbfUOSd/pg_wal.tar
archive_status 000000010000000000000001 000000010000000000000003
000000010000000000000002 summaries
[12:12:24.301](0.072s) ok 101
# Running: /usr/local/bin/gtar --format=ustar -cf
/tmp/pbdsHdrAdw/pg_wal.tar 000000010000000000000002 archive_status
000000010000000000000003 summaries 000000010000000000000001
[12:18:14.739](0.050s) ok 101
I think a Windows system could be using either. BSD tar comes
pre-installed by Microsoft and people often install GNU tools. So I
think we should use File::Spec->devnull() instead of /dev/null, and
Andrew showed that working. I doubt Windows is capable of making
sparse files (except perhaps with ReFS?), but it's nice to use the
same code everywhere and future-proof in case GNU carries out its
threat to switch to pax by default. Windows probably has file
attributes that ustar can't represent (?), so I guess that might
motivate it to use pax headers if they are indeed added only when
needed.
Longer term I think we need to tolerate but ignore pax headers. If I
understand the spirit of this long evolution, pax archives are
intended to be acceptable to pre-pax implementations, which implies
that they can't really change the meaning of the bits of the file
contents. That's why GNU's --sparse hides funky file encodings from
old tars by renaming them to GNUSparseFile.%p/%f, and that leads back
to my original suggestion that we should figure out how to detect and
reject pax only if we failed to find the file under the expected name.
(Or of course we could just implement support for that, and I have a
half-baked trial patch for that but now is not the time.)
Attachments:
[text/x-patch] 0001-Harmonize-tar-option-tests-from-ebba64c0.patch (1.6K, 2-0001-Harmonize-tar-option-tests-from-ebba64c0.patch)
From 0635c0106946ce76c1bb84a5c17b71d0b0e574f7 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Fri, 3 Apr 2026 12:03:56 +1300
Subject: [PATCH] Harmonize tar option tests from ebba64c0.
* GNU and BSD tar both understand --format=ustar.
* Windows lacks /dev/null, but perl knows its local name.
Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us
Backpatch-through: 18
---
src/test/perl/PostgreSQL/Test/Utils.pm | 15 ++++-----------
1 file changed, 4 insertions(+), 11 deletions(-)
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 120999f6ac9..05e1698efa6 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -1328,21 +1328,14 @@ sub tar_portability_options
# GNU tar typically produces gnu-format archives, which we can read fine.
# But some platforms configure it to default to posix/pax format, and
- # apparently they enable --sparse too. Override that.
- if (system("$tar --format=ustar -c -O /dev/null >/dev/null 2>/dev/null")
+ # apparently they enable --sparse too. BSD tar does something similar.
+ # Override that.
+ my $devnull = File::Spec->devnull();
+ if (system("$tar --format=ustar -c $devnull >$devnull 2>$devnull")
== 0)
{
push(@tar_p_flags, "--format=ustar");
}
-
- # bsdtar also archives sparse files by default, but it spells the switch
- # to disable that differently.
- if (system("$tar --no-read-sparse -c - /dev/null >/dev/null 2>/dev/null")
- == 0)
- {
- push(@tar_p_flags, "--no-read-sparse");
- }
-
return @tar_p_flags;
}
--
2.53.0
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 00:07 Thomas Munro <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-04-03 00:07 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Apr 3, 2026 at 12:43 PM Sami Imseih <[email protected]> wrote:
> The ustar format limits UID/GID values to 2^21 - 1 (2097151) [1],
> and on my machine the UID is 10012663.
>
> So I found that one way to deal with this is to run the tar command with
> --owner=0 --group=0. As far as I can tell, the owner and group IDs don't
> matter for these tests, so maybe that is OK.
>
> @@ -1333,6 +1333,10 @@ sub tar_portability_options
> == 0)
> {
> push(@tar_p_flags, "--format=ustar");
> + # ustar format supports UIDs only up to 2^21 (2097151).
> + # Override owner/group to avoid failures on systems where
> + # the running user's UID/GID exceeds that limit.
> + push(@tar_p_flags, "--owner=0", "--group=0");
Interesting. BSD tar accepts those too, so here's an update to my
previous patch.
> While this fixes the test, I am now not sure what the broader implications are
> for --format=ustar for pg_waldump in the broader discussion?
I think users who have their own tar scripts will mostly be
unaffected, but a small minority will see the new error if they try to
use pg_verifybackup or pg_waldump, and they'll find their way to
--format=ustar, and then they might see the UID/GID error in their own
.tar production scripts, and find their way to adding those switches
too. Seems OK? Especially with a documentation note once we've
settled all of this.
Attachments:
[text/x-patch] v2-0001-Improve-tar-portability-logic-from-ebba64c0.patch (2.1K, 2-v2-0001-Improve-tar-portability-logic-from-ebba64c0.patch)
From a2c065fd5cee53bcde2ccaf92939737e8fc07f06 Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Fri, 3 Apr 2026 12:03:56 +1300
Subject: [PATCH v2] Improve tar portability logic from ebba64c0.
* GNU and BSD tar both understand --format=ustar.
* Windows lacks /dev/null, but perl knows its local name.
* ustar format doesn't like large UID/GID values, so set them to 0.
Backpatch-through: 18
Co-authored-by: Thomas Munro <[email protected]>
Co-authored-by: Sami Imseih <[email protected]>
Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CAA5RZ0tt89MgNi4-0F4onH%2B-TFSsysFjMM-tBc6aXbuQv5xBXw%40mail.gmail.com
---
src/test/perl/PostgreSQL/Test/Utils.pm | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 120999f6ac9..370acfcef7e 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -1328,21 +1328,17 @@ sub tar_portability_options
# GNU tar typically produces gnu-format archives, which we can read fine.
# But some platforms configure it to default to posix/pax format, and
- # apparently they enable --sparse too. Override that.
- if (system("$tar --format=ustar -c -O /dev/null >/dev/null 2>/dev/null")
- == 0)
- {
- push(@tar_p_flags, "--format=ustar");
- }
-
- # bsdtar also archives sparse files by default, but it spells the switch
- # to disable that differently.
- if (system("$tar --no-read-sparse -c - /dev/null >/dev/null 2>/dev/null")
+ # apparently they enable --sparse too. BSD tar does something similar.
+ #
+ # ustar format supports UIDs only up to 2^21 (2097151). Override
+ # owner/group to avoid failures on systems where the running user's UID/GID
+ # exceeds that limit.
+ my $devnull = File::Spec->devnull();
+ if (system("$tar --format=ustar --owner=0 --group=0 -c $devnull >$devnull 2>$devnull")
== 0)
{
- push(@tar_p_flags, "--no-read-sparse");
+ push(@tar_p_flags, "--format=ustar", "--owner=0", "--group=0");
}
-
return @tar_p_flags;
}
--
2.53.0
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 00:11 Tom Lane <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 2 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-03 00:11 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <[email protected]> wrote:
>>> How about using --format=ustar, instead of that sparse control stuff?
>> I did it that way for GNU tar, but did not research whether bsdtar
>> will take that option. Feel free to hack on ebba64c08 some more.
> This seems to work for both:
> $ tar --format=ustar -c /dev/null > /dev/null
> tar: Removing leading '/' from member names
> $ gtar --format=ustar -c /dev/null > /dev/null
> gtar: Removing leading `/' from member names
Cool. LGTM.
> I think a Windows system could be using either. BSD tar comes
> pre-installed by Microsoft and people often install GNU tools. So I
> think we should use File::Spec->devnull() instead of /dev/null, and
> Andrew showed that working.
Agreed.
> Longer term I think we need to tolerate but ignore pax headers. If I
> understand the spirit of this long evolution, pax archives are
> intended to be acceptable to pre-pax implementations, which implies
> that they can't really change the meaning of the bits of the file
> contents.
I don't buy that. For example, POSIX specifies these allowed
fields in an extended header:
linkpath
The pathname of a link being created to another file, of any
type, previously archived. This record shall override the
linkname field in the following ustar header block(s).
path
The pathname of the following file(s). This record shall
override the name and prefix fields in the following header
block(s).
size
The size of the file in octets, expressed as a decimal number
using digits from the ISO/IEC 646:1991 standard. This record
shall override the size field in the following header
block(s).
GNU tar seems to try hard to ensure that a non-pax-aware tar can
extract *something* from a tar file, but it's not guaranteed that the
something contains the right data or is located at the right pathname.
It looks like the goal is to allow post-processing to pick up the
pieces.
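The "path" override can be seen concretely with a small Python sketch. It uses Python's tarfile module as a stand-in pax writer, so the fallback name it leaves in the ustar header is that library's choice and may differ from what GNU tar or bsdtar would write; the member name is made up for the example.

```python
import tarfile

BLOCK = 512

# A member name longer than the 100-byte ustar name field.
long_name = "pg_wal/" + "x" * 120 + "/000000010000000000000001"

info = tarfile.TarInfo(long_name)
buf = info.tobuf(format=tarfile.PAX_FORMAT)

# Layout: extended header block, its records, then the ordinary header.
ext_header = buf[0:BLOCK]
ext_records = buf[BLOCK:2 * BLOCK]
file_header = buf[2 * BLOCK:3 * BLOCK]

# Typeflag 'x' (offset 156) marks a pax extended header for the next file.
ext_typeflag = ext_header[156:157]

# What a reader that ignores the extended header would take as the name.
legacy_name = file_header[0:100].rstrip(b"\0").decode("ascii")
```

Here the extended records carry the full "path=" value, while legacy_name holds only the first 100 bytes of it, so a reader that skips the 'x' entry would look for the file under the wrong pathname.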
In any case, this is all completely moot if we don't write code to
de-sparse a sparse entry: we will not be able to validate WAL data
if the WAL file is missing some pages. So I see little point in
having code that tolerates pax headers if it doesn't also do that.
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 00:47 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 0 replies; 85+ messages in thread
From: Thomas Munro @ 2026-04-03 00:47 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Apr 3, 2026 at 1:11 PM Tom Lane <[email protected]> wrote:
> In any case, this is all completely moot if we don't write code to
> de-sparse a sparse entry: we will not be able to validate WAL data
> if the WAL file is missing some pages. So I see little point in
> having code that tolerates pax headers if it doesn't also do that.
Yeah. FWIW I spent a few hours hacking on that the other day and
could decode many files, but I now realise that the task was made more
difficult by a problem you fixed: without header validation, small
mistakes resulted in corruption or went bananas. With that now
addressed, I hope I can get it into shape and propose it for the next
cycle...
For what it's worth, I was just speculating about how one might
reasonably handle unrecognised *non-standard* header names, not the
POSIX-standardised ones which, you're right, we'd probably need to
grok properly. If we assumed reasonable engineering decisions
following (what I understood to be) the spirit of pax, maybe we could
assume that new non-standard headers either don't affect file contents
and thus could be ignored (think: GNU.windows.permissions=...), or do
affect file contents but have measures in place to prevent unknown
encodings from being exposed to unsuspecting software (think:
deathstation.byte=9bit). That's a position we could choose to take,
anyway, in the absence of a crystal ball... Fortunately there aren't
really many implementations of POSIX left, so it's not like we're
dealing with the Fermi Paradox here...
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 00:59 Thomas Munro <[email protected]>
parent: Tom Lane <[email protected]>
1 sibling, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-03 00:59 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Apr 3, 2026 at 1:11 PM Tom Lane <[email protected]> wrote:
> Cool. LGTM.
Thanks. I'll go ahead and push v2 shortly, which includes Sami's
change (with credit), unless you have any better ideas for that bit.
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 01:11 Tom Lane <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Tom Lane @ 2026-04-03 01:11 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Thomas Munro <[email protected]> writes:
> Thanks. I'll go ahead and push v2 shortly, which includes Sami's
> change (with credit), unless you have any better ideas for that bit.
Not really. I wondered for a bit if it was smart to create tar files
whose contents would appear root-owned, but it could only matter if
somebody extracted such a file as root, and I don't see a plausible
pathway for that to happen, since these are merely transient test
files.
We could dodge that hazard by using, say, "--owner=1" but I'm
not sure whether that'd introduce its own problems. In any case,
walmethods.c's tar_open_for_write is already filling in zeroes
there.
regards, tom lane
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 11:37 Nazir Bilal Yavuz <[email protected]>
parent: Tom Lane <[email protected]>
3 siblings, 1 reply; 85+ messages in thread
From: Nazir Bilal Yavuz @ 2026-04-03 11:37 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: Tomas Vondra <[email protected]>; Thomas Munro <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On Thu, 2 Apr 2026 at 21:48, Tom Lane <[email protected]> wrote:
>
> FORMAT is one of the following:
> gnu GNU tar 1.13.x format
> oldgnu GNU format as per tar <= 1.12
> pax POSIX 1003.1-2001 (pax) format
> posix same as pax
> ustar POSIX 1003.1-1988 (ustar) format
> v7 old V7 tar format
> ...
> *This* tar defaults to:
> --format=posix -f- -b20 --quoting-style=escape --rmt-command=/usr/bin/rmt
> --rsh-command=/usr/bin/ssh
>
> So there you have it: pax format by default. This is unlike what
> I see on RHEL or Fedora:
It seems that the problem also applies to OpenBSD [1]:
-F format
Specify the output archive format, with the default format being
pax. tar currently supports the following formats:
OpenBSD CI tasks started to fail [2] after bc30c704ad with the errors:
```
Listing only the last 100 lines from a long log.
# at /home/postgres/postgres/src/bin/pg_waldump/t/001_basic.pl line 440.
# got: 'pg_waldump: error: pax extensions to tar format are not supported
# Failed test 'corrupt backup fails verification: extra_file: matches'
# at /home/postgres/postgres/src/bin/pg_verifybackup/t/003_corruption.pl line 198.
# 'pg_verifybackup: error: pax extensions to tar format are not supported
Summary of Failures:
239/381 postgresql:pg_waldump / pg_waldump/001_basic ERROR 18.58s exit status 84
225/381 postgresql:pg_verifybackup / pg_verifybackup/003_corruption ERROR 45.12s exit status 8
```
I also tried Thomas'
"v2-0001-Improve-tar-portability-logic-from-ebba64c0" [3] but it
didn't fix the problem on OpenBSD [4].
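For context, a pax-format archive announces itself through extended-header entries whose typeflag byte (offset 156 in the 512-byte header block) is 'x' (per-file) or 'g' (global). A minimal C sketch of such a check follows; the function name and constants are illustrative assumptions and are not taken from the PostgreSQL sources.

```c
#include <stdbool.h>
#include <stddef.h>

#define TAR_BLOCK_SIZE 512       /* size of one tar header block */
#define TAR_OFFSET_TYPEFLAG 156  /* position of the typeflag byte */

/*
 * Hypothetical helper: report whether a tar header block introduces a pax
 * extended header ('x' = per-file, 'g' = global).
 */
bool
is_pax_extended_header(const char *block, size_t len)
{
    /* A header must be a full block; anything shorter cannot be checked. */
    if (block == NULL || len < TAR_BLOCK_SIZE)
        return false;

    return block[TAR_OFFSET_TYPEFLAG] == 'x' ||
           block[TAR_OFFSET_TYPEFLAG] == 'g';
}
```

A reader that does not implement pax would presumably reject the archive as soon as a check like this returns true, which matches the error text in the log above.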
[1] https://man.openbsd.org/tar#F
[2] https://cirrus-ci.com/task/5439721360326656
[3] https://postgr.es/m/CA%2BhUKGLMkv_fnGXzVRO8qbx5uHs-qMn151GTJYCfn9w1ZamGNg%40mail.gmail.com
[4] https://cirrus-ci.com/task/5602126958690304
--
Regards,
Nazir Bilal Yavuz
Microsoft
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 14:37 Thomas Munro <[email protected]>
parent: Nazir Bilal Yavuz <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-03 14:37 UTC (permalink / raw)
To: Nazir Bilal Yavuz <[email protected]>; +Cc: Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Apr 4, 2026 at 12:38 AM Nazir Bilal Yavuz <[email protected]> wrote:
> I also tried Thomas'
> "v2-0001-Improve-tar-portability-logic-from-ebba64c0" [3] but it
> didn't fix the problem on OpenBSD [4].
Apparently it wants -F ustar, like this. Funny that it passed on the
build farm animals though. Oh, it looks like they changed the default
fairly recently.
https://undeadly.org/cgi?action=article;sid=20240417053301
Attachments:
[text/x-patch] v3-0001-Improve-tar-portability-logic-from-ebba64c0.patch (2.4K, 2-v3-0001-Improve-tar-portability-logic-from-ebba64c0.patch)
download | inline diff:
From 4b69f7d64798a5a55fdbb2447cee7a0d4326280c Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Fri, 3 Apr 2026 12:03:56 +1300
Subject: [PATCH v3] Improve tar portability logic from ebba64c0.
* GNU and BSD tar both understand --format=ustar.
* Windows lacks /dev/null, but perl knows its local name.
* ustar format doesn't like large UID/GID values, so set them to 0.
* OpenBSD has its own tar which understands -F ustar.
Backpatch-through: 18
Co-authored-by: Thomas Munro <[email protected]>
Co-authored-by: Sami Imseih <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Nazir Bilal Yavuz <[email protected]>
Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CAA5RZ0tt89MgNi4-0F4onH%2B-TFSsysFjMM-tBc6aXbuQv5xBXw%40mail.gmail.com
---
src/test/perl/PostgreSQL/Test/Utils.pm | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 120999f6ac9..077305cd790 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -1328,21 +1328,24 @@ sub tar_portability_options
# GNU tar typically produces gnu-format archives, which we can read fine.
# But some platforms configure it to default to posix/pax format, and
- # apparently they enable --sparse too. Override that.
- if (system("$tar --format=ustar -c -O /dev/null >/dev/null 2>/dev/null")
+ # apparently they enable --sparse too. BSD tar (libarchive) does something
+ # similar.
+ #
+ # ustar format supports UIDs only up to 2^21 (2097151). Override
+ # owner/group to avoid failures on systems where the running user's UID/GID
+ # exceeds that limit.
+ my $devnull = File::Spec->devnull();
+ if (system("$tar --format=ustar --owner=0 --group=0 -c $devnull >$devnull 2>$devnull")
== 0)
{
- push(@tar_p_flags, "--format=ustar");
+ push(@tar_p_flags, "--format=ustar", "--owner=0", "--group=0");
}
- # bsdtar also archives sparse files by default, but it spells the switch
- # to disable that differently.
- if (system("$tar --no-read-sparse -c - /dev/null >/dev/null 2>/dev/null")
- == 0)
+ # OpenBSD's tar also defaults to pax, but spells the switch differently.
+ if (system("$tar -F ustar -c $devnull >$devnull 2>$devnull"))
{
- push(@tar_p_flags, "--no-read-sparse");
+ push(@tar_p_flags, "-F", "ustar");
}
-
return @tar_p_flags;
}
--
2.47.3
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 14:59 Sami Imseih <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Sami Imseih @ 2026-04-03 14:59 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; +Cc: Nazir Bilal Yavuz <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
> On Sat, Apr 4, 2026 at 12:38 AM Nazir Bilal Yavuz <[email protected]> wrote:
> > I also tried Thomas'
> > "v2-0001-Improve-tar-portability-logic-from-ebba64c0" [3] but it
> > didn't fix the problem on OpenBSD [4].
>
> Apparently it wants -F ustar, like this. Funny that it passed on the
> build farm animals though. Oh, it looks like they changed the default
> fairly recently.
LGTM with just a correction of my earlier comment.
< + # ustar format supports UIDs only up to 2^21 (2097151). Override
---
> + # ustar format supports UIDs only up to 2^21 - 1 (2097151). Override
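The figure follows from the ustar header layout: the uid and gid fields are 8 bytes wide, conventionally holding 7 octal digits plus a terminator, so the largest representable value is 8^7 - 1 = 2^21 - 1 = 2097151. A small C sketch of that arithmetic (the helper names are made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

#define USTAR_ID_FIELD_WIDTH 8  /* bytes reserved for uid/gid in the header */

/* Largest value that fits in (field_width - 1) octal digits plus a terminator. */
uint64_t
ustar_max_octal_value(int field_width)
{
    /* Each octal digit carries 3 bits. */
    return (UINT64_C(1) << (3 * (field_width - 1))) - 1;
}

/* Hypothetical check: can this uid/gid be stored in a plain ustar header? */
bool
ustar_id_fits(uint64_t id)
{
    return id <= ustar_max_octal_value(USTAR_ID_FIELD_WIDTH);
}
```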
--
Sami
Attachments:
[application/octet-stream] v4-0001-Improve-tar-portability-logic-from-ebba64c0.patch (2.4K, 2-v4-0001-Improve-tar-portability-logic-from-ebba64c0.patch)
download | inline diff:
From fe53cc6d93114a6700dc00a57795e53a0fa0a4ee Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Fri, 3 Apr 2026 12:03:56 +1300
Subject: [PATCH v4 1/1] Improve tar portability logic from ebba64c0.
* GNU and BSD tar both understand --format=ustar.
* Windows lacks /dev/null, but perl knows its local name.
* ustar format doesn't like large UID/GID values, so set them to 0.
* OpenBSD has its own tar which understands -F ustar.
Backpatch-through: 18
Co-authored-by: Thomas Munro <[email protected]>
Co-authored-by: Sami Imseih <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Nazir Bilal Yavuz <[email protected]>
Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CAA5RZ0tt89MgNi4-0F4onH%2B-TFSsysFjMM-tBc6aXbuQv5xBXw%40mail.gmail.com
---
src/test/perl/PostgreSQL/Test/Utils.pm | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
index 120999f6ac9..050037f0d93 100644
--- a/src/test/perl/PostgreSQL/Test/Utils.pm
+++ b/src/test/perl/PostgreSQL/Test/Utils.pm
@@ -1328,21 +1328,24 @@ sub tar_portability_options
# GNU tar typically produces gnu-format archives, which we can read fine.
# But some platforms configure it to default to posix/pax format, and
- # apparently they enable --sparse too. Override that.
- if (system("$tar --format=ustar -c -O /dev/null >/dev/null 2>/dev/null")
+ # apparently they enable --sparse too. BSD tar (libarchive) does something
+ # similar.
+ #
+ # ustar format supports UIDs only up to 2^21 - 1 (2097151). Override
+ # owner/group to avoid failures on systems where the running user's UID/GID
+ # exceeds that limit.
+ my $devnull = File::Spec->devnull();
+ if (system("$tar --format=ustar --owner=0 --group=0 -c $devnull >$devnull 2>$devnull")
== 0)
{
- push(@tar_p_flags, "--format=ustar");
+ push(@tar_p_flags, "--format=ustar", "--owner=0", "--group=0");
}
- # bsdtar also archives sparse files by default, but it spells the switch
- # to disable that differently.
- if (system("$tar --no-read-sparse -c - /dev/null >/dev/null 2>/dev/null")
- == 0)
+ # OpenBSD's tar also defaults to pax, but spells the switch differently.
+ if (system("$tar -F ustar -c $devnull >$devnull 2>$devnull"))
{
- push(@tar_p_flags, "--no-read-sparse");
+ push(@tar_p_flags, "-F", "ustar");
}
-
return @tar_p_flags;
}
--
2.50.1
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-03 15:30 Nazir Bilal Yavuz <[email protected]>
parent: Sami Imseih <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Nazir Bilal Yavuz @ 2026-04-03 15:30 UTC (permalink / raw)
To: Sami Imseih <[email protected]>; +Cc: Thomas Munro <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
On Fri, 3 Apr 2026 at 17:59, Sami Imseih <[email protected]> wrote:
>
> Hi,
>
> > On Sat, Apr 4, 2026 at 12:38 AM Nazir Bilal Yavuz <[email protected]> wrote:
> > > I also tried Thomas'
> > > "v2-0001-Improve-tar-portability-logic-from-ebba64c0" [3] but it
> > > didn't fix the problem on OpenBSD [4].
> >
> > Apparently it wants -F ustar, like this. Funny that it passed on the
> > build farm animals though. Oh, it looks like they changed the default
> > fairly recently.
>
> LGTM with just a correction of my earlier comment.
Thanks for the patches! I confirm that both v3 and v4 fix the problem
for OpenBSD CI.
--
Regards,
Nazir Bilal Yavuz
Microsoft
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-04 01:07 Thomas Munro <[email protected]>
parent: Nazir Bilal Yavuz <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-04 01:07 UTC (permalink / raw)
To: Nazir Bilal Yavuz <[email protected]>; +Cc: Sami Imseih <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On Sat, Apr 4, 2026 at 4:30 AM Nazir Bilal Yavuz <[email protected]> wrote:
> On Fri, 3 Apr 2026 at 17:59, Sami Imseih <[email protected]> wrote:
> > > On Sat, Apr 4, 2026 at 12:38 AM Nazir Bilal Yavuz <[email protected]> wrote:
> > > > I also tried Thomas'
> > > > "v2-0001-Improve-tar-portability-logic-from-ebba64c0" [3] but it
> > > > didn't fix the problem on OpenBSD [4].
> > >
> > > Apparently it wants -F ustar, like this. Funny that it passed on the
> > > build farm animals though. Oh, it looks like they changed the default
> > > fairly recently.
> >
> > LGTM with just a correction of my earlier comment.
>
> Thanks for the patches! I confirm that both v3 and v4 fix the problem
> for OpenBSD CI.
Pushed, after testing on an OpenBSD VM and making some corrections:
* I'd screwed up the test command line in a way that worked by coincidence
** OpenBSD tar writes to a tape device by default, so use -f /dev/null
** I'd forgotten == 0, so the result was inverted, hiding that screwup
* -f /dev/null is a better form for all of them because the default
destination is a build option
* needed elsif instead of if, or BSD tar finished up getting both
--format=ustar and -F ustar
* ran perltidy, keeping only the hunks due to this patch
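The control flow after these corrections can be sketched as follows. This is C rather than the Perl of Utils.pm, and it is not the committed code: the function and parameter names are hypothetical, and the two integers stand for the exit statuses of the two probe commands (0 meaning the tar in use accepted the option).

```c
#include <stddef.h>

/*
 * Pick tar options from the exit statuses of two probe runs.
 * Returns the number of flags written to out (at most 3).
 */
size_t
choose_tar_flags(int gnu_style_status, int openbsd_style_status,
                 const char *out[3])
{
    size_t n = 0;

    if (gnu_style_status == 0)
    {
        /* The tar in use accepted the long options. */
        out[n++] = "--format=ustar";
        out[n++] = "--owner=0";
        out[n++] = "--group=0";
    }
    else if (openbsd_style_status == 0)
    {
        /* Only consulted when the first probe failed. */
        out[n++] = "-F";
        out[n++] = "ustar";
    }

    return n;
}
```

The explicit comparison with 0 and the else-if are the two points from the list: a bare status test inverts the result, and two independent ifs would hand both spellings to a tar that accepts either.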
CI passes and shows "212 subtests passed" for all five Unixen +
Windows/mingw, but only "156 subtests passed" for Windows/MSVC.
.cirrus.tasks.yml appears to use the same $TAR for both, namely the
system tar, so I think we can say that *this* thing is working, but
something else might be wrong with our scripting glue somewhere?
The other OSes on our list are AIX and Solaris. From a quick look at
their manuals, I don't foresee issues with pax or large UIDs.
Hopefully that covers everything!
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-10 07:57 Thomas Munro <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 1 reply; 85+ messages in thread
From: Thomas Munro @ 2026-04-10 07:57 UTC (permalink / raw)
To: Nazir Bilal Yavuz <[email protected]>; +Cc: Sami Imseih <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Andrew Dunstan <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
Nitpicking code review for commit b15c1513:
+read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
+ Size count, char *readBuff)
I thought we agreed to stop using Size for new code? size_t has been
around since C89.
+ pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
+ fname, privateInfo->archive_name,
+ (long long int) (count - nbytes),
+ (long long int) count);
Why cast to long long int? That's the sort of thing we used to have
to do for int64 (but no longer), but here it's size_t anyway. %zu has
been around since C99.
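For illustration, the same message can be formatted with size_t arguments and %zu, with no casts. This standalone sketch uses snprintf in place of pg_fatal; the function name and buffer handling are assumptions, not the eventual fix.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the "too short" message using %zu for size_t values.
 * Returns snprintf's result: the length the full message needs.
 */
int
format_short_segment_msg(char *buf, size_t buflen,
                         const char *fname, const char *archive_name,
                         size_t nread, size_t count)
{
    return snprintf(buf, buflen,
                    "WAL segment \"%s\" in archive \"%s\" is too short: "
                    "read %zu of %zu bytes",
                    fname, archive_name, nread, count);
}
```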
* Re: pg_waldump: support decoding of WAL inside tarfile
@ 2026-04-10 12:57 Andrew Dunstan <[email protected]>
parent: Thomas Munro <[email protected]>
0 siblings, 0 replies; 85+ messages in thread
From: Andrew Dunstan @ 2026-04-10 12:57 UTC (permalink / raw)
To: Thomas Munro <[email protected]>; Nazir Bilal Yavuz <[email protected]>; +Cc: Sami Imseih <[email protected]>; Tom Lane <[email protected]>; Tomas Vondra <[email protected]>; Andres Freund <[email protected]>; Michael Paquier <[email protected]>; Amul Sul <[email protected]>; Zsolt Parragi <[email protected]>; Robert Haas <[email protected]>; Chao Li <[email protected]>; Anthonin Bonnefoy <[email protected]>; Fujii Masao <[email protected]>; Jakub Wartak <[email protected]>; PostgreSQL Hackers <[email protected]>
On 2026-04-10 Fr 3:57 AM, Thomas Munro wrote:
> Nitpicking code review for commit b15c1513:
>
> +read_archive_wal_page(XLogDumpPrivate *privateInfo, XLogRecPtr targetPagePtr,
> + Size count, char *readBuff)
>
> I thought we agreed to stop using Size for new code? size_t has been
> around since C89.
Must have missed the memo :-(
>
> + pg_fatal("WAL segment \"%s\" in archive \"%s\" is too short: read %lld of %lld bytes",
> + fname, privateInfo->archive_name,
> + (long long int) (count - nbytes),
> + (long long int) count);
>
> Why cast to long long int? That's the sort of thing we used to have
> to do for int64 (but no longer), but here it's size_t anyway. %zu has
> been around since C99.
will fix. Thanks for looking.
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
end of thread
Thread overview: 85+ messages
2025-08-07 14:17 pg_waldump: support decoding of WAL inside tarfile Amul Sul <[email protected]>
2025-08-25 12:28 ` Amul Sul <[email protected]>
2025-08-26 11:52 ` Amul Sul <[email protected]>
2025-09-08 13:37 ` Jakub Wartak <[email protected]>
2025-09-12 10:55 ` Amul Sul <[email protected]>
2025-09-25 08:18 ` Amul Sul <[email protected]>
2025-09-12 18:28 ` Robert Haas <[email protected]>
2025-09-12 20:27 ` Robert Haas <[email protected]>
2025-09-25 08:24 ` Amul Sul <[email protected]>
2025-09-29 15:15 ` Robert Haas <[email protected]>
2025-09-29 16:17 ` Amul Sul <[email protected]>
2025-10-10 18:01 ` Robert Haas <[email protected]>
2025-10-16 11:48 ` Amul Sul <[email protected]>
2025-10-20 14:34 ` Robert Haas <[email protected]>
2025-11-06 09:03 ` Amul Sul <[email protected]>
2025-11-17 04:50 ` Amul Sul <[email protected]>
2025-11-19 08:20 ` Jakub Wartak <[email protected]>
2025-11-21 11:44 ` Amul Sul <[email protected]>
2025-11-25 06:37 ` Amul Sul <[email protected]>
2025-11-25 08:50 ` Chao Li <[email protected]>
2025-11-26 06:02 ` Amul Sul <[email protected]>
2025-11-26 07:23 ` Chao Li <[email protected]>
2026-02-10 09:36 ` Amul Sul <[email protected]>
2026-02-18 06:58 ` Amul Sul <[email protected]>
2026-03-02 13:00 ` Amul Sul <[email protected]>
2026-03-04 00:37 ` Andrew Dunstan <[email protected]>
2026-03-04 12:52 ` Amul Sul <[email protected]>
2026-03-04 21:50 ` Andrew Dunstan <[email protected]>
2026-03-09 12:26 ` Amul Sul <[email protected]>
2026-03-18 11:45 ` Amul Sul <[email protected]>
2026-03-18 15:16 ` Amul Sul <[email protected]>
2026-03-19 10:20 ` Amul Sul <[email protected]>
2026-03-19 20:48 ` Zsolt Parragi <[email protected]>
2026-03-20 11:31 ` Amul Sul <[email protected]>
2026-03-20 13:26 ` Amul Sul <[email protected]>
2026-03-20 19:33 ` Andrew Dunstan <[email protected]>
2026-03-21 06:19 ` Amul Sul <[email protected]>
2026-03-21 06:23 ` Michael Paquier <[email protected]>
2026-03-21 15:35 ` Amul Sul <[email protected]>
2026-03-21 17:26 ` Amul Sul <[email protected]>
2026-03-22 11:24 ` Andrew Dunstan <[email protected]>
2026-03-22 21:19 ` Andrew Dunstan <[email protected]>
2026-03-24 03:11 ` Michael Paquier <[email protected]>
2026-03-25 17:28 ` Andres Freund <[email protected]>
2026-03-28 21:36 ` Thomas Munro <[email protected]>
2026-03-28 22:08 ` Thomas Munro <[email protected]>
2026-03-28 22:57 ` Tomas Vondra <[email protected]>
2026-03-29 13:33 ` Tomas Vondra <[email protected]>
2026-03-29 22:11 ` Thomas Munro <[email protected]>
2026-03-29 22:20 ` Thomas Munro <[email protected]>
2026-04-01 02:05 ` Thomas Munro <[email protected]>
2026-04-01 10:39 ` Andrew Dunstan <[email protected]>
2026-04-01 13:26 ` Andres Freund <[email protected]>
2026-04-01 14:19 ` Andrew Dunstan <[email protected]>
2026-04-01 18:25 ` Tom Lane <[email protected]>
2026-04-02 00:16 ` Thomas Munro <[email protected]>
2026-04-02 11:36 ` Andrew Dunstan <[email protected]>
2026-04-02 01:22 ` Tom Lane <[email protected]>
2026-04-02 02:14 ` Thomas Munro <[email protected]>
2026-04-02 16:29 ` Tom Lane <[email protected]>
2026-04-02 17:13 ` Tomas Vondra <[email protected]>
2026-04-02 17:43 ` Tom Lane <[email protected]>
2026-04-02 18:08 ` Tomas Vondra <[email protected]>
2026-04-02 18:47 ` Tom Lane <[email protected]>
2026-04-02 19:15 ` Tom Lane <[email protected]>
2026-04-02 22:41 ` Thomas Munro <[email protected]>
2026-04-02 22:50 ` Tom Lane <[email protected]>
2026-04-02 23:49 ` Thomas Munro <[email protected]>
2026-04-03 00:11 ` Tom Lane <[email protected]>
2026-04-03 00:47 ` Thomas Munro <[email protected]>
2026-04-03 00:59 ` Thomas Munro <[email protected]>
2026-04-03 01:11 ` Tom Lane <[email protected]>
2026-04-03 11:37 ` Nazir Bilal Yavuz <[email protected]>
2026-04-03 14:37 ` Thomas Munro <[email protected]>
2026-04-03 14:59 ` Sami Imseih <[email protected]>
2026-04-03 15:30 ` Nazir Bilal Yavuz <[email protected]>
2026-04-04 01:07 ` Thomas Munro <[email protected]>
2026-04-10 07:57 ` Thomas Munro <[email protected]>
2026-04-10 12:57 ` Andrew Dunstan <[email protected]>
2026-04-02 23:43 ` Sami Imseih <[email protected]>
2026-04-03 00:07 ` Thomas Munro <[email protected]>
2026-03-28 23:15 ` Thomas Munro <[email protected]>
2025-11-21 12:16 ` Amul Sul <[email protected]>
2026-03-29 01:59 Re: pg_waldump: support decoding of WAL inside tarfile Andrew Dunstan <[email protected]>
2026-03-29 02:01 ` Thomas Munro <[email protected]>