public inbox for [email protected]  
help / color / mirror / Atom feed
From: Thomas Munro <[email protected]>
To: Tom Lane <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Amul Sul <[email protected]>
Cc: Zsolt Parragi <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Anthonin Bonnefoy <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: Thu, 2 Apr 2026 13:16:17 +1300
Message-ID: <CA+hUKGJv2hpjMDwG31JSR7TfFUxNPPtQKMm6mHVVRMAT0XppBQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<CAD5tBcLVWKnph3iB-VPuPKR0dCckOJRFZW2-4H7HTTmhw8-vOg@mail.gmail.com>
	<[email protected]>
	<[email protected]!!.pa.us>
	<CAD5tBcLsYDz+Nzx8MryjxiKaN3fGKd4ZgXuN1Jn=CYxw9dh+AA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<x2tknjejjouleunkqrvpnwn2tuulunybinycidefm3wmnsyhht@pw5uo3wrqx43>
	<CA+hUKGL2dppjO4o28ZY7n_LTWviKLAi-7KZ=tx5w2HGevCEYPA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CA+hUKGJyvdyWMC-RW1njqevD-q_gTbFq+DyDiFpUJVaG+DY20w@mail.gmail.com>
	<[email protected]>
	<CA+hUKG+Pqz5=YQG_=8ho0YsTfn2HWOsJQWqS4j0q8QQWweJP9w@mail.gmail.com>
	<[email protected]>

On Thu, Apr 2, 2026 at 7:25 AM Tom Lane <[email protected]> wrote:
> Also, if we are admitting the possibility that what we are reading
> was made by a platform-supplied tar and not our own code, I think
> it verges on lunacy to behave as though unsupported typeflags are
> regular files.

Yeah, if this is the first time we parse files we didn't make then
that makes total sense.  I was a bit unsure of that question when I
suggested we reject pax only after we've failed to find a file, in
case there are scenarios that work today with harmless ignorable pax
headers that don't change the file name.

> So I think we need something more or less like the attached.

LGTM.  Tested with both tars here.  I updated that little test patch
for this.  Not sure if you think it's worth a test though, now that
it's so simple.

@Andrew: I tried using File::Spec->devnull() this time.  Are you able
to check if this works OK on Windows, applied on top of Tom's patch?
AFAIK should be able to run this new test and pass, not skip it.  But
it could be that the shell invocation needs tweaking.  It's hard to
tell from CI.  (Huh, apparently Windows ships a copy of BSD tar as
C:\Windows\System32\tar.exe these days.)


Attachments:

  [text/x-patch] 0001-Test-rejection-of-pax-extended-tar-files.patch (3.0K, 2-0001-Test-rejection-of-pax-extended-tar-files.patch)
  download | inline diff:
From 57b9d1a58a9eb30feec72b0ccce6bdfd5e1ef92e Mon Sep 17 00:00:00 2001
From: Thomas Munro <[email protected]>
Date: Mon, 30 Mar 2026 18:20:09 +1300
Subject: [PATCH] Test rejection of pax extended tar files.

Also change 852de579's shell invocation to use NUL on Windows instead of
/dev/null, per feedback from Andrew Dunstan.  That command wasn't
expected to succeed on Windows, but the similar usage added here might.
---
 src/bin/pg_waldump/t/001_basic.pl | 42 +++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 7 deletions(-)

diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index ce1f6aa30c0..3b99a8f6d4f 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -5,6 +5,7 @@ use strict;
 use warnings FATAL => 'all';
 use Cwd;
 use File::Copy;
+use File::Spec;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
@@ -13,12 +14,14 @@ use List::Util qw(shuffle);
 my $tar = $ENV{TAR};
 my @tar_c_flags;
 
-# By default, bsdtar archives sparse files in GNU tar's --format=posix --sparse
-# format, so pg_waldump can't find files that ZFS has decided to store with
-# holes.  Turn that off.
-if (system("$tar --no-read-sparse -c - /dev/null > /dev/null") == 0)
+my $devnull = File::Spec->devnull();
+
+# By default, bsdtar archives sparse files in GNU tar's --format=pax --sparse
+# format, so pg_waldump rejects WAL that ZFS has decided to store with holes.
+# Turn that off.
+if (system("$tar --no-read-sparse -c $devnull > $devnull") == 0)
 {
-  push(@tar_c_flags, "--no-read-sparse");
+	push(@tar_c_flags, "--no-read-sparse");
 }
 
 program_help_ok('pg_waldump');
@@ -339,7 +342,7 @@ sub test_pg_waldump
 # Create a tar archive, shuffle the file order
 sub generate_archive
 {
-	my ($archive, $directory, $compression_flags) = @_;
+	my ($archive, $directory, $compression_flags, @extra_flags) = @_;
 
 	my @files;
 	opendir my $dh, $directory or die "opendir: $!";
@@ -355,7 +358,7 @@ sub generate_archive
 	# move into the WAL directory before archiving files
 	my $cwd = getcwd;
 	chdir($directory) || die "chdir: $!";
-	command_ok([$tar, @tar_c_flags, $compression_flags, $archive, @files]);
+	command_ok([$tar, @extra_flags, @tar_c_flags, $compression_flags, $archive, @files]);
 	chdir($cwd) || die "chdir: $!";
 }
 
@@ -477,4 +480,29 @@ for my $scenario (@scenarios)
 	}
 }
 
+SKIP:
+{
+	skip "tar command is not available", 1
+		if !defined $tar;
+
+	skip "tar command doesn't understand --format=pax", 1
+		if system("$tar --format=pax -c " .
+				  $node->data_dir . "/pg_wal/* > $devnull") != 0;
+
+	generate_archive($tmp_dir . '/pg_wal_pax.tar',
+					 $node->data_dir . '/pg_wal',
+					 '-cf',
+					 ("--format=pax"));
+
+	command_fails_like(
+		[
+			'pg_waldump',
+			'--path' => $tmp_dir . '/pg_wal_pax.tar',
+			'--start' => $start_lsn,
+			'--end' => $end_lsn,
+		],
+		qr/error: pax extensions to tar format are not supported/,
+		'fails if pax extended header is detected');
+}
+
 done_testing();
-- 
2.52.0



view thread (87+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: pg_waldump: support decoding of WAL inside tarfile
  In-Reply-To: <CA+hUKGJv2hpjMDwG31JSR7TfFUxNPPtQKMm6mHVVRMAT0XppBQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox