public inbox for [email protected]  
From: Tom Lane <[email protected]>
To: Thomas Munro <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Michael Paquier <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Amul Sul <[email protected]>
Cc: Zsolt Parragi <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Anthonin Bonnefoy <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: Thu, 02 Apr 2026 20:11:37 -0400
Message-ID: <[email protected]>
In-Reply-To: <CA+hUKGKfti_FMFuduXEZs96W5Boce9gSLZ5Ei158dFiuLuWLgA@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<x2tknjejjouleunkqrvpnwn2tuulunybinycidefm3wmnsyhht@pw5uo3wrqx43>
	<CA+hUKGL2dppjO4o28ZY7n_LTWviKLAi-7KZ=tx5w2HGevCEYPA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CA+hUKGJyvdyWMC-RW1njqevD-q_gTbFq+DyDiFpUJVaG+DY20w@mail.gmail.com>
	<[email protected]>
	<CA+hUKG+Pqz5=YQG_=8ho0YsTfn2HWOsJQWqS4j0q8QQWweJP9w@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CA+hUKG+-pn14s_tjEBO6YKHmc=uRhGVn=w2oM91KKnEUc7pH0Q@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CA+hUKG+A-=BKP69ze5oc4O5RSBZtKk7dGoq! [email protected]>
	<[email protected]>
	<CA+hUKGKfti_FMFuduXEZs96W5Boce9gSLZ5Ei158dFiuLuWLgA@mail.gmail.com>

Thomas Munro <[email protected]> writes:
> On Fri, Apr 3, 2026 at 11:50 AM Tom Lane <[email protected]> wrote:
>>> How about using --format=ustar, instead of that sparse control stuff?

>> I did it that way for GNU tar, but did not research whether bsdtar
>> will take that option.  Feel free to hack on ebba64c08 some more.

> This seems to work for both:

> $ tar --format=ustar -c /dev/null  > /dev/null
> tar: Removing leading '/' from member names
> $ gtar --format=ustar -c /dev/null  > /dev/null
> gtar: Removing leading `/' from member names

Cool.  LGTM.

> I think a Windows system could be using either.  BSD tar comes
> pre-installed by Microsoft and people often install GNU tools.  So I
> think we should use File::Spec->devnull() instead of /dev/null, and
> Andrew showed that working.

Agreed.
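
For illustration only, here is a rough sketch of the same portable probe in Python rather than the Perl used by the test suite. The helper names ustar_probe_command and tar_supports_ustar are hypothetical, and os.devnull stands in for File::Spec->devnull():

```python
import os
import shutil
import subprocess


def ustar_probe_command(tar="tar"):
    """Build the argv for a probe like `tar --format=ustar -c /dev/null`."""
    # os.devnull is "/dev/null" on POSIX and "nul" on Windows, so the null
    # device is never hard-coded (the role File::Spec->devnull() has in Perl).
    return [tar, "--format=ustar", "-c", os.devnull]


def tar_supports_ustar(tar="tar"):
    """Return True if `tar` is on PATH and accepts --format=ustar."""
    if shutil.which(tar) is None:
        return False
    # Discard the archive and any "Removing leading '/'" chatter; only the
    # exit status matters for this probe.
    result = subprocess.run(
        ustar_probe_command(tar),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Whether a zero exit status is a sufficient check on every platform is an assumption here; the quoted commands above only show that both tar flavors accept the option.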

> Longer term I think we need to tolerate but ignore pax headers.  If I
> understand the spirit of this long evolution, pax archives are
> intended to be acceptable to pre-pax implementations, which implies
> that they can't really change the meaning of the bits of the file
> contents.

I don't buy that.  For example, POSIX specifies these allowed
fields in an extended header:

    linkpath
        The pathname of a link being created to another file, of any
        type, previously archived. This record shall override the
        linkname field in the following ustar header block(s).

    path
        The pathname of the following file(s). This record shall
        override the name and prefix fields in the following header
        block(s).

    size
        The size of the file in octets, expressed as a decimal number
        using digits from the ISO/IEC 646:1991 standard. This record
        shall override the size field in the following header
        block(s).

GNU tar seems to try hard to ensure that a non-pax-aware tar can
extract *something* from a tar file, but it's not guaranteed that the
something contains the right data or is located at the right pathname.
It looks like the goal is to allow post-processing to pick up the
pieces.
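
To make the override semantics concrete, here is a minimal sketch in Python (not code from any patch). It assumes the pax record layout "<length> <keyword>=<value>\n", where the length counts the whole record in bytes, and it models a ustar header as a plain dict, which is a hypothetical representation chosen for the example:

```python
def parse_pax_records(data: bytes) -> dict:
    """Parse pax extended header records: b"<len> <keyword>=<value>\n"."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])  # length of the whole record, in bytes
        record = data[space + 1 : pos + length]
        if length <= 0 or not record.endswith(b"\n"):
            raise ValueError("malformed pax record")
        key, _, value = record[:-1].partition(b"=")
        records[key.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records


def effective_member(ustar: dict, pax: dict) -> dict:
    """Apply the path, linkpath and size overrides to a ustar header dict."""
    member = dict(ustar)
    if "path" in pax:
        # "path" overrides both the name and prefix fields.
        member["name"] = pax["path"]
        member["prefix"] = ""
    if "linkpath" in pax:
        member["linkname"] = pax["linkpath"]
    if "size" in pax:
        member["size"] = int(pax["size"])
    return member
```

A reader that skips the extended header and trusts only the ustar fields would use whatever name and size the ustar header carries, which is the sense in which the extracted "something" can end up at the wrong pathname or with the wrong length.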

In any case, this is all completely moot if we don't write code to
de-sparse a sparse entry: we will not be able to validate WAL data
if the WAL file is missing some pages.  So I see little point in
having code that tolerates pax headers if it doesn't also do that.
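
As a hedged sketch of what a reader without de-sparsing support could at least do, the Python below flags entries that look sparse so they can be rejected rather than silently misread. It assumes GNU tar's conventions as I understand them: pax keywords beginning with "GNU.sparse." for the pax-based sparse formats, and typeflag 'S' for the old GNU sparse format. The function name is hypothetical:

```python
GNU_SPARSE_PREFIX = "GNU.sparse."  # assumed prefix of GNU tar's sparse keywords
OLD_GNU_SPARSE_TYPEFLAG = "S"      # assumed typeflag of old-style sparse members


def looks_sparse(typeflag: str, pax_keywords) -> bool:
    """Return True if a member appears to be stored in a sparse format."""
    if typeflag == OLD_GNU_SPARSE_TYPEFLAG:
        return True
    # Any GNU.sparse.* keyword means the stored data is not the plain file
    # contents, so it cannot be read as contiguous WAL pages.
    return any(key.startswith(GNU_SPARSE_PREFIX) for key in pax_keywords)
```

This only detects the situation; it does not reconstruct the missing pages, so it is no substitute for actual de-sparsing code.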

			regards, tom lane




