From: Tom Lane
To: Thomas Munro
cc: Tomas Vondra, Andres Freund, Michael Paquier, Andrew Dunstan,
    Amul Sul, Zsolt Parragi, Robert Haas, Chao Li, Anthonin Bonnefoy,
    Fujii Masao, Jakub Wartak, PostgreSQL Hackers
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Comments: In-reply-to Thomas Munro message dated "Fri, 03 Apr 2026 12:49:21 +1300"
Date: Thu, 02 Apr 2026 20:11:37 -0400
Message-ID: <3686764.1775175097@sss.pgh.pa.us>

Thomas Munro writes:
> On Fri, Apr 3, 2026 at 11:50 AM Tom Lane wrote:
>>> How about using --format=ustar, instead of that sparse control stuff?

>> I did it that way for GNU tar, but did not research whether bsdtar
>> will take that option. Feel free to hack on ebba64c08 some more.

> This seems to work for both:

> $ tar --format=ustar -c /dev/null > /dev/null
> tar: Removing leading '/' from member names
> $ gtar --format=ustar -c /dev/null > /dev/null
> gtar: Removing leading `/' from member names

Cool. LGTM.

> I think a Windows system could be using either. BSD tar comes
> pre-installed by Microsoft and people often install GNU tools. So I
> think we should use File::Spec->devnull() instead of /dev/null, and
> Andrew showed that working.

Agreed.

> Longer term I think we need to tolerate but ignore pax headers. If I
> understand the spirit of this long evolution, pax archives are
> intended to be acceptable to pre-pax implementations, which implies
> that they can't really change the meaning of the bits of the file
> contents.

I don't buy that. For example, POSIX specifies these allowed fields
in an extended header:

linkpath
    The pathname of a link being created to another file, of any
    type, previously archived. This record shall override the
    linkname field in the following ustar header block(s).
path
    The pathname of the following file(s). This record shall
    override the name and prefix fields in the following header
    block(s).
size
    The size of the file in octets, expressed as a decimal number
    using digits from the ISO/IEC 646:1991 standard. This record
    shall override the size field in the following header block(s).

GNU tar seems to try hard to ensure that a non-pax-aware tar can
extract *something* from a tar file, but it's not guaranteed that
the something contains the right data or is located at the right
pathname. It looks like the goal is to allow post-processing to
pick up the pieces.

In any case, this is all completely moot if we don't write code
to de-sparse a sparse entry: we will not be able to validate WAL
data if the WAL file is missing some pages. So I see little
point in having code that tolerates pax headers if it doesn't
also do that.

			regards, tom lane
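
As an illustration of the override semantics quoted from POSIX above,
here is a minimal Python sketch (not code from this thread or from
pg_waldump). It parses pax extended-header records, which have the
form "<length> <keyword>=<value>\n", and applies the linkpath, path,
and size overrides to a ustar header represented as a plain dict.
The function names, the dict layout, and the sample values are all
hypothetical, and the sketch assumes the caller passes exactly the
extended header's data bytes, without block padding.

```python
def make_record(key: str, value: str) -> bytes:
    """Build one pax record (hypothetical helper, used only for the demo).

    The leading decimal length counts the whole record, including its
    own digits, the space, and the trailing newline.
    """
    body = f" {key}={value}\n".encode("utf-8")
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_pax_records(data: bytes) -> dict:
    """Parse concatenated pax records into a keyword -> value dict."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)       # end of the length digits
        length = int(data[pos:space])       # length of the whole record
        record = data[space + 1 : pos + length]
        if not record.endswith(b"\n"):
            raise ValueError("malformed pax record")
        key, _, value = record[:-1].partition(b"=")
        records[key.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return records


def effective_header(ustar: dict, pax: dict) -> dict:
    """Apply the three overrides quoted above to a ustar header dict."""
    merged = dict(ustar)
    if "path" in pax:
        # path overrides both the name and prefix fields
        merged["name"] = pax["path"]
        merged["prefix"] = ""
    if "linkpath" in pax:
        merged["linkname"] = pax["linkpath"]
    if "size" in pax:
        merged["size"] = int(pax["size"])
    return merged


if __name__ == "__main__":
    # Hypothetical sample: the ustar fields disagree with the pax records.
    data = (make_record("path", "pg_wal/000000010000000000000001")
            + make_record("size", "16777216"))
    ustar = {"name": "placeholder", "prefix": "", "linkname": "", "size": 0}
    print(effective_header(ustar, parse_pax_records(data)))
```

A reader that skips the extended header would keep the values in the
ustar dict unchanged, which is the situation described above: it
extracts something, but possibly with the wrong size or pathname.
Note that this sketch covers only the three overrides; it does not
attempt to de-sparse a sparse entry.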