MIME-Version: 1.0
References: <72973471-8946-40c7-8b2d-8f95540d90e2@dunslane.net> <401bf08a-c8f1-48e2-9a30-78deaa9fa7c5@dunslane.net> <9a2be101-42c9-40dd-9860-aa12f06bf0e0@dunslane.net> <2178517.1774064942@sss.pgh.pa.us>
In-Reply-To: <2178517.1774064942@sss.pgh.pa.us>
From: Amul Sul
Date: Sat, 21 Mar 2026 11:49:28 +0530
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
To: Tom Lane
Cc: Andrew Dunstan, Zsolt Parragi, Robert Haas, Chao Li, Jakub Wartak, PostgreSQL Hackers
Content-Type: multipart/mixed; boundary="000000000000f0ac82064d82ca25"

--000000000000f0ac82064d82ca25
Content-Type: text/plain; charset="UTF-8"

On Sat, Mar 21, 2026 at 9:19 AM Tom Lane wrote:
>
> Andrew Dunstan writes:
> > Thanks, committed with very minor tweaks.
>
> Buildfarm members batta and hachi don't like this very much.
> They fail the pg_verifybackup tests like so:
>
> # Running: pg_verifybackup --exit-on-error /home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup
> pg_waldump: error: could not find WAL in archive "base.tar.zst"
> pg_verifybackup: error: WAL parsing failed for timeline 1
>
> Only the zstd-compression case fails.
> I've spent several hours trying
> to reproduce this, without any luck, although I can get a similar
> failure in only the gzip case if I build with --with-wal-blocksize=64.
> I do not have an explanation for the seeming cross-platform
> difference.  However after adding a lot of debug tracing, I believe
> I see the bug, or at least a related bug.  This bit in
> archive_waldump.c's init_archive_reader is where the error comes from:
>
>     /*
>      * Read until we have at least one full WAL page (XLOG_BLCKSZ bytes) from
>      * the first WAL segment in the archive so we can extract the WAL segment
>      * size from the long page header.
>      */
>     while (entry == NULL || entry->buf->len < XLOG_BLCKSZ)
>     {
>         if (read_archive_file(privateInfo, XLOG_BLCKSZ) == 0)
>             pg_fatal("could not find WAL in archive \"%s\"",
>                      privateInfo->archive_name);
>
>         entry = privateInfo->cur_file;
>     }
>
> That looks plausible but is in fact utterly broken when there's not a
> lot of WAL data in the archive, as there is not in this test case.
> There are at least two problems:
>

Thanks for the detailed debugging. I noticed the failure this morning
and had started investigating the issue, but in the meantime, I got
your helpful reply, which saved me a bunch of time and energy.

> 1. read_archive_file reads some data from the source WAL archive and
> shoves it into the astreamer decompression pipeline.  However, once it
> runs out of source data, it just returns zero and we fail immediately.
> This does not account for the possibility --- nay, certainty --- that
> there is data queued inside the decompression pipeline.  So this
> doesn't work if the data we need has been compressed into less than
> XLOG_BLCKSZ worth of compressed data.  (I suppose that the seeming
> cross-platform differences have to do with the effectiveness of the
> compression algorithm, but I don't really understand why it'd not be
> the same everywhere.)  We need to do astreamer_finalize once we run
> out of source data.
> I think the cleanest place to handle that would
> be inside read_archive_file, but its return convention will need some
> rework if we want to put it there (because rc == 0 shouldn't cause an
> immediate failure if we were able to finalize some more data).  As an
> ugly experiment I put an astreamer_finalize call into the rc == 0 path
> of the above loop, but it still didn't work, because:
>
> 2. If the decompression pipeline reaches the end of the WAL file that
> we want, the ASTREAMER_MEMBER_TRAILER case in
> astreamer_waldump_content instantly resets privateInfo->cur_file to
> NULL.  Then the loop in init_archive_reader cannot exit successfully,
> and it will just read till the end of the archive and fail.
>
> I see that of the three callers of read_archive_file, only
> get_archive_wal_entry is aware of this possibility; but
> init_archive_reader certainly needs to deal with it and I bet
> read_archive_wal_page does too.  Moreover, get_archive_wal_entry's
> solution looks to me like a fragile kluge that probably doesn't work
> reliably either, the reason being that privateInfo->cur_file can
> change multiple times during a single call to read_archive_file,
> if the WAL data has been compressed sufficiently.  That whole API
> seems to need some rethinking, not to mention better documentation
> than the zero it has now.
>

I agree that init_archive_reader needs that handling, but
read_archive_wal_page doesn't need any fix. Since it only deals with
the current entry and already holds a reference to it, there is no
need to fetch it from the hash table again. Unlike
get_archive_wal_entry, init_archive_reader has to scan the hash table
because it does not know in advance which WAL file name it is looking
for. Please have a look at the attached patch, which tries to fix
that.

> While I'm bitching: this error message "could not find WAL in archive
> \"%s\"" seems to me to be completely misleading and off-point.
>

I tried to improve that in the attached version.
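For readers without the patch at hand, the fallback idea described
above (scan the already-buffered entries once the archive input is
exhausted, and tell "truncated WAL" apart from "no WAL at all") can be
sketched as a small standalone toy in C. This is only an illustration
under stated assumptions, not the attached patch: ToyWalEntry,
toy_find_usable_entry and TOY_LONG_HDR_SIZE are hypothetical names, a
plain array stands in for the hash table, and the constant merely
stands in for sizeof(XLogLongPageHeader).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for sizeof(XLogLongPageHeader); illustrative value. */
#define TOY_LONG_HDR_SIZE 40

/* Hypothetical, simplified model of one WAL file buffered from the archive. */
typedef struct ToyWalEntry
{
	const char *fname;		/* member name inside the archive */
	int			read_len;	/* bytes of this member buffered so far */
} ToyWalEntry;

/*
 * Return the first entry that has buffered at least min_len bytes, or NULL
 * if no entry qualifies.
 *
 * If too_short is not NULL, it receives the first entry seen that holds
 * fewer than min_len bytes (or NULL if there is none).  It is only
 * meaningful when the function returns NULL: in that case a non-NULL
 * *too_short means "WAL exists but is truncated", and a NULL *too_short
 * means "no WAL entries at all".
 */
static ToyWalEntry *
toy_find_usable_entry(ToyWalEntry *entries, size_t n, int min_len,
					  ToyWalEntry **too_short)
{
	assert(min_len >= 0);

	if (too_short != NULL)
		*too_short = NULL;

	for (size_t i = 0; i < n; i++)
	{
		if (entries[i].read_len >= min_len)
			return &entries[i];

		/* Remember one short entry so the caller can report it by name. */
		if (too_short != NULL && *too_short == NULL)
			*too_short = &entries[i];
	}

	return NULL;
}
```

In this sketch the too-short candidate is tracked in a separate
variable, a choice made here so that the two error cases stay
distinguishable after the scan finishes; a caller would report the
named file when the result is NULL and *too_short is set, and fall
back to the generic "could not find WAL" message otherwise.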
regards, Amul --000000000000f0ac82064d82ca25 Content-Type: application/x-patch; name="0001-pg_waldump-buildfarm-fix.patch" Content-Disposition: attachment; filename="0001-pg_waldump-buildfarm-fix.patch" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_mmzx0hd00 RnJvbSBiZGUzZmI0ZTMxMjVlZWQ3NDBiNWQ5NDlhOTkwYjRlMDZkMDE0OTlhIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBBbXVsIFN1bCA8c3VsYW11bEBnbWFpbC5jb20+CkRhdGU6IFNh dCwgMjEgTWFyIDIwMjYgMTE6MjI6NTAgKzA1MzAKU3ViamVjdDogW1BBVENIXSBwZ193YWxkdW1w OiBIYW5kbGUgYXJjaGl2ZSBleGhhdXN0aW9uIGluCiBpbml0X2FyY2hpdmVfcmVhZGVyKCkuCgpX aGVuIHJlYWRfYXJjaGl2ZV9maWxlKCkgcmV0dXJucyAwLCB0aGUgYXJjaGl2ZSBtYXkgaGF2ZSBh bHJlYWR5CmJ1ZmZlcmVkIGEgY29tcGxldGUgV0FMIGZpbGUgaW50byB0aGUgaGFzaCB0YWJsZSBi ZWZvcmUgZXhoYXVzdGluZwp0aGUgaW5wdXQuICBJbnN0ZWFkIG9mIGltbWVkaWF0ZWx5IHJlcG9y dGluZyBhbiBlcnJvciwgc2VhcmNoIHRoZQpoYXNoIHRhYmxlIGZvciBhbiBlbnRyeSBjb250YWlu aW5nIGF0IGxlYXN0IHNpemVvZihYTG9nTG9uZ1BhZ2VIZWFkZXIpCmJ5dGVzLiAgUmVwb3J0IGEg c3BlY2lmaWMgZXJyb3IgaWYgYSBXQUwgZW50cnkgZXhpc3RzIGJ1dCBpcyB0b28Kc2hvcnQgKHRy dW5jYXRlZC9jb3JydXB0KSwgb3IgYSBnZW5lcmljIGVycm9yIGlmIG5vIFdBTCB3YXMgZm91bmQK YXQgYWxsLgoKQWxzbyB0aWdodGVuIHRoZSBsb29wIGNvbmRpdGlvbiB0byBjaGVjayBmb3Igc2l6 ZW9mKFhMb2dMb25nUGFnZUhlYWRlcikKcmF0aGVyIHRoYW4gWExPR19CTENLU1osIHNpbmNlIG9u bHkgdGhlIGxvbmcgcGFnZSBoZWFkZXIgaXMgbmVlZGVkCmF0IHRoaXMgc3RhZ2UuCi0tLQogc3Jj L2Jpbi9wZ193YWxkdW1wL2FyY2hpdmVfd2FsZHVtcC5jIHwgNTEgKysrKysrKysrKysrKysrKysr KysrKysrKy0tLQogMSBmaWxlIGNoYW5nZWQsIDQ3IGluc2VydGlvbnMoKyksIDQgZGVsZXRpb25z KC0pCgpkaWZmIC0tZ2l0IGEvc3JjL2Jpbi9wZ193YWxkdW1wL2FyY2hpdmVfd2FsZHVtcC5jIGIv c3JjL2Jpbi9wZ193YWxkdW1wL2FyY2hpdmVfd2FsZHVtcC5jCmluZGV4IGIwNzhjMmQ2OTYwLi41 YmQxZmFmM2Q5NSAxMDA2NDQKLS0tIGEvc3JjL2Jpbi9wZ193YWxkdW1wL2FyY2hpdmVfd2FsZHVt cC5jCisrKyBiL3NyYy9iaW4vcGdfd2FsZHVtcC9hcmNoaXZlX3dhbGR1bXAuYwpAQCAtMTc2LDEz ICsxNzYsNTYgQEAgaW5pdF9hcmNoaXZlX3JlYWRlcihYTG9nRHVtcFByaXZhdGUgKnByaXZhdGVJ bmZvLAogCSAqIHRoZSBmaXJzdCBXQUwgc2VnbWVudCBpbiB0aGUgYXJjaGl2ZSBzbyB3ZSBjYW4g 
ZXh0cmFjdCB0aGUgV0FMIHNlZ21lbnQKIAkgKiBzaXplIGZyb20gdGhlIGxvbmcgcGFnZSBoZWFk ZXIuCiAJICovCi0Jd2hpbGUgKGVudHJ5ID09IE5VTEwgfHwgZW50cnktPmJ1Zi0+bGVuIDwgWExP R19CTENLU1opCisJd2hpbGUgKGVudHJ5ID09IE5VTEwgfHwgZW50cnktPnJlYWRfbGVuIDwgc2l6 ZW9mKFhMb2dMb25nUGFnZUhlYWRlcikpCiAJewogCQlpZiAocmVhZF9hcmNoaXZlX2ZpbGUocHJp dmF0ZUluZm8sIFhMT0dfQkxDS1NaKSA9PSAwKQotCQkJcGdfZmF0YWwoImNvdWxkIG5vdCBmaW5k IFdBTCBpbiBhcmNoaXZlIFwiJXNcIiIsCi0JCQkJCSBwcml2YXRlSW5mby0+YXJjaGl2ZV9uYW1l KTsKKwkJeworCQkJQXJjaGl2ZWRXQUxfaXRlcmF0b3IgaXRlcjsKKwkJCUFyY2hpdmVkV0FMRmls ZSAqZSA9IE5VTEw7CiAKLQkJZW50cnkgPSBwcml2YXRlSW5mby0+Y3VyX2ZpbGU7CisJCQllbnRy eSA9IE5VTEw7CisKKwkJCS8qCisJCQkgKiByZWFkX2FyY2hpdmVfZmlsZSgpIHJldHVybmVkIDAs IG1lYW5pbmcgdGhlIGFyY2hpdmUgaXMKKwkJCSAqIGV4aGF1c3RlZC4gIEhvd2V2ZXIsIGEgc3Vm ZmljaWVudGx5IGNvbXByZXNzZWQgYXJjaGl2ZSBtYXkgaGF2ZQorCQkJICogYWxyZWFkeSByZWFk IGEgY29tcGxldGUgV0FMIGZpbGUgYW5kIGluc2VydGVkIGl0IGludG8gdGhlIGhhc2gKKwkJCSAq IHRhYmxlIGJlZm9yZSByZXR1cm5pbmcuICBTZWFyY2ggdGhlIGhhc2ggdGFibGUgZm9yIGFueSBl bnRyeQorCQkJICogdGhhdCBhbHJlYWR5IGhhcyBlbm91Z2ggYnVmZmVyZWQgZGF0YSB0byBjb250 YWluIHRoZSBsb25nIHBhZ2UKKwkJCSAqIGhlYWRlcjsgaWYgbm9uZSBpcyBmb3VuZCwgdGhlIGFy Y2hpdmUgY29udGFpbnMgbm8gdXNhYmxlIFdBTC4KKwkJCSAqLworCQkJQXJjaGl2ZWRXQUxfc3Rh cnRfaXRlcmF0ZShwcml2YXRlSW5mby0+YXJjaGl2ZV93YWxfaHRhYiwgJml0ZXIpOworCQkJd2hp bGUgKChlID0gQXJjaGl2ZWRXQUxfaXRlcmF0ZShwcml2YXRlSW5mby0+YXJjaGl2ZV93YWxfaHRh YiwKKwkJCQkJCQkJCQkJJml0ZXIpKSAhPSBOVUxMKQorCQkJeworCQkJCWlmIChlLT5yZWFkX2xl biA+PSBzaXplb2YoWExvZ0xvbmdQYWdlSGVhZGVyKSkKKwkJCQl7CisJCQkJCWVudHJ5ID0gZTsK KwkJCQkJYnJlYWs7CisJCQkJfQorCQkJfQorCisJCQlpZiAoZW50cnkgPT0gTlVMTCkKKwkJCXsK KwkJCQkvKgorCQkJCSAqIEEgV0FMIGZpbGUgd2FzIGZvdW5kIGluIHRoZSBoYXNoIHRhYmxlIGJ1 dCBpdCBkb2VzIG5vdAorCQkJCSAqIGNvbnRhaW4gZW5vdWdoIGRhdGEgdG8gcmVhZCB0aGUgbG9u ZyBwYWdlIGhlYWRlciwKKwkJCQkgKiBpbmRpY2F0aW5nIGEgdHJ1bmNhdGVkIG9yIGNvcnJ1cHQg V0FMIHNlZ21lbnQuCisJCQkJICovCisJCQkJaWYgKGUgIT0gTlVMTCkKKwkJCQkJcGdfZmF0YWwo 
ImNvdWxkIG5vdCByZWFkIGZpbGUgXCIlc1wiIGZyb20gXCIlc1wiIGFyY2hpdmU6IHJlYWQgJWQg b2YgJWQiLAorCQkJCQkJCSBlLT5mbmFtZSwgcHJpdmF0ZUluZm8tPmFyY2hpdmVfbmFtZSwgZS0+ cmVhZF9sZW4sCisJCQkJCQkJIChpbnQpIHNpemVvZihYTG9nTG9uZ1BhZ2VIZWFkZXIpKTsKKwor CQkJCS8qCisJCQkJICogVGhlIGhhc2ggdGFibGUgY29udGFpbnMgbm8gV0FMIGVudHJpZXMgYXQg YWxsLCBtZWFuaW5nIHRoZQorCQkJCSAqIGFyY2hpdmUgaG9sZHMgbm8gV0FMIGRhdGEuCisJCQkJ ICovCisJCQkJcGdfZmF0YWwoImNvdWxkIG5vdCBmaW5kIFdBTCBpbiBhcmNoaXZlIFwiJXNcIiIs CisJCQkJCQkgcHJpdmF0ZUluZm8tPmFyY2hpdmVfbmFtZSk7CisJCQl9CisJCX0KKwkJZWxzZQor CQkJZW50cnkgPSBwcml2YXRlSW5mby0+Y3VyX2ZpbGU7CiAJfQogCiAJLyogRXh0cmFjdCB0aGUg V0FMIHNlZ21lbnQgc2l6ZSBmcm9tIHRoZSBsb25nIHBhZ2UgaGVhZGVyICovCi0tIAoyLjQ3LjEK Cg== --000000000000f0ac82064d82ca25--