Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ux1Rq-009txS-9y for pgsql-hackers@arkaria.postgresql.org; Fri, 12 Sep 2025 10:56:10 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ux1Ro-000Xc1-4B for pgsql-hackers@arkaria.postgresql.org; Fri, 12 Sep 2025 10:56:08 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ux1Rn-000Xbr-Pb for pgsql-hackers@lists.postgresql.org; Fri, 12 Sep 2025 10:56:08 +0000 Received: from mail-pl1-x62f.google.com ([2607:f8b0:4864:20::62f]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ux1Rm-0001Fa-1Z for pgsql-hackers@lists.postgresql.org; Fri, 12 Sep 2025 10:56:07 +0000 Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-2445806df50so14783725ad.1 for ; Fri, 12 Sep 2025 03:56:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1757674566; x=1758279366; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=pg+KGJ3P5DsTibRODw3w/6flTpVaYrI6QKA6jsKkhik=; b=YeuP3N85UvoLqHbpkymHPar4QctcbNuBOwFXVwS2AjJ4Fk16EYCyPjgJ8uTvsuj7Y8 VbSFuDfsbE61nH4Q65qKB+NuBNo/aA3BatpzacA5i2ns1fazNlB0qR1NR+XLjYCod5mE RX0ptwaJYTBRSGomDHxdZzCRXJCSiZfBW41A0Bs/7Y6EIfLASz7VbUzN7oji9RHZK0SS BiOYzOEw8VwAFk1zlQj7kaKBines1jlF8Xfr04JtORyOCh5dDrmxQzvA9lvYoGeQbvAv 1/SNMmjXuv6Gjmbl67wAU3S54D5hj3y5ipu2mKmjCB0Vh72pqIRcYZOlVek2UwJ24yeU yy0w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757674566; x=1758279366; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pg+KGJ3P5DsTibRODw3w/6flTpVaYrI6QKA6jsKkhik=; b=QoczXdZDAz1bECHNat9msABYlyfvmKYh82ccYiisGzOeCVra77ex5uxZj6AZ7Wpk+c SIyPzNiNSnhXypR2ATHprqHbY5rzE7UAZ2Ujfh6oW7u5RQhfiCbnWkqR3iIsEFyu5rWU wJ5iojWDslqSJyd1vXCxZaslDiPE8xoGXJd2hKZMhLEKGVCyTL67YGuo8HsBCJ1trdBb UReY2Qqm6W+6LO0PlQZlaENnH9Z4tu9qb0JEnZ0bsR3Uxc3FdE4VblodxnVxPLhfLLuz 4Ukb6wJfiCzkwsRVUoKwxFPu2tEiS6emDQqOA9BJG46dyQ5UC+RXXvqcc28xX47/8XBS pFNg== X-Gm-Message-State: AOJu0YycWPVzRXt5Fq33UB87/wpzxJbqCNTRKcAjSCWsafMb1FmttjVM jcPxWtjxHWLf0BybIhUYN/o6JPIbFDFw/NA6xdU22cJmDF/pFqd0tgqbhnB6/L4XN3x+EaTTx+B aSTXXCNCOUpBMXTCYQeEY+ZtOtYu/w74= X-Gm-Gg: ASbGnculVrj7kUABa9Ek9DajJSi9BZAPrNGsq+pGIEF6kaYt74S2NsnwDtIGsllx9/z YBousynSP3KZBSwEYz9EOrr4MFw3jizQcvchMG+P324qg+UTR2sp1O78cTR+zZu0yl9pWTMjDzX GhFV+yAgAKgkkCqCzjDQgGZuiGlYSoLsvZpp/GC5Mq3KTTTMihtcfUd3QJylWAmZtnrddl2ezYu iGv4q4ZlQ== X-Google-Smtp-Source: AGHT+IFULq2R6LN9hiUsTILlP8cMbSsEpZo+StBedjLNh7ZXyw9HpknLNl5oQCHWFhfVXlKXPppggrdCvkJ4RpJQ1ew= X-Received: by 2002:a17:903:1b4c:b0:24c:965a:f94d with SMTP id d9443c01a7336-25d26a595d2mr30715475ad.46.1757674565609; Fri, 12 Sep 2025 03:56:05 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Amul Sul Date: Fri, 12 Sep 2025 16:25:28 +0530 X-Gm-Features: Ac12FXw0TfURaXThDCfY5Lfl8w6rG41ofX4qfu5eqT7rXs0pGMe1eSDgQYHEGuE Message-ID: Subject: Re: pg_waldump: support decoding of WAL inside tarfile To: Jakub Wartak Cc: PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, Sep 8, 2025 at 7:07=E2=80=AFPM Jakub Wartak wrote: > > On Tue, Aug 26, 2025 at 1:53=E2=80=AFPM Amul Sul wrot= e: > > > [..patch] > > Hi Amul! > Thanks for your review. I'm replying to a few of your comments now, but for the rest, I need to think about them. I'm kind of in agreement with some of them for the fix, but I won't be able to spend time on that next week due to official travel. I'll try to get back as soon as possible after that. > a. Why should it be necessary to provide startLSN (-s) ? Couldn't > it autodetect the first WAL (tar file) inside and just use that with > some info message? > $ /usr/pgsql19/bin/pg_waldump --path=3D/tmp/base/pg_wal.tar > pg_waldump: error: no start WAL location given > There are two reasons. First, existing pg_waldump --path=3Dsome_directory would result in the same error. Second, it would force us to re-read the archive twice just to locate the first WAL segment, which is inefficient. > c. It doesnt work when using SEGSTART, but it's there: > $ /usr/pgsql19/bin/pg_waldump --path=3D/tmp/base/pg_wal.tar > 000000010000000000000059 > pg_waldump: error: could not open file "000000010000000000000059": > Not a directory > $ tar tf /tmp/base/pg_wal.tar | head -1 > 000000010000000000000059 > I don't believe this is the correct use case. The WAL files are inside a tar archive, and the requirement is to use a starting LSN and a timeline (if not the default). > d. I've later noticed that follow-up patches seem to use the > -s switch and there it seems to work OK. The above SEGSTART issue was > not detected, probably because tests need to be extended cover of > segment name rather than just --start LSN (see test_pg_waldump): > $ /usr/pgsql19/bin/pg_waldump --path=3D/tmp/base/pg_wal.tar --stats > -s 0/59000358 > pg_waldump: first record is after 0/59000358, at 0/590003E8, > skipping over 144 bytes > WAL statistics between 0/590003E8 and 0/61000000: > [..] > Hope previous reasoning makes sense to you. > e. Code around`if (walpath =3D=3D NULL && directory !=3D NULL)` needs > some comments. > I think this is an existing one. > f. Code around `if (fname !=3D NULL && is_tar_file(fname, > &compression))` , so if fname is WAL segment here > (00000001000000000000005A) and we do check again if that has been > tar-ed (is_tar_file())? Why? > Again, how? > g. Just a question: the commit message says `Note that this patch > requires that the WAL files within the archive be in sequential order; > an error will be reported otherwise`. I'm wondering if such > occurrences are known to be happening in the wild? Or is it just an > assumption that if someone would modify the tar somehow? (either way > we could just add a reason why we need to handle such a case if we > know -- is manual alternation the only source of such state?). For the > record, I've tested crafting custom archives with out of sequence WAL > archives and the code seems to work (it was done using: tar --append > -f pg_wal.tar --format=3Dustar ..) > This is an almost nonexistent occurrence. While pg_basebackup archives WAL files in sequential order, we don't have an explicit code to enforce that order within it. Furthermore, since we can't control how external tools might handle the files, this extra precaution is necessary. > Another open question I have is this: shouldn't backup_manifest come > with CRC checksum for the archived WALs? Or does that guarantee that > backup_manifest WAL-Ranges are present in pg_wal.tar is good enough > because individual WAL files are CRC-protected itself? > I don't know, I have to check pg_verifybackup. Regards, Amul