From: Amul Sul <[email protected]>
To: Robert Haas <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
Date: Mon, 29 Sep 2025 21:47:10 +0530
Message-ID: <CAAJ_b94gK1np8d1h-2c1YoCccGXr4zspTa-FC7X_bfXZNz=-DA@mail.gmail.com>
In-Reply-To: <CA+TgmobF5c7ZcZHdEhqwNxGDZzWG2bDtpRaDtoVELWX_VHs_1A@mail.gmail.com>
References: <CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com>
	<CAAJ_b97PQjE4kFD8Qk6UvtLrfPMixw1nxBz0OP5Z2WB2B-uMxQ@mail.gmail.com>
	<CAAJ_b97JA8ehy_UDddrnGwDt9HG5NmJq8ATtmeMqo7YD-=tLyQ@mail.gmail.com>
	<CA+TgmoZjhWDG_AR1i+L1yss-wbuWvxrdRwSdVUUUnVPrJV2CnQ@mail.gmail.com>
	<CAAJ_b94Uh+b41LQG45bZFK+i62EVvv972LiGWWWuR64=-64rTQ@mail.gmail.com>
	<CA+TgmobF5c7ZcZHdEhqwNxGDZzWG2bDtpRaDtoVELWX_VHs_1A@mail.gmail.com>

On Mon, Sep 29, 2025 at 8:45 PM Robert Haas <[email protected]> wrote:
>
> On Thu, Sep 25, 2025 at 4:25 AM Amul Sul <[email protected]> wrote:
> > > Another thing that isn't so nice right now is that
> > > verify_tar_archive() has to open and close the archive only for
> > > init_tar_archive_reader() to be called to reopen it again just moments
> > > later. It would be nicer to open the file just once and then keep it
> > > open. Here again, I wonder if the separation of duties could be a bit
> > > cleaner.
> >
> > Prefer to keep those separate, assuming that reopening the file won't
> > cause any significant harm. Let me know if you think otherwise.
>
> Well, I guess I'd like to know why we can't do better. I'm not really
> worried about performance, but reopening the file means that you can
> never make it work with reading from a pipe.

I am skeptical about the extra code this might introduce, given that
performance is not my primary concern here. If we aim to open the file
only once, that logic should be implemented before calling
verify_tar_archive(), not inside it. Splitting the open and close
logic between verify_tar_archive() and free_tar_archive_reader() would
create a confusing and scattered pattern, especially since each of
these operations requires only two lines of code (open and close if
it's a tar file).

My second concern is that after verify_tar_archive(), we might need to
reset the file reader offset to the beginning. While reusing the
buffered data from the first iteration is technically possible, that
only works if the desired start LSN is at the absolute beginning of
the archive, or later in the sequence, which cannot be reliably
guaranteed. Therefore, for simplicity, and to avoid the complexity of
managing that offset-reset code, I am thinking of a simpler approach.

Regards,
Amul
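
The trade-off discussed above (reopen or rewind the archive versus
reading from a pipe) can be illustrated with a minimal sketch. It is not
taken from the patch: can_rewind() and second_pass_strategy() are
hypothetical helpers, and the only real API used is POSIX lseek(), which
fails with ESPIPE on pipes, FIFOs, and sockets. Under that assumption, a
"verify first, then reset the offset" scheme works for regular files but
would need to retain already-read data when the input is a pipe.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical: how a second pass over the same descriptor could proceed. */
typedef enum
{
	PASS_REWIND,				/* seekable: reset the offset to 0 */
	PASS_BUFFER,				/* pipe-like: first-pass data must be kept */
	PASS_ERROR					/* some other lseek() failure */
} SecondPassStrategy;

/*
 * Hypothetical helper: returns true if fd can be repositioned to offset 0.
 * On failure, errno is left as set by lseek().
 */
static bool
can_rewind(int fd)
{
	return lseek(fd, 0, SEEK_SET) != (off_t) -1;
}

/* Hypothetical helper: pick a strategy based on whether fd is seekable. */
static SecondPassStrategy
second_pass_strategy(int fd)
{
	if (can_rewind(fd))
		return PASS_REWIND;
	if (errno == ESPIPE)
		return PASS_BUFFER;
	return PASS_ERROR;
}
```

For a regular file, second_pass_strategy() reports PASS_REWIND; for the
read end of a pipe it reports PASS_BUFFER, which is the case where
neither reopening nor an offset reset is available.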




