From: Robert Haas
Date: Mon, 26 Jan 2026 12:22:33 -0500
Subject: Re: pg_waldump: support decoding of WAL inside tarfile
To: Amul Sul
Cc: Chao Li, Jakub Wartak, PostgreSQL Hackers
References: <731ADE6F-01C5-4996-BAEE-5851DFC3F502@gmail.com>

On Fri, Jan 23, 2026 at 7:27 AM Amul Sul wrote:
> Another option I previously considered was adding the filtration logic
> inside the archive streamer itself. However, since the very first read
> is required to calculate the WAL segment size, the filter check cannot
> be performed immediately. However, we could send a signal to the
> archive streamer via privateInfo (e.g., a read_any_wal or
> skip_wal_check boolean flag) to disable the filtration check until the
> size is calculated. But that approach isn't very elegant; if the first
> WAL page we read belongs to a segment we actually want to skip, we
> would still have to run the filter check and handle the skip/removal
> logic outside of the streamer (i.e., inside init_archive_reader()).
> This would result in performing the same filtration check in two
> different places.

I mean, I don't really buy this logic. If the information added to
privateInfo is "here's the LSN before which you can remove stuff," and
it starts out initialized to 0/0, then the read of the first WAL page
causes no problem at all, because nothing is before 0/0. After it gets
updated to some non-zero value, the next call to
astreamer_waldump_content() can handle discarding any data we don't
need. IMHO, the best argument for keeping things as they are is that,
in some cases, that decision might result in a bit of delay in
discarding old data, but I don't think that really matters. I think all
that we care about is the peak memory utilization of an operation, and
I don't think that an explicit signaling system should increase that at
all.

That said, I'm certainly willing to consider other ideas about how this
can work. However, I feel strongly that the logic needs to be not only
correct, but clear and well-explained. Setting cur_wal to NULL to make
the astreamer skip without adequate comments doesn't meet that
standard. Maybe with some better comments it's all right, but frankly
I'm a bit skeptical.

Right now, you're using whether or not cur_wal is NULL as a signal to
skip data or not skip data. How is that better than passing down the
LSN and TLI that you want to read next and letting the astreamer figure
out what to do itself? It's a signaling mechanism either way, but it
seems a lot easier to figure out whether we always keep the LSN and TLI
updated properly than to figure out whether cur_wal is always NULL at
exactly the right times.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
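
For illustration only, the signaling scheme discussed above (a cutoff
LSN that starts at 0/0 and is advanced by the reader) might be sketched
roughly as follows. This is not code from the patch: the struct
WaldumpStreamState, its fields, and all three functions are
hypothetical names, and XLogRecPtr and TimeLineID are simplified
stand-ins for the PostgreSQL types so that the sketch compiles on its
own. The TLI is carried along but deliberately not used in the check,
since the thread does not say how it should be applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types (assumption). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;

/* Hypothetical state shared between the WAL reader and the streamer. */
typedef struct WaldumpStreamState
{
    XLogRecPtr  discard_before; /* data wholly before this LSN is unneeded */
    TimeLineID  tli;            /* timeline the reader wants to read next */
} WaldumpStreamState;

/* Start at 0/0: nothing precedes it, so the first read is never filtered. */
static void
stream_state_init(WaldumpStreamState *st)
{
    st->discard_before = 0;
    st->tli = 0;
}

/* Called by the reader once it knows which LSN and TLI it wants next. */
static void
stream_state_advance(WaldumpStreamState *st, XLogRecPtr next_lsn,
                     TimeLineID tli)
{
    st->discard_before = next_lsn;
    st->tli = tli;
}

/* Streamer-side check: may the segment starting at seg_start be dropped? */
static bool
segment_can_be_discarded(const WaldumpStreamState *st,
                         XLogRecPtr seg_start, uint64_t seg_size)
{
    assert(seg_size > 0);
    /* Drop a segment only if all of it ends at or before the cutoff. */
    return seg_start + seg_size <= st->discard_before;
}
```

With the cutoff still at 0/0, segment_can_be_discarded() returns false
for every segment, so the read used to determine the segment size needs
no special case. Once the reader advances the cutoff, the same check
starts discarding older segments on the next call.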