public inbox for [email protected]  
help / color / mirror / Atom feed
From: Dharin Shah <[email protected]>
To: Michael Paquier <[email protected]>
Cc: Robert Treat <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: [email protected]
Subject: Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
Date: Wed, 31 Dec 2025 16:02:24 +0100
Message-ID: <CAOj6k6eSVoz8Z-i6u2vGgofid6yXenb1a2AhbBkfi-N11jorHA@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAOj6k6dkseVvZzmEAWvBd6twZsCU0DbN+qeM7CoDuMM3r9doiw@mail.gmail.com>
	<CAOj6k6eSsQ_PTUTxW0upn6wp7tzbvQwqkQco-=+majwX8p6JJg@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAOj6k6fxs0Bwjr34W4aURzFPta0DTGLN-1ic-U+-M_EqJ+Wd8A@mail.gmail.com>
	<[email protected]>
	<CABV9wwOdeFHHNnw+rUsVnFek-oqwsJnJX9KFG6AT5Yqz5RA_vg@mail.gmail.com>
	<[email protected]>
	<CAOj6k6dG0M-G0gs_Htray5y_pfvacXSjKYw0uJaaYJvOeqk7xw@mail.gmail.com>
	<CAOj6k6dEVi0NvLjMLDhyrJS_n_NZO5D_OU89AO1u53u6NCDDwQ@mail.gmail.com>
	<[email protected]>

Thanks Michael,

After looking more closely at your “8‑byte TOAST values / infinite loop”
thread and patch series, I see this is very much the same direction you
outlined there: introduce a normalized in-memory representation for
external pointers (toast_external_data) and keep most call sites from
having to reason about vartag_external/va_extinfo details directly [1].

For this refactor patch I kept the decoder local to detoast.c to minimize
scope and avoid committing to a broader API boundary too early. But if the
consensus heads toward a shared interface closer to the format definitions
(as in your toast_external approach), I’m happy to respin/rework this patch
to align with that direction, rather than working on parallel
abstractions. It should also be straightforward to mold this refactor in
the direction of the 8‑byte value-id work without changing the overall
detoast structure.

On HAS_TCINFO flag: the intent is to make payload layout explicit. In the
current code, “external is compressed” effectively implies “payload begins
with tcinfo”, which is wired into fetch/slice logic. For a vartag-based
follow-up (e.g., zstd), we may want compressed payloads without a tcinfo
prefix, so having an explicit flag keeps detoast paths uniform and avoids
method-specific shortcuts.

Let me know what you’d prefer for next steps: keep this patch as a
detoast-local refactor, or respin it to align more directly with a shared
decoded external-pointer interface in the direction of the 8‑byte work.

[1]
https://www.postgresql.org/message-id/flat/CAN-LCVNsE4x0k11ZRWvU4ySTbe98fwA16qzV7p8dxogWnD5Jng%40mai...


Thanks,
Dharin

On Tue, Dec 30, 2025 at 12:46 AM Michael Paquier <[email protected]>
wrote:

> On Mon, Dec 29, 2025 at 02:45:27PM +0100, Dharin Shah wrote:
> > The goal is to centralize “how do we interpret an external datum?” so
> that
> > detoast code paths don’t have to reason directly about va_extinfo
> encoding
> > vs payload layout details. This is intended as groundwork for a follow-up
> > patch adding a new vartag-based method (e.g., zstd) without scattering
> > special cases in detoast paths.
>
> +static bool
> +decode_external_toast_pointer(const struct varlena *attr,
> +                                                  DecodedExternalToast
> *decoded)
> [...]
> +typedef enum ToastDecompressMethod
> +{
> +       TOAST_DECOMP_NONE = 0,
> +       TOAST_DECOMP_PGLZ = 1,
> +       TOAST_DECOMP_LZ4 = 2
> +} ToastDecompressMethod;
> +
> +typedef struct DecodedExternalToast
> +{
> +       Oid                     toastrelid;
> +       Oid                     valueid;
> +       uint32          rawsize;                /* Decompressed size; for
> future methods without tcinfo */
> +       uint32          extsize;                /* On-disk payload size */
> +       ToastDecompressMethod method;
> +       uint8           flags;
> +} DecodedExternalToast;
>
> Yeah, honestly this is a layer I have been thinking about as well as
> one option, but contrary to you I have been focusing on putting that
> into varatt.h, with the exception of the value being an Oid8.  I think
> that you have an interesting point in focusing your implementation to
> be stored in the detoast part, though.  I'd need to spend a bit more
> time to see the result this would lead at with the larger 8-byte issue
> in mind, but this is something that would come at no real cost as it
> has no function pointer redirection compared to what I was first
> envisioning on the other thread.  That's especially true if it makes
> the CompressionId business easier to mold around when adding a new
> vartag.
>
> > Why HAS_TCINFO?
> > - Previously, “is compressed?” was used as a proxy for whether the
> external
> > payload begins with tcinfo. This patch makes that explicit: HAS_TCINFO
> > captures payload layout, which is distinct from whether the value is
> > compressed. This separation is needed for future methods that may store
> > external compressed payloads without tcinfo.
>
> It is possible to model the on-memory data as we want.  This
> suggestion would be OK with some flags.
> --
> Michael
>


reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
  In-Reply-To: <CAOj6k6eSVoz8Z-i6u2vGgofid6yXenb1a2AhbBkfi-N11jorHA@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox