public inbox for [email protected]  
help / color / mirror / Atom feed
From: Michael Paquier <[email protected]>
To: Peter Eisentraut <[email protected]>
Cc: Dharin Shah <[email protected]>
Cc: [email protected]
Subject: Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
Date: Thu, 18 Dec 2025 15:35:07 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CAOj6k6dy2CRVA6Lsb5N59zE-7KNVKt=oYwWyg8ULK8zOOY8e7A@mail.gmail.com>
	<CAOj6k6dkseVvZzmEAWvBd6twZsCU0DbN+qeM7CoDuMM3r9doiw@mail.gmail.com>
	<CAOj6k6eSsQ_PTUTxW0upn6wp7tzbvQwqkQco-=+majwX8p6JJg@mail.gmail.com>
	<[email protected]>

On Wed, Dec 17, 2025 at 04:11:38PM +0100, Peter Eisentraut wrote:
> On 16.12.25 11:51, Dharin Shah wrote:
> > - Zstd only applies to external TOAST, not inline compression. The 2-bit
> > limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine
> > anyway. Zstd's wins show up on larger values.
> 
> This is a very complicated patch.  To motivate it, you should show some
> detailed performance measurements that show these wins.

Yes, this is expected for any patch posted.  Zstd is an improved
version of lz4, acting as a sort of industry standard these days, and
any byte sequences I have looked at points that zstd leads kind of
always to a better compression ratio for less or equivalent CPU cost
compared to LZ4.  Not saying that numbers are not required, they are.
But I strongly suspect numbers among these lines.

FWIW, it's not a complicated patch, it is a large mechanical patch
that enforces a bunch of TOAST code paths to do what it wants.  If we
are going to do something about that and agree on something, I think
that we should just use a new vartag_external for this matter
(spoiler: I think we should use a new vartag_external value), but
keep the toast structure at 16 bytes all the time, leaving alone the
extra bit in the existing varatt_external structure so as there is no
impact for heap relations if zstd is used, as long as the TOAST value
is 32 bits.  The patch introduces a new vartag_external with
VARTAG_ONDISK_EXTENDED, so while it leads to a better compatibility,
it also means that all zstd entries have to pay an extra amount of
space in the main relation as an effect of a different
default_toast_compression.  The difficulty is not in the
implementation, it would be on agreeing on what folks would be OK
with in terms if vartag and varatt structures, and that's one of the
oldest areas of the PG code, that has complications and assumptions of
its own.

The test implementation looks wrong to me.  Why is there any need for
an extra test module test_toast_ext?  You could just reuse the same
structure as compression_lz4.sql, but adapted to zstd.  That's why a
split with compression.sql has been done in 74a3fc36f314, FYI.

You should also aim at splitting the patch more to make it easier to
review: one of the sticky point of this area of the code is to untie
completely DefaultCompressionId, its GUC and the TOAST code.  The GUC
default_toast_compression accepts by design only 4 values.  This needs
to go further, and should be refactored as a piece of its own.

And also, I would prefer if the 32-bit value issue is tackled first,
but that's a digression here, for a different thread.  :) 
--
Michael


Attachments:

  [application/pgp-signature] signature.asc (833B, 2-signature.asc)
  download

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox