From: Dharin Shah <[email protected]>
To: Michael Paquier <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: [email protected]
Subject: Re: Fwd: [PATCH] Add zstd compression for TOAST using extended header format
Date: Wed, 24 Dec 2025 01:47:16 +0100
Message-ID: <CAOj6k6f2B3hNxDcnB5AgHX4kaTW8XTAfMAjRx4upDBOugxqF4w@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <CAOj6k6dy2CRVA6Lsb5N59zE-7KNVKt=oYwWyg8ULK8zOOY8e7A@mail.gmail.com>
<CAOj6k6dkseVvZzmEAWvBd6twZsCU0DbN+qeM7CoDuMM3r9doiw@mail.gmail.com>
<CAOj6k6eSsQ_PTUTxW0upn6wp7tzbvQwqkQco-=+majwX8p6JJg@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAOj6k6fxs0Bwjr34W4aURzFPta0DTGLN-1ic-U+-M_EqJ+Wd8A@mail.gmail.com>
<[email protected]>
Hello,
Following up on my earlier patch submission, I've reworked the zstd TOAST
compression implementation based on our discussion here. The new patch now
avoids the 20-byte extended header.
Current Approach
- New `VARTAG_ONDISK_ZSTD` (value 19) for ZSTD external storage
- Maintains existing 16-byte varatt_external structure
- ZSTD external-only (no inline compression)
Note: Using a dedicated VARTAG_ONDISK_ZSTD keeps the on-disk TOAST pointer
payload at 16 bytes, but it is not a general extensible metadata carrier.
If PostgreSQL later adopts a more general extensible TOAST framework, this
change should not block it; VARTAG_ONDISK_ZSTD would remain as a supported
legacy encoding, while new toasted values could be written using the newer
framework and old values rewritten via normal table rewrites.
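To make the tag-as-discriminator idea concrete, here is a small standalone sketch. It is not code from the patch: the demo_ names are hypothetical and the struct is a simplified stand-in for varatt_external. The tag value 18 is the existing VARTAG_ONDISK and 19 is the value proposed above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * 18 is the existing VARTAG_ONDISK value; 19 is the value proposed for
 * VARTAG_ONDISK_ZSTD. The demo_ names are hypothetical.
 */
enum demo_vartag
{
    DEMO_VARTAG_ONDISK = 18,
    DEMO_VARTAG_ONDISK_ZSTD = 19
};

/* Simplified stand-in for the 16-byte varatt_external payload. */
typedef struct demo_toast_pointer
{
    int32_t     va_rawsize;     /* original data size, including header */
    uint32_t    va_extinfo;     /* external saved size */
    uint32_t    va_valueid;     /* stand-in for the Oid of the TOAST value */
    uint32_t    va_toastrelid;  /* stand-in for the Oid of the TOAST table */
} demo_toast_pointer;

/* Compile-time check: the payload is 16 bytes regardless of the tag. */
typedef char demo_size_check[(sizeof(demo_toast_pointer) == 16) ? 1 : -1];

typedef enum demo_fetch_path
{
    DEMO_FETCH_NOT_ONDISK,      /* not an on-disk TOAST pointer */
    DEMO_FETCH_PLAIN,           /* existing path: method, if any, is in va_extinfo */
    DEMO_FETCH_ZSTD             /* new path: fetch chunks, then zstd-decompress */
} demo_fetch_path;

/* Pick the detoast path from the tag alone, without reading the payload. */
static demo_fetch_path
demo_classify(uint8_t tag)
{
    switch (tag)
    {
        case DEMO_VARTAG_ONDISK_ZSTD:
            return DEMO_FETCH_ZSTD;
        case DEMO_VARTAG_ONDISK:
            return DEMO_FETCH_PLAIN;
        default:
            return DEMO_FETCH_NOT_ONDISK;
    }
}

#define DEMO_VARHDRSZ 4

/* Size of the buffer needed for the decompressed payload (header excluded). */
static int32_t
demo_decompressed_payload_size(const demo_toast_pointer *ptr)
{
    assert(ptr->va_rawsize >= DEMO_VARHDRSZ);
    return ptr->va_rawsize - DEMO_VARHDRSZ;
}
```

The point of the sketch is that the method is known before the payload is read, so the 16 bytes themselves need no new fields.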
Storage (170 MB uncompressed):
ZSTD: 22 MB (7.60x) - 38.7% space savings vs LZ4
PGLZ: 36 MB (4.76x)
LZ4: 36 MB (4.66x)
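For reference, the 38.7% figure follows from the two ratios alone: compressed size is inversely proportional to the ratio, so the saving is 1 - 4.66/7.60. A minimal sketch of that arithmetic (the function name is hypothetical, not part of the benchmark script):

```c
#include <assert.h>

/*
 * Percentage of space saved by a method with compression ratio ratio_new
 * compared to a method with ratio ratio_old, on the same input.
 * Compressed size is input / ratio, so the saving is 1 - ratio_old / ratio_new.
 */
static double
space_savings_pct(double ratio_new, double ratio_old)
{
    assert(ratio_new > 0.0 && ratio_old > 0.0);
    return (1.0 - ratio_old / ratio_new) * 100.0;
}
```

Calling space_savings_pct(7.60, 4.66) gives roughly 38.7, matching the number quoted for ZSTD vs LZ4.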
Key findings:
- Large values (>50KB): ZSTD 33% better compression than PGLZ (~30% better
than LZ4)
- Low-entropy data: ZSTD compresses what LZ77 methods cannot
- Small values: ZSTD pays external overhead vs inline PGLZ/LZ4
While ZSTD uses slightly less space overall, the external storage mechanism
incurs a TOAST fetch overhead for small values, potentially impacting
performance.
Backwards Compatibility Tests
- Mixed compression: Rows with PGLZ, LZ4, and ZSTD coexist and decompress
correctly
- Lazy recompression: ALTER COLUMN ... SET COMPRESSION zstd affects new
data; existing data is lazily recompressed upon UPDATE or VACUUM FULL.
- Inline vs external: Small values remain inline; large values use
appropriate external compression.
- Data integrity: All data decompresses correctly across all methods.
Trade-offs and Design Considerations
- External-only avoids consuming cmid=3 and extended header complexity
- Slice access: no ZSTD-specific optimization (follow-up area)
- Hybrid inline/external for small values: not in this patch (feedback
welcome)
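For readers less familiar with why cmid=3 is scarce: va_extinfo keeps the external size in its low 30 bits and the method id in the top 2 bits, and ids 0 (pglz), 1 (lz4) and 2 (invalid) are already taken, as far as I can tell from the current headers. A standalone sketch of that packing follows; the demo_ names are hypothetical and only mirror the existing macros.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_EXTSIZE_BITS 30
#define DEMO_EXTSIZE_MASK ((1U << DEMO_EXTSIZE_BITS) - 1)

/* Method ids currently in use; 3 is the only value left in the 2-bit field. */
enum demo_cmid
{
    DEMO_CMID_PGLZ = 0,
    DEMO_CMID_LZ4 = 1,
    DEMO_CMID_INVALID = 2
};

/* Pack external size (low 30 bits) and method id (top 2 bits). */
static uint32_t
demo_pack_extinfo(uint32_t extsize, uint32_t cmid)
{
    assert(extsize <= DEMO_EXTSIZE_MASK);
    assert(cmid <= 3);
    return extsize | (cmid << DEMO_EXTSIZE_BITS);
}

/* Recover the external size from the low 30 bits. */
static uint32_t
demo_get_extsize(uint32_t extinfo)
{
    return extinfo & DEMO_EXTSIZE_MASK;
}

/* Recover the method id from the top 2 bits. */
static uint32_t
demo_get_cmid(uint32_t extinfo)
{
    return extinfo >> DEMO_EXTSIZE_BITS;
}
```

With the vartag approach a zstd pointer stores only the size in va_extinfo and leaves the two method bits at zero, so the last free id stays available.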
Reviewer Questions
- Is vartag-based external-only acceptable?
- Should compression level (currently 3) be configurable?
- Is the external storage overhead for small values acceptable, or is hybrid
inline/external behavior needed?
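On the compression level question, if it were made configurable, the value would need validating before it reaches ZSTD_compress(). A rough sketch of what that could look like, with hypothetical names and hard-coded bounds that are my assumptions (a real implementation would more likely ask libzstd via ZSTD_minCLevel() and ZSTD_maxCLevel()):

```c
#include <assert.h>

#define DEMO_ZSTD_LEVEL_DEFAULT 3   /* level hard-coded in the v3 patch */
#define DEMO_ZSTD_LEVEL_MIN 1       /* assumed lower bound, for illustration only */
#define DEMO_ZSTD_LEVEL_MAX 22      /* assumed upper bound, for illustration only */

/*
 * Map a requested level to the level actually used.
 * 0 selects the default, following the zstd convention; anything outside
 * the assumed range is clamped to the nearest bound.
 */
static int
demo_effective_zstd_level(int requested)
{
    int level;

    if (requested == 0)
        level = DEMO_ZSTD_LEVEL_DEFAULT;
    else if (requested < DEMO_ZSTD_LEVEL_MIN)
        level = DEMO_ZSTD_LEVEL_MIN;
    else if (requested > DEMO_ZSTD_LEVEL_MAX)
        level = DEMO_ZSTD_LEVEL_MAX;
    else
        level = requested;

    assert(level >= DEMO_ZSTD_LEVEL_MIN && level <= DEMO_ZSTD_LEVEL_MAX);
    return level;
}
```

Whether out-of-range values should be clamped or rejected with an error is a design choice; clamping is used here only to keep the sketch short.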
Thanks,
Dharin
On Thu, Dec 18, 2025 at 11:44 PM Michael Paquier <[email protected]>
wrote:
> On Thu, Dec 18, 2025 at 10:44:22PM +0100, Dharin Shah wrote:
> > I want to make sure I understand your main point: you're OK with a new
> > `vartag_external`, but prefer we avoid increasing the heap TOAST pointer
> > from 16 -> 20 bytes since every zstd-toasted value would pay +4 bytes in
> > the main heap tuple.
>
> That would be my choice, yes. Not sure about the opinion of others on
> this matter.
>
> > I also realize the "compatibility" of the extended header doesn't buy us
> > much — we'll need to support the existing 16-byte varatt_external forever
> > for backward compatibility. Adding a 20-byte structure just means two
> > formats to maintain indefinitely.
>
> Yes. Patches have to maintain on-disk compatibility.
>
> > A couple clarifying questions if we go with new vartag (e.g.,
> > `VARTAG_ONDISK_ZSTD`), same 16-byte `varatt_external` payload, vartag as
> > discriminator
> > 1. How should we handle future methods beyond zstd? One tag per method, or
> > store a method id elsewhere (e.g., in TOAST chunk header)?
>
> My suspicion would be that we could use a new set of vartags in
> the future for each compression method. When it comes to zstd there
> is something that comes into play: we could set some bits related to
> dictionaries at tuple level. Not sure if this is the best design or
> if using an attribute-level option is better suited (for example a
> JSONB blob could be applied as an attribute with common keys in a
> dictionary saving a lot of on-disk space even before compression),
> but keeping some bits free in the 16-byte header leaves this option
> open with a new vartag_external. Saying that, zstd is good enough
> that I strongly suspect that we would not regret it for quite a few
> years. One issue that has pushed towards the addition of lz4 as an
> option for toast compression is that pglz was worse in terms of CPU
> cost. zlib is also more expensive than lz4 or zstd, especially at
> very high compression level for usually little compression gains.
>
> > 2. And re: "as long as the TOAST value is 32 bits" — are you referring to
> > the 30-bit extsize field in va_extinfo (i.e., avoid stealing bits from
> > extsize for method encoding)?
>
> I mean extending the TOAST value to 8 bytes, as per the following
> issues:
> https://www.postgresql.org/message-id/764273.1669674269%40sss.pgh.pa.us
> https://commitfest.postgresql.org/patch/5830/
>
> > *Key findings (i guess well known at this point):*
> > - ZSTD excels for repetitive/pattern-heavy data (6.7x better than PGLZ)
> > - For low-redundancy data (MD5 hashes), ZSTD still achieves ~2x better
> > - The T4 result showing zstd as "worse" is not about compression quality -
> > it's about missing inline storage support. ZSTD actually compresses better,
> > but pays unnecessary TOAST overhead.
> >
> > I'll share the detailed benchmark script with the next patch revision. But
> > also a potential path forward could be that we could just fully replace
> > pglz (can bring it up later in different thread)
>
> I don't think that we will ever be able to remove pglz. It would be
> nice, as final result of course, but I also expect that not being able
> to decompress pglz data is going to lead to a lot of user pain. That
> would be also very expensive to check at upgrade for large instances.
>
> > *On Testing and Patch Structure*
> > Agreed on both points:
> > - I'll use `compression_zstd.sql` following the `compression_lz4.sql`
> > pattern (removing the test_toast_ext module)
>
> Okay.
>
> > - I'll split the GUC refactoring into a separate preparatory patch
>
> This refactoring, if done nicely, is worth an independent piece. It's
> something that I have actually done for the sake of the other thread,
> though the result was not really much liked by others. Perhaps I'm
> just lacking imagination with this abstraction, and I'd surely welcome
> different ideas.
> --
> Michael
>
Attachments:
[application/octet-stream] benchmark_toast_compression.sql (26.2K)
[application/octet-stream] v3-0001-Add-ZSTD-TOAST-compression-using-VARTAG-ONDISK-ZSTD.patch (52.1K), inline diff:
From b206ea02a266a630d1c869f19fa2adf716165809 Mon Sep 17 00:00:00 2001
From: Dharin Shah <[email protected]>
Date: Sun, 21 Dec 2025 18:38:36 +0100
Subject: [PATCH v3] Add ZSTD compression support for TOAST using
VARTAG_ONDISK_ZSTD (Option B, level 3)
---
src/backend/access/common/detoast.c | 98 ++++-
src/backend/access/common/indextuple.c | 22 +-
src/backend/access/common/toast_compression.c | 166 +++++++-
src/backend/access/common/toast_internals.c | 80 +++-
src/backend/access/table/toast_helper.c | 11 +-
src/backend/replication/logical/proto.c | 4 +-
src/backend/replication/pgoutput/pgoutput.c | 6 +-
src/backend/utils/adt/varlena.c | 6 +-
src/backend/utils/misc/guc_tables.c | 3 +
src/bin/pg_dump/pg_dump.c | 3 +
src/bin/psql/describe.c | 5 +-
src/include/access/toast_compression.h | 14 +
src/include/access/toast_internals.h | 10 +-
src/include/varatt.h | 17 +-
.../regress/expected/compression_zstd.out | 361 ++++++++++++++++++
src/test/regress/parallel_schedule | 2 +-
src/test/regress/sql/compression_zstd.sql | 178 +++++++++
17 files changed, 947 insertions(+), 39 deletions(-)
create mode 100644 src/test/regress/expected/compression_zstd.out
create mode 100644 src/test/regress/sql/compression_zstd.sql
diff --git a/src/backend/access/common/detoast.c b/src/backend/access/common/detoast.c
index 62651787742..ebf21c85c86 100644
--- a/src/backend/access/common/detoast.c
+++ b/src/backend/access/common/detoast.c
@@ -46,7 +46,23 @@ detoast_external_attr(struct varlena *attr)
{
struct varlena *result;
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
+ {
+ /*
+ * This is a ZSTD-compressed external datum --- fetch and decompress it
+ */
+ struct varatt_external toast_pointer;
+ struct varlena *compressed;
+ int32 rawsize;
+
+ VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+ rawsize = toast_pointer.va_rawsize - VARHDRSZ;
+
+ compressed = toast_fetch_datum(attr);
+ result = zstd_decompress_datum(compressed, rawsize);
+ pfree(compressed);
+ }
+ else if (VARATT_IS_EXTERNAL_ONDISK(attr))
{
/*
* This is an external stored plain value
@@ -115,7 +131,23 @@ detoast_external_attr(struct varlena *attr)
struct varlena *
detoast_attr(struct varlena *attr)
{
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
+ {
+ /*
+ * This is a ZSTD-compressed external datum --- fetch and decompress it
+ */
+ struct varatt_external toast_pointer;
+ struct varlena *compressed;
+ int32 rawsize;
+
+ VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+ rawsize = toast_pointer.va_rawsize - VARHDRSZ;
+
+ compressed = toast_fetch_datum(attr);
+ attr = zstd_decompress_datum(compressed, rawsize);
+ pfree(compressed);
+ }
+ else if (VARATT_IS_EXTERNAL_ONDISK(attr))
{
/*
* This is an externally stored datum --- fetch it back from there
@@ -223,7 +255,23 @@ detoast_attr_slice(struct varlena *attr,
else if (pg_add_s32_overflow(sliceoffset, slicelength, &slicelimit))
slicelength = slicelimit = -1;
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
+ {
+ /*
+ * This is a ZSTD-compressed external datum --- fetch, decompress, then slice
+ */
+ struct varatt_external toast_pointer;
+ struct varlena *compressed;
+ int32 rawsize;
+
+ VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+ rawsize = toast_pointer.va_rawsize - VARHDRSZ;
+
+ compressed = toast_fetch_datum(attr);
+ preslice = zstd_decompress_datum_slice(compressed, rawsize, slicelimit >= 0 ? slicelimit : rawsize);
+ pfree(compressed);
+ }
+ else if (VARATT_IS_EXTERNAL_ONDISK(attr))
{
struct varatt_external toast_pointer;
@@ -246,8 +294,8 @@ detoast_attr_slice(struct varlena *attr,
* Determine maximum amount of compressed data needed for a prefix
* of a given length (after decompression).
*
- * At least for now, if it's LZ4 data, we'll have to fetch the
- * whole thing, because there doesn't seem to be an API call to
+ * At least for now, if it's LZ4 data, we'll have to fetch
+ * the whole thing, because there doesn't seem to be an API call to
* determine how much compressed data we need to be sure of being
* able to decompress the required slice.
*/
@@ -346,8 +394,9 @@ toast_fetch_datum(struct varlena *attr)
struct varlena *result;
struct varatt_external toast_pointer;
int32 attrsize;
+ bool is_zstd = VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr);
- if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (!VARATT_IS_EXTERNAL_ONDISK(attr) && !is_zstd)
elog(ERROR, "toast_fetch_datum shouldn't be called for non-ondisk datums");
/* Must copy to access aligned fields */
@@ -357,6 +406,17 @@ toast_fetch_datum(struct varlena *attr)
result = (struct varlena *) palloc(attrsize + VARHDRSZ);
+ /*
+ * Set varlena header format based on how data is stored in TOAST:
+ *
+ * For PGLZ/LZ4: TOAST chunks contain tcinfo compression header followed
+ * by compressed data. Mark as compressed varlena so decompression can
+ * read the tcinfo metadata.
+ *
+ * For ZSTD: TOAST chunks contain only raw ZSTD compressed bytes (no tcinfo).
+ * The compression method is identified by VARTAG_ONDISK_ZSTD instead of
+ * tcinfo bits. Mark as plain varlena since there's no tcinfo header to parse.
+ */
if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
SET_VARSIZE_COMPRESSED(result, attrsize + VARHDRSZ);
else
@@ -400,19 +460,24 @@ toast_fetch_datum_slice(struct varlena *attr, int32 sliceoffset,
struct varlena *result;
struct varatt_external toast_pointer;
int32 attrsize;
+ bool is_zstd = VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr);
+ bool is_compressed;
- if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (!VARATT_IS_EXTERNAL_ONDISK(attr) && !is_zstd)
elog(ERROR, "toast_fetch_datum_slice shouldn't be called for non-ondisk datums");
/* Must copy to access aligned fields */
VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+ /* For ZSTD, the vartag indicates compression; for others, check va_extinfo */
+ is_compressed = is_zstd || VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer);
+
/*
* It's nonsense to fetch slices of a compressed datum unless when it's a
* prefix -- this isn't lo_* we can't return a compressed datum which is
* meaningful to toast later.
*/
- Assert(!VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer) || 0 == sliceoffset);
+ Assert(!is_compressed || 0 == sliceoffset);
attrsize = VARATT_EXTERNAL_GET_EXTSIZE(toast_pointer);
@@ -425,7 +490,8 @@ toast_fetch_datum_slice(struct varlena *attr, int32 sliceoffset,
/*
* When fetching a prefix of a compressed external datum, account for the
* space required by va_tcinfo, which is stored at the beginning as an
- * int32 value.
+ * int32 value. This only applies to pglz/lz4, not zstd (which has no
+ * tcinfo header).
*/
if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer) && slicelength > 0)
slicelength = slicelength + sizeof(int32);
@@ -440,6 +506,10 @@ toast_fetch_datum_slice(struct varlena *attr, int32 sliceoffset,
result = (struct varlena *) palloc(slicelength + VARHDRSZ);
+ /*
+ * Use compressed varlena format only for pglz/lz4 which have tcinfo.
+ * For zstd, use plain format since payload lacks tcinfo.
+ */
if (VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
SET_VARSIZE_COMPRESSED(result, slicelength + VARHDRSZ);
else
@@ -477,6 +547,9 @@ toast_decompress_datum(struct varlena *attr)
/*
* Fetch the compression method id stored in the compression header and
* decompress the data using the appropriate decompression routine.
+ *
+ * Note: Zstd external data never goes through this dispatch (it uses
+ * VARTAG_ONDISK_ZSTD and is handled separately).
*/
cmid = TOAST_COMPRESS_METHOD(attr);
switch (cmid)
@@ -520,6 +593,9 @@ toast_decompress_datum_slice(struct varlena *attr, int32 slicelength)
/*
* Fetch the compression method id stored in the compression header and
* decompress the data slice using the appropriate decompression routine.
+ *
+ * Note: Zstd external data never goes through this dispatch (it uses
+ * VARTAG_ONDISK_ZSTD and is handled separately).
*/
cmid = TOAST_COMPRESS_METHOD(attr);
switch (cmid)
@@ -547,7 +623,7 @@ toast_raw_datum_size(Datum value)
struct varlena *attr = (struct varlena *) DatumGetPointer(value);
Size result;
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK(attr) || VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
{
/* va_rawsize is the size of the original datum -- including header */
struct varatt_external toast_pointer;
@@ -603,7 +679,7 @@ toast_datum_size(Datum value)
struct varlena *attr = (struct varlena *) DatumGetPointer(value);
Size result;
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK(attr) || VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
{
/*
* Attribute is stored externally - return the extsize whether
diff --git a/src/backend/access/common/indextuple.c b/src/backend/access/common/indextuple.c
index 3efa3889c6f..b24902b7a25 100644
--- a/src/backend/access/common/indextuple.c
+++ b/src/backend/access/common/indextuple.c
@@ -20,6 +20,7 @@
#include "access/heaptoast.h"
#include "access/htup_details.h"
#include "access/itup.h"
+#include "access/toast_compression.h"
#include "access/toast_internals.h"
/*
@@ -123,9 +124,28 @@ index_form_tuple_context(TupleDesc tupleDescriptor,
att->attstorage == TYPSTORAGE_MAIN))
{
Datum cvalue;
+ char cmethod = att->attcompression;
+
+ /*
+ * Index tuples must be self-contained (cannot reference external TOAST).
+ * ZSTD compression uses external storage only (identified by vartag rather
+ * than inline tcinfo bits). For indexed values declared COMPRESSION zstd,
+ * fall back to inline-capable compression: prefer LZ4 when available, else PGLZ.
+ *
+ * Use explicit method rather than default_toast_compression so fallback
+ * works even when default is zstd.
+ */
+ if (cmethod == TOAST_ZSTD_COMPRESSION)
+ {
+#ifdef USE_LZ4
+ cmethod = TOAST_LZ4_COMPRESSION;
+#else
+ cmethod = TOAST_PGLZ_COMPRESSION;
+#endif
+ }
cvalue = toast_compress_datum(untoasted_values[i],
- att->attcompression);
+ cmethod);
if (DatumGetPointer(cvalue) != NULL)
{
diff --git a/src/backend/access/common/toast_compression.c b/src/backend/access/common/toast_compression.c
index 926f1e4008a..e8a4c6f328d 100644
--- a/src/backend/access/common/toast_compression.c
+++ b/src/backend/access/common/toast_compression.c
@@ -17,8 +17,13 @@
#include <lz4.h>
#endif
+#ifdef USE_ZSTD
+#include <zstd.h>
+#endif
+
#include "access/detoast.h"
#include "access/toast_compression.h"
+#include "access/toast_internals.h"
#include "common/pg_lzcompress.h"
#include "varatt.h"
@@ -245,6 +250,147 @@ lz4_decompress_datum_slice(const struct varlena *value, int32 slicelength)
#endif
}
+/* ----------
+ * zstd compression/decompression routines
+ *
+ * ZSTD uses VARTAG_ONDISK_ZSTD for external storage, not cmid=3.
+ * TOAST_ZSTD_COMPRESSION_ID exists only for introspection (SQL functions).
+ * ----------
+ */
+
+/*
+ * Compress a varlena using ZSTD.
+ *
+ * Returns the compressed varlena, or NULL if compression fails.
+ */
+struct varlena *
+zstd_compress_datum(const struct varlena *value)
+{
+#ifndef USE_ZSTD
+ NO_COMPRESSION_SUPPORT("zstd");
+ return NULL; /* keep compiler quiet */
+#else
+ int32 valsize;
+ size_t len;
+ size_t max_size;
+ struct varlena *tmp = NULL;
+
+ valsize = VARSIZE_ANY_EXHDR(value);
+
+ /*
+ * No point in wasting a zstd header on empty or very short inputs.
+ */
+ if (unlikely(valsize < 32))
+ return NULL;
+
+ /*
+ * Allocate buffer for compressed output. Return a plain varlena containing
+ * just the ZSTD compressed frame. toast_save_datum() will store this to
+ * external TOAST without adding tcinfo header (compression method is
+ * identified by VARTAG_ONDISK_ZSTD instead).
+ */
+ max_size = ZSTD_compressBound(valsize);
+ tmp = (struct varlena *) palloc(max_size + VARHDRSZ);
+
+ len = ZSTD_compress((char *) tmp + VARHDRSZ,
+ max_size,
+ VARDATA_ANY(value),
+ valsize,
+ 3); /* compression level 3 for balanced speed/ratio */
+
+ if (unlikely(ZSTD_isError(len)))
+ elog(ERROR, "zstd compression failed: %s", ZSTD_getErrorName(len));
+
+ /* data is incompressible so just free the memory and return NULL */
+ if (len >= (size_t) valsize)
+ {
+ pfree(tmp);
+ return NULL;
+ }
+
+ SET_VARSIZE(tmp, len + VARHDRSZ);
+
+ return tmp;
+#endif
+}
+
+/*
+ * Decompress a varlena that was compressed using ZSTD.
+ */
+struct varlena *
+zstd_decompress_datum(const struct varlena *value, int32 rawsize)
+{
+#ifndef USE_ZSTD
+ NO_COMPRESSION_SUPPORT("zstd");
+ return NULL; /* keep compiler quiet */
+#else
+ size_t decomp_size;
+ struct varlena *result;
+
+ result = (struct varlena *) palloc(rawsize + VARHDRSZ);
+
+ decomp_size = ZSTD_decompress(VARDATA(result),
+ rawsize,
+ (char *) value + VARHDRSZ,
+ VARSIZE(value) - VARHDRSZ);
+
+ if (unlikely(ZSTD_isError(decomp_size)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg_internal("compressed zstd data is corrupt: %s",
+ ZSTD_getErrorName(decomp_size))));
+
+ SET_VARSIZE(result, decomp_size + VARHDRSZ);
+
+ return result;
+#endif
+}
+
+/*
+ * Decompress part of a varlena that was compressed using ZSTD.
+ *
+ * Note: We decompress the full datum then return the requested slice.
+ * This is necessary because detoast_attr_slice() calls toast_fetch_datum()
+ * first (which fetches all compressed TOAST chunks), so the real bottleneck
+ * is TOAST I/O, not decompression method. ZSTD doesn't support true random
+ * access within compressed frames, and streaming APIs don't help when the
+ * full compressed input is already materialized in memory.
+ */
+struct varlena *
+zstd_decompress_datum_slice(const struct varlena *value, int32 rawsize, int32 slicelength)
+{
+#ifndef USE_ZSTD
+ NO_COMPRESSION_SUPPORT("zstd");
+ return NULL; /* keep compiler quiet */
+#else
+ size_t decomp_size;
+ struct varlena *result;
+
+ /* Limit to actual size if slice request is larger */
+ if (slicelength >= rawsize)
+ return zstd_decompress_datum(value, rawsize);
+
+ /* Decompress the full data */
+ result = (struct varlena *) palloc(rawsize + VARHDRSZ);
+
+ decomp_size = ZSTD_decompress(VARDATA(result),
+ rawsize,
+ (char *) value + VARHDRSZ,
+ VARSIZE(value) - VARHDRSZ);
+
+ if (unlikely(ZSTD_isError(decomp_size)))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg_internal("compressed zstd data is corrupt: %s",
+ ZSTD_getErrorName(decomp_size))));
+
+ /* Truncate to requested size */
+ SET_VARSIZE(result, slicelength + VARHDRSZ);
+
+ return result;
+#endif
+}
+
/*
* Extract compression ID from a varlena.
*
@@ -259,8 +405,17 @@ toast_get_compression_id(struct varlena *attr)
* If it is stored externally then fetch the compression method id from
* the external toast pointer. If compressed inline, fetch it from the
* toast compression header.
+ *
+ * For ZSTD external data, VARTAG_ONDISK_ZSTD indicates compression,
+ * so we return TOAST_ZSTD_COMPRESSION_ID directly without checking
+ * va_extinfo bits.
*/
- if (VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
+ {
+ /* ZSTD external data uses vartag to indicate compression */
+ cmid = TOAST_ZSTD_COMPRESSION_ID;
+ }
+ else if (VARATT_IS_EXTERNAL_ONDISK(attr))
{
struct varatt_external toast_pointer;
@@ -293,6 +448,13 @@ CompressionNameToMethod(const char *compression)
#endif
return TOAST_LZ4_COMPRESSION;
}
+ else if (strcmp(compression, "zstd") == 0)
+ {
+#ifndef USE_ZSTD
+ NO_COMPRESSION_SUPPORT("zstd");
+#endif
+ return TOAST_ZSTD_COMPRESSION;
+ }
return InvalidCompressionMethod;
}
@@ -309,6 +471,8 @@ GetCompressionMethodName(char method)
return "pglz";
case TOAST_LZ4_COMPRESSION:
return "lz4";
+ case TOAST_ZSTD_COMPRESSION:
+ return "zstd";
default:
elog(ERROR, "invalid compression method %c", method);
return NULL; /* keep compiler quiet */
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index d06af82de15..77fd8bc64ec 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -18,6 +18,7 @@
#include "access/heapam.h"
#include "access/heaptoast.h"
#include "access/table.h"
+#include "access/toast_compression.h"
#include "access/toast_internals.h"
#include "access/xact.h"
#include "catalog/catalog.h"
@@ -60,6 +61,9 @@ toast_compress_datum(Datum value, char cmethod)
/*
* Call appropriate compression routine for the compression method.
+ *
+ * Note: Zstd does not support inline compression (returns NULL immediately).
+ * Zstd data is always stored externally with VARTAG_ONDISK_ZSTD.
*/
switch (cmethod)
{
@@ -71,6 +75,9 @@ toast_compress_datum(Datum value, char cmethod)
tmp = lz4_compress_datum((const struct varlena *) DatumGetPointer(value));
cmid = TOAST_LZ4_COMPRESSION_ID;
break;
+ case TOAST_ZSTD_COMPRESSION:
+ /* Zstd: no inline compression, force external storage */
+ return PointerGetDatum(NULL);
default:
elog(ERROR, "invalid compression method %c", cmethod);
}
@@ -112,12 +119,13 @@ toast_compress_datum(Datum value, char cmethod)
* rel: the main relation we're working with (not the toast rel!)
* value: datum to be pushed to toast storage
* oldexternal: if not NULL, toast pointer previously representing the datum
+ * cmethod: compression method for the column (from attcompression)
* options: options to be passed to heap_insert() for toast rows
* ----------
*/
Datum
toast_save_datum(Relation rel, Datum value,
- struct varlena *oldexternal, int options)
+ struct varlena *oldexternal, char cmethod, int options)
{
Relation toastrel;
Relation *toastidxs;
@@ -131,6 +139,8 @@ toast_save_datum(Relation rel, Datum value,
Pointer dval = DatumGetPointer(value);
int num_indexes;
int validIndex;
+ bool is_zstd = false;
+ struct varlena *zstd_compressed = NULL;
Assert(!VARATT_IS_EXTERNAL(dval));
@@ -172,18 +182,57 @@ toast_save_datum(Relation rel, Datum value,
/* rawsize in a compressed datum is just the size of the payload */
toast_pointer.va_rawsize = VARDATA_COMPRESSED_GET_EXTSIZE(dval) + VARHDRSZ;
- /* set external size and compression method */
+ /*
+ * Inline-compressed data (only pglz/lz4, never zstd).
+ * Encode compression method from tcinfo into va_extinfo bits 30-31.
+ */
VARATT_EXTERNAL_SET_SIZE_AND_COMPRESS_METHOD(toast_pointer, data_todo,
VARDATA_COMPRESSED_GET_COMPRESS_METHOD(dval));
- /* Assert that the numbers look like it's compressed */
Assert(VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer));
}
else
{
- data_p = VARDATA(dval);
- data_todo = VARSIZE(dval) - VARHDRSZ;
- toast_pointer.va_rawsize = VARSIZE(dval);
- toast_pointer.va_extinfo = data_todo;
+ /*
+ * Uncompressed data. For zstd, compress it now before storing.
+ * If no compression method specified, use default_toast_compression.
+ */
+ char effective_cmethod = cmethod;
+ if (!CompressionMethodIsValid(effective_cmethod))
+ effective_cmethod = default_toast_compression;
+
+ if (effective_cmethod == TOAST_ZSTD_COMPRESSION)
+ {
+ zstd_compressed = zstd_compress_datum((const struct varlena *) dval);
+ if (likely(zstd_compressed != NULL))
+ {
+ /*
+ * Successfully compressed with ZSTD. Store raw compressed bytes
+ * to TOAST (no tcinfo header). VARTAG_ONDISK_ZSTD identifies the
+ * compression method.
+ */
+ data_p = VARDATA(zstd_compressed);
+ data_todo = VARSIZE(zstd_compressed) - VARHDRSZ;
+ toast_pointer.va_rawsize = VARSIZE(dval);
+ toast_pointer.va_extinfo = data_todo;
+ is_zstd = true;
+ }
+ else
+ {
+ /* Incompressible, store uncompressed */
+ data_p = VARDATA(dval);
+ data_todo = VARSIZE(dval) - VARHDRSZ;
+ toast_pointer.va_rawsize = VARSIZE(dval);
+ toast_pointer.va_extinfo = data_todo;
+ }
+ }
+ else
+ {
+ /* pglz/lz4 or uncompressed: store as-is */
+ data_p = VARDATA(dval);
+ data_todo = VARSIZE(dval) - VARHDRSZ;
+ toast_pointer.va_rawsize = VARSIZE(dval);
+ toast_pointer.va_extinfo = data_todo;
+ }
}
/*
@@ -227,7 +276,8 @@ toast_save_datum(Relation rel, Datum value,
{
struct varatt_external old_toast_pointer;
- Assert(VARATT_IS_EXTERNAL_ONDISK(oldexternal));
+ Assert(VARATT_IS_EXTERNAL_ONDISK(oldexternal) ||
+ VARATT_IS_EXTERNAL_ONDISK_ZSTD(oldexternal));
/* Must copy to access aligned fields */
VARATT_EXTERNAL_GET_POINTER(old_toast_pointer, oldexternal);
if (old_toast_pointer.va_toastrelid == rel->rd_toastoid)
@@ -357,10 +407,18 @@ toast_save_datum(Relation rel, Datum value,
table_close(toastrel, NoLock);
/*
- * Create the TOAST pointer value that we'll return
+ * Free the ZSTD compressed varlena if we allocated one
+ */
+ if (zstd_compressed != NULL)
+ pfree(zstd_compressed);
+
+ /*
+ * Create the TOAST pointer value that we'll return.
+ * Use VARTAG_ONDISK_ZSTD for ZSTD-compressed data to indicate compression
+ * via the vartag rather than encoding it in va_extinfo bits 30-31.
*/
result = (struct varlena *) palloc(TOAST_POINTER_SIZE);
- SET_VARTAG_EXTERNAL(result, VARTAG_ONDISK);
+ SET_VARTAG_EXTERNAL(result, is_zstd ? VARTAG_ONDISK_ZSTD : VARTAG_ONDISK);
memcpy(VARDATA_EXTERNAL(result), &toast_pointer, sizeof(toast_pointer));
return PointerGetDatum(result);
@@ -385,7 +443,7 @@ toast_delete_datum(Relation rel, Datum value, bool is_speculative)
int num_indexes;
int validIndex;
- if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (!VARATT_IS_EXTERNAL_ONDISK(attr) && !VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
return;
/* Must copy to access aligned fields */
diff --git a/src/backend/access/table/toast_helper.c b/src/backend/access/table/toast_helper.c
index 11f97d65367..f2371a60971 100644
--- a/src/backend/access/table/toast_helper.c
+++ b/src/backend/access/table/toast_helper.c
@@ -71,10 +71,12 @@ toast_tuple_init(ToastTupleContext *ttc)
* we have to delete it later.
*/
if (att->attlen == -1 && !ttc->ttc_oldisnull[i] &&
- VARATT_IS_EXTERNAL_ONDISK(old_value))
+ (VARATT_IS_EXTERNAL_ONDISK(old_value) ||
+ VARATT_IS_EXTERNAL_ONDISK_ZSTD(old_value)))
{
if (ttc->ttc_isnull[i] ||
- !VARATT_IS_EXTERNAL_ONDISK(new_value) ||
+ (!VARATT_IS_EXTERNAL_ONDISK(new_value) &&
+ !VARATT_IS_EXTERNAL_ONDISK_ZSTD(new_value)) ||
memcmp(old_value, new_value,
VARSIZE_EXTERNAL(old_value)) != 0)
{
@@ -261,7 +263,7 @@ toast_tuple_externalize(ToastTupleContext *ttc, int attribute, int options)
attr->tai_colflags |= TOASTCOL_IGNORE;
*value = toast_save_datum(ttc->ttc_rel, old_value, attr->tai_oldexternal,
- options);
+ attr->tai_compression, options);
if ((attr->tai_colflags & TOASTCOL_NEEDS_FREE) != 0)
pfree(DatumGetPointer(old_value));
attr->tai_colflags |= TOASTCOL_NEEDS_FREE;
@@ -330,7 +332,8 @@ toast_delete_external(Relation rel, const Datum *values, const bool *isnull,
if (isnull[i])
continue;
- else if (VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(value)))
+ else if (VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(value)) ||
+ VARATT_IS_EXTERNAL_ONDISK_ZSTD(DatumGetPointer(value)))
toast_delete_datum(rel, value, is_speculative);
}
}
diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c
index 27ad74fd759..09535abd778 100644
--- a/src/backend/replication/logical/proto.c
+++ b/src/backend/replication/logical/proto.c
@@ -812,7 +812,9 @@ logicalrep_write_tuple(StringInfo out, Relation rel, TupleTableSlot *slot,
continue;
}
- if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(values[i])))
+ if (att->attlen == -1 &&
+ (VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(values[i])) ||
+ VARATT_IS_EXTERNAL_ONDISK_ZSTD(DatumGetPointer(values[i]))))
{
/*
* Unchanged toasted datum. (Note that we don't promise to detect
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 787998abb8a..fbe229c262b 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -1397,8 +1397,10 @@ pgoutput_row_filter(Relation relation, TupleTableSlot *old_slot,
* VARTAG_INDIRECT. See ReorderBufferToastReplace.
*/
if (att->attlen == -1 &&
- VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(new_slot->tts_values[i])) &&
- !VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(old_slot->tts_values[i])))
+ (VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(new_slot->tts_values[i])) ||
+ VARATT_IS_EXTERNAL_ONDISK_ZSTD(DatumGetPointer(new_slot->tts_values[i]))) &&
+ !VARATT_IS_EXTERNAL_ONDISK(DatumGetPointer(old_slot->tts_values[i])) &&
+ !VARATT_IS_EXTERNAL_ONDISK_ZSTD(DatumGetPointer(old_slot->tts_values[i])))
{
if (!tmp_new_slot)
{
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 8adeb8dadc6..5f5cc5da449 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -4179,6 +4179,9 @@ pg_column_compression(PG_FUNCTION_ARGS)
case TOAST_LZ4_COMPRESSION_ID:
result = "lz4";
break;
+ case TOAST_ZSTD_COMPRESSION_ID:
+ result = "zstd";
+ break;
default:
elog(ERROR, "invalid compression method id %d", cmid);
}
@@ -4219,7 +4222,8 @@ pg_column_toast_chunk_id(PG_FUNCTION_ARGS)
attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0));
- if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+ if (!VARATT_IS_EXTERNAL_ONDISK(attr) &&
+ !VARATT_IS_EXTERNAL_ONDISK_ZSTD(attr))
PG_RETURN_NULL();
VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 04ab0a26608..555b0143685 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -460,6 +460,9 @@ static const struct config_enum_entry default_toast_compression_options[] = {
{"pglz", TOAST_PGLZ_COMPRESSION, false},
#ifdef USE_LZ4
{"lz4", TOAST_LZ4_COMPRESSION, false},
+#endif
+#ifdef USE_ZSTD
+ {"zstd", TOAST_ZSTD_COMPRESSION, false},
#endif
{NULL, 0, false}
};
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 27f6be3f0f8..4f660a19c35 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17905,6 +17905,9 @@ dumpTableSchema(Archive *fout, const TableInfo *tbinfo)
case 'l':
cmname = "lz4";
break;
+ case 'z':
+ cmname = "zstd";
+ break;
default:
cmname = NULL;
break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 36f24502842..7d6377e27ca 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2206,8 +2206,9 @@ describeOneTableDetails(const char *schemaname,
/* these strings are literal in our syntax, so not translated. */
printTableAddCell(&cont, (compression[0] == 'p' ? "pglz" :
(compression[0] == 'l' ? "lz4" :
- (compression[0] == '\0' ? "" :
- "???"))),
+ (compression[0] == 'z' ? "zstd" :
+ (compression[0] == '\0' ? "" :
+ "???")))),
false, false);
}
diff --git a/src/include/access/toast_compression.h b/src/include/access/toast_compression.h
index 13c4612ceed..1be06aafccb 100644
--- a/src/include/access/toast_compression.h
+++ b/src/include/access/toast_compression.h
@@ -33,12 +33,17 @@ extern PGDLLIMPORT int default_toast_compression;
* below. We might someday support more than 4 compression methods, but
* we can never have more than 4 values in this enum, because there are
* only 2 bits available in the places where this is stored.
+ *
+ * Note: TOAST_ZSTD_COMPRESSION_ID is not used in 2-bit cmid fields. Zstd
+ * uses VARTAG_ONDISK_ZSTD for external storage. This ID exists only for
+ * introspection (e.g., pg_column_compression()).
*/
typedef enum ToastCompressionId
{
TOAST_PGLZ_COMPRESSION_ID = 0,
TOAST_LZ4_COMPRESSION_ID = 1,
TOAST_INVALID_COMPRESSION_ID = 2,
+ TOAST_ZSTD_COMPRESSION_ID = 3, /* introspection only, not in cmid */
} ToastCompressionId;
/*
@@ -48,6 +53,7 @@ typedef enum ToastCompressionId
*/
#define TOAST_PGLZ_COMPRESSION 'p'
#define TOAST_LZ4_COMPRESSION 'l'
+#define TOAST_ZSTD_COMPRESSION 'z'
#define InvalidCompressionMethod '\0'
#define CompressionMethodIsValid(cm) ((cm) != InvalidCompressionMethod)
@@ -65,6 +71,14 @@ extern struct varlena *lz4_decompress_datum(const struct varlena *value);
extern struct varlena *lz4_decompress_datum_slice(const struct varlena *value,
int32 slicelength);
+/* zstd compression/decompression routines */
+extern struct varlena *zstd_compress_datum(const struct varlena *value);
+extern struct varlena *zstd_decompress_datum(const struct varlena *value,
+ int32 rawsize);
+extern struct varlena *zstd_decompress_datum_slice(const struct varlena *value,
+ int32 rawsize,
+ int32 slicelength);
+
/* other stuff */
extern ToastCompressionId toast_get_compression_id(struct varlena *attr);
extern char CompressionNameToMethod(const char *compression);
diff --git a/src/include/access/toast_internals.h b/src/include/access/toast_internals.h
index 06ae8583c1e..77d5081eeed 100644
--- a/src/include/access/toast_internals.h
+++ b/src/include/access/toast_internals.h
@@ -36,11 +36,17 @@ typedef struct toast_compress_header
#define TOAST_COMPRESS_METHOD(ptr) \
(((toast_compress_header *) (ptr))->tcinfo >> VARLENA_EXTSIZE_BITS)
+/*
+ * Set compression header info. Zstd uses TOAST_INVALID_COMPRESSION_ID, not
+ * TOAST_ZSTD_COMPRESSION_ID (cmid=3 is not used in tcinfo).
+ */
#define TOAST_COMPRESS_SET_SIZE_AND_COMPRESS_METHOD(ptr, len, cm_method) \
do { \
Assert((len) > 0 && (len) <= VARLENA_EXTSIZE_MASK); \
Assert((cm_method) == TOAST_PGLZ_COMPRESSION_ID || \
- (cm_method) == TOAST_LZ4_COMPRESSION_ID); \
+ (cm_method) == TOAST_LZ4_COMPRESSION_ID || \
+ (cm_method) == TOAST_INVALID_COMPRESSION_ID); \
+ Assert((cm_method) != TOAST_ZSTD_COMPRESSION_ID); \
((toast_compress_header *) (ptr))->tcinfo = \
(len) | ((uint32) (cm_method) << VARLENA_EXTSIZE_BITS); \
} while (0)
@@ -50,7 +56,7 @@ extern Oid toast_get_valid_index(Oid toastoid, LOCKMODE lock);
extern void toast_delete_datum(Relation rel, Datum value, bool is_speculative);
extern Datum toast_save_datum(Relation rel, Datum value,
- struct varlena *oldexternal, int options);
+ struct varlena *oldexternal, char cmethod, int options);
extern int toast_open_indexes(Relation toastrel,
LOCKMODE lock,
diff --git a/src/include/varatt.h b/src/include/varatt.h
index aeeabf9145b..cf5436f7bf1 100644
--- a/src/include/varatt.h
+++ b/src/include/varatt.h
@@ -80,13 +80,19 @@ typedef struct varatt_expanded
* Type tag for the various sorts of "TOAST pointer" datums. The peculiar
* value for VARTAG_ONDISK comes from a requirement for on-disk compatibility
* with a previous notion that the tag field was the pointer datum's length.
+ *
+ * VARTAG_ONDISK_ZSTD is used for ZSTD-compressed external TOAST data.
+ * Unlike pglz and lz4 which store the compression method in va_extinfo bits
+ * 30-31, ZSTD uses a separate vartag to preserve all 32 bits of va_extinfo
+ * for future use (compression level, dictionary ID, etc.).
*/
typedef enum vartag_external
{
VARTAG_INDIRECT = 1,
VARTAG_EXPANDED_RO = 2,
VARTAG_EXPANDED_RW = 3,
- VARTAG_ONDISK = 18
+ VARTAG_ONDISK = 18,
+ VARTAG_ONDISK_ZSTD = 19
} vartag_external;
/* Is a TOAST pointer either type of expanded-object pointer? */
@@ -105,7 +111,7 @@ VARTAG_SIZE(vartag_external tag)
return sizeof(varatt_indirect);
else if (VARTAG_IS_EXPANDED(tag))
return sizeof(varatt_expanded);
- else if (tag == VARTAG_ONDISK)
+ else if (tag == VARTAG_ONDISK || tag == VARTAG_ONDISK_ZSTD)
return sizeof(varatt_external);
else
{
@@ -363,6 +369,13 @@ VARATT_IS_EXTERNAL_ONDISK(const void *PTR)
return VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK;
}
+/* Is varlena datum a pointer to on-disk ZSTD-compressed toasted data? */
+static inline bool
+VARATT_IS_EXTERNAL_ONDISK_ZSTD(const void *PTR)
+{
+ return VARATT_IS_EXTERNAL(PTR) && VARTAG_EXTERNAL(PTR) == VARTAG_ONDISK_ZSTD;
+}
+
/* Is varlena datum an indirect pointer? */
static inline bool
VARATT_IS_EXTERNAL_INDIRECT(const void *PTR)
diff --git a/src/test/regress/expected/compression_zstd.out b/src/test/regress/expected/compression_zstd.out
new file mode 100644
index 00000000000..1f5bb43b542
--- /dev/null
+++ b/src/test/regress/expected/compression_zstd.out
@@ -0,0 +1,361 @@
+-- Tests for TOAST compression with zstd
+SELECT NOT(enumvals @> '{zstd}') AS skip_test FROM pg_settings WHERE
+ name = 'default_toast_compression' \gset
+\if :skip_test
+ \echo '*** skipping TOAST tests with zstd (not supported) ***'
+ \quit
+\endif
+CREATE SCHEMA zstd;
+SET search_path TO zstd, public;
+\set HIDE_TOAST_COMPRESSION false
+-- Ensure we get stable results regardless of the installation's default.
+-- We rely on this GUC value for a few tests.
+SET default_toast_compression = 'pglz';
+-- test creating table with compression method
+CREATE TABLE cmdata_pglz(f1 text COMPRESSION pglz);
+CREATE INDEX idx ON cmdata_pglz(f1);
+INSERT INTO cmdata_pglz VALUES(repeat('1234567890', 1000));
+\d+ cmdata_pglz
+ Table "zstd.cmdata_pglz"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ f1 | text | | | | extended | pglz | |
+Indexes:
+ "idx" btree (f1)
+
+CREATE TABLE cmdata_zstd(f1 TEXT COMPRESSION zstd);
+INSERT INTO cmdata_zstd VALUES(repeat('1234567890', 1004));
+\d+ cmdata_zstd
+ Table "zstd.cmdata_zstd"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ f1 | text | | | | extended | zstd | |
+
+-- verify stored compression method in the data
+SELECT pg_column_compression(f1) FROM cmdata_zstd;
+ pg_column_compression
+-----------------------
+ zstd
+(1 row)
+
+-- decompress data slice
+SELECT SUBSTR(f1, 200, 5) FROM cmdata_pglz;
+ substr
+--------
+ 01234
+(1 row)
+
+SELECT SUBSTR(f1, 2000, 50) FROM cmdata_zstd;
+ substr
+----------------------------------------------------
+ 01234567890123456789012345678901234567890123456789
+(1 row)
+
+-- copy with table creation
+SELECT * INTO cmmove1 FROM cmdata_zstd;
+\d+ cmmove1
+ Table "zstd.cmmove1"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ f1 | text | | | | extended | | |
+
+SELECT pg_column_compression(f1) FROM cmmove1;
+ pg_column_compression
+-----------------------
+ pglz
+(1 row)
+
+-- test LIKE INCLUDING COMPRESSION. The GUC default_toast_compression
+-- has no effect, the compression method from the table being copied.
+CREATE TABLE cmdata2 (LIKE cmdata_zstd INCLUDING COMPRESSION);
+\d+ cmdata2
+ Table "zstd.cmdata2"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ f1 | text | | | | extended | zstd | |
+
+DROP TABLE cmdata2;
+-- copy to existing table
+CREATE TABLE cmmove3(f1 text COMPRESSION pglz);
+INSERT INTO cmmove3 SELECT * FROM cmdata_pglz;
+INSERT INTO cmmove3 SELECT * FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove3;
+ pg_column_compression
+-----------------------
+ pglz
+ pglz
+(2 rows)
+
+-- update using datum from different table with ZSTD data.
+CREATE TABLE cmmove2(f1 text COMPRESSION pglz);
+INSERT INTO cmmove2 VALUES (repeat('1234567890', 1004));
+SELECT pg_column_compression(f1) FROM cmmove2;
+ pg_column_compression
+-----------------------
+ pglz
+(1 row)
+
+UPDATE cmmove2 SET f1 = cmdata_zstd.f1 FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove2;
+ pg_column_compression
+-----------------------
+ pglz
+(1 row)
+
+-- test externally stored compressed data
+CREATE OR REPLACE FUNCTION large_val_zstd() RETURNS TEXT LANGUAGE SQL AS
+'select array_agg(fipshash(g::text))::text from generate_series(1, 256) g';
+CREATE TABLE cmdata2 (f1 text COMPRESSION zstd);
+INSERT INTO cmdata2 SELECT large_val_zstd() || repeat('a', 4000);
+SELECT pg_column_compression(f1) FROM cmdata2;
+ pg_column_compression
+-----------------------
+ zstd
+(1 row)
+
+SELECT SUBSTR(f1, 200, 5) FROM cmdata2;
+ substr
+--------
+ 79026
+(1 row)
+
+-- test pg_column_toast_chunk_id with zstd
+SELECT pg_column_toast_chunk_id(f1) IS NOT NULL AS has_toast_chunk FROM cmdata2;
+ has_toast_chunk
+-----------------
+ t
+(1 row)
+
+DROP TABLE cmdata2;
+DROP FUNCTION large_val_zstd;
+-- test compression with materialized view
+CREATE MATERIALIZED VIEW compressmv(x) AS SELECT * FROM cmdata_zstd;
+\d+ compressmv
+ Materialized view "zstd.compressmv"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ x | text | | | | extended | | |
+View definition:
+ SELECT f1 AS x
+ FROM cmdata_zstd;
+
+SELECT pg_column_compression(f1) FROM cmdata_zstd;
+ pg_column_compression
+-----------------------
+ zstd
+(1 row)
+
+SELECT pg_column_compression(x) FROM compressmv;
+ pg_column_compression
+-----------------------
+ pglz
+(1 row)
+
+-- test compression with partition
+CREATE TABLE cmpart(f1 text COMPRESSION zstd) PARTITION BY HASH(f1);
+CREATE TABLE cmpart1 PARTITION OF cmpart FOR VALUES WITH (MODULUS 2, REMAINDER 0);
+CREATE TABLE cmpart2(f1 text COMPRESSION pglz);
+ALTER TABLE cmpart ATTACH PARTITION cmpart2 FOR VALUES WITH (MODULUS 2, REMAINDER 1);
+INSERT INTO cmpart VALUES (repeat('123456789', 1004));
+INSERT INTO cmpart VALUES (repeat('123456789', 4004));
+SELECT pg_column_compression(f1) FROM cmpart1;
+ pg_column_compression
+-----------------------
+ zstd
+(1 row)
+
+SELECT pg_column_compression(f1) FROM cmpart2;
+ pg_column_compression
+-----------------------
+ pglz
+(1 row)
+
+-- test compression with inheritance
+CREATE TABLE cminh() INHERITS(cmdata_pglz, cmdata_zstd); -- error
+NOTICE: merging multiple inherited definitions of column "f1"
+ERROR: column "f1" has a compression method conflict
+DETAIL: pglz versus zstd
+CREATE TABLE cminh(f1 TEXT COMPRESSION zstd) INHERITS(cmdata_pglz); -- error
+NOTICE: merging column "f1" with inherited definition
+ERROR: column "f1" has a compression method conflict
+DETAIL: pglz versus zstd
+CREATE TABLE cmdata3(f1 text);
+CREATE TABLE cminh() INHERITS (cmdata_pglz, cmdata3);
+NOTICE: merging multiple inherited definitions of column "f1"
+-- test default_toast_compression GUC
+SET default_toast_compression = 'zstd';
+-- test alter compression method
+ALTER TABLE cmdata_pglz ALTER COLUMN f1 SET COMPRESSION zstd;
+INSERT INTO cmdata_pglz VALUES (repeat('123456789', 4004));
+\d+ cmdata_pglz
+ Table "zstd.cmdata_pglz"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ f1 | text | | | | extended | zstd | |
+Indexes:
+ "idx" btree (f1)
+Child tables: cminh
+
+SELECT pg_column_compression(f1) FROM cmdata_pglz;
+ pg_column_compression
+-----------------------
+ pglz
+ zstd
+(2 rows)
+
+ALTER TABLE cmdata_pglz ALTER COLUMN f1 SET COMPRESSION pglz;
+-- test alter compression method for materialized views
+ALTER MATERIALIZED VIEW compressmv ALTER COLUMN x SET COMPRESSION zstd;
+\d+ compressmv
+ Materialized view "zstd.compressmv"
+ Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+--------+------+-----------+----------+---------+----------+-------------+--------------+-------------
+ x | text | | | | extended | zstd | |
+View definition:
+ SELECT f1 AS x
+ FROM cmdata_zstd;
+
+-- test alter compression method for partitioned tables
+ALTER TABLE cmpart1 ALTER COLUMN f1 SET COMPRESSION pglz;
+ALTER TABLE cmpart2 ALTER COLUMN f1 SET COMPRESSION zstd;
+-- new data should be compressed with the current compression method
+INSERT INTO cmpart VALUES (repeat('123456789', 1004));
+INSERT INTO cmpart VALUES (repeat('123456789', 4004));
+SELECT pg_column_compression(f1) FROM cmpart1;
+ pg_column_compression
+-----------------------
+ zstd
+ pglz
+(2 rows)
+
+SELECT pg_column_compression(f1) FROM cmpart2;
+ pg_column_compression
+-----------------------
+ pglz
+ zstd
+(2 rows)
+
+-- test expression index
+CREATE TABLE cmdata2 (f1 TEXT COMPRESSION pglz, f2 TEXT COMPRESSION zstd);
+CREATE UNIQUE INDEX idx1 ON cmdata2 ((f1 || f2));
+INSERT INTO cmdata2 VALUES((SELECT array_agg(fipshash(g::TEXT))::TEXT FROM
+generate_series(1, 50) g), VERSION());
+-- test cross-method operations (zstd <-> lz4 if available)
+-- This tests interaction between all three compression methods
+SELECT enumvals @> '{lz4}' AS has_lz4 FROM pg_settings WHERE
+ name = 'default_toast_compression' \gset
+\if :has_lz4
+CREATE TABLE cmdata_lz4(f1 TEXT COMPRESSION lz4);
+INSERT INTO cmdata_lz4 VALUES(repeat('1234567890', 1004));
+SELECT pg_column_compression(f1) FROM cmdata_lz4;
+ pg_column_compression
+-----------------------
+ lz4
+(1 row)
+
+-- copy from zstd to lz4 table
+CREATE TABLE cmmove4(f1 text COMPRESSION lz4);
+INSERT INTO cmmove4 SELECT * FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove4;
+ pg_column_compression
+-----------------------
+ lz4
+(1 row)
+
+-- copy from lz4 to zstd table
+CREATE TABLE cmmove5(f1 text COMPRESSION zstd);
+INSERT INTO cmmove5 SELECT * FROM cmdata_lz4;
+SELECT pg_column_compression(f1) FROM cmmove5;
+ pg_column_compression
+-----------------------
+ lz4
+(1 row)
+
+\else
+\echo '*** skipping LZ4 cross-method tests (lz4 not supported) ***'
+\endif
+-- check data is ok
+SELECT length(f1) FROM cmdata_pglz;
+ length
+--------
+ 10000
+ 36036
+(2 rows)
+
+SELECT length(f1) FROM cmdata_zstd;
+ length
+--------
+ 10040
+(1 row)
+
+\if :has_lz4
+SELECT length(f1) FROM cmdata_lz4;
+ length
+--------
+ 10040
+(1 row)
+
+\endif
+SELECT length(f1) FROM cmmove1;
+ length
+--------
+ 10040
+(1 row)
+
+SELECT length(f1) FROM cmmove2;
+ length
+--------
+ 10040
+(1 row)
+
+SELECT length(f1) FROM cmmove3;
+ length
+--------
+ 10000
+ 10040
+(2 rows)
+
+\if :has_lz4
+SELECT length(f1) FROM cmmove4;
+ length
+--------
+ 10040
+(1 row)
+
+SELECT length(f1) FROM cmmove5;
+ length
+--------
+ 10040
+(1 row)
+
+\endif
+-- test parallel workers with ZSTD (if supported)
+CREATE TABLE parallel_zstd_test (id int, data text COMPRESSION zstd);
+INSERT INTO parallel_zstd_test SELECT i, repeat('x' || i::text, 3000) FROM generate_series(1, 100) i;
+SELECT count(*), avg(length(data)) FROM parallel_zstd_test;
+ count | avg
+-------+-----------------------
+ 100 | 8760.0000000000000000
+(1 row)
+
+SELECT count(*), sum(length(substring(data, 1, 50))) FROM parallel_zstd_test;
+ count | sum
+-------+------
+ 100 | 5000
+(1 row)
+
+DROP TABLE parallel_zstd_test;
+-- test COPY with ZSTD compressed data
+CREATE TABLE copy_zstd_test (id int, data text COMPRESSION zstd);
+INSERT INTO copy_zstd_test VALUES (1, repeat('copydata', 2000));
+\copy copy_zstd_test TO '/tmp/zstd_copy_test.dat'
+TRUNCATE copy_zstd_test;
+\copy copy_zstd_test FROM '/tmp/zstd_copy_test.dat'
+SELECT id, length(data), pg_column_compression(data) FROM copy_zstd_test;
+ id | length | pg_column_compression
+----+--------+-----------------------
+ 1 | 16000 | zstd
+(1 row)
+
+DROP TABLE copy_zstd_test;
+\set HIDE_TOAST_COMPRESSION true
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 905f9bca959..1cd161fa2c4 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
# The stats test resets stats, so nothing else needing stats access can be in
# this group.
# ----------
-test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
+test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 compression_zstd memoize stats predicate numa eager_aggregate
# event_trigger depends on create_am and cannot run concurrently with
# any test that runs DDL
diff --git a/src/test/regress/sql/compression_zstd.sql b/src/test/regress/sql/compression_zstd.sql
new file mode 100644
index 00000000000..8a38092f034
--- /dev/null
+++ b/src/test/regress/sql/compression_zstd.sql
@@ -0,0 +1,178 @@
+-- Tests for TOAST compression with zstd
+
+SELECT NOT(enumvals @> '{zstd}') AS skip_test FROM pg_settings WHERE
+ name = 'default_toast_compression' \gset
+\if :skip_test
+ \echo '*** skipping TOAST tests with zstd (not supported) ***'
+ \quit
+\endif
+
+CREATE SCHEMA zstd;
+SET search_path TO zstd, public;
+
+\set HIDE_TOAST_COMPRESSION false
+
+-- Ensure we get stable results regardless of the installation's default.
+-- We rely on this GUC value for a few tests.
+SET default_toast_compression = 'pglz';
+
+-- test creating table with compression method
+CREATE TABLE cmdata_pglz(f1 text COMPRESSION pglz);
+CREATE INDEX idx ON cmdata_pglz(f1);
+INSERT INTO cmdata_pglz VALUES(repeat('1234567890', 1000));
+\d+ cmdata_pglz
+CREATE TABLE cmdata_zstd(f1 TEXT COMPRESSION zstd);
+INSERT INTO cmdata_zstd VALUES(repeat('1234567890', 1004));
+\d+ cmdata_zstd
+
+-- verify stored compression method in the data
+SELECT pg_column_compression(f1) FROM cmdata_zstd;
+
+-- decompress data slice
+SELECT SUBSTR(f1, 200, 5) FROM cmdata_pglz;
+SELECT SUBSTR(f1, 2000, 50) FROM cmdata_zstd;
+
+-- copy with table creation
+SELECT * INTO cmmove1 FROM cmdata_zstd;
+\d+ cmmove1
+SELECT pg_column_compression(f1) FROM cmmove1;
+
+-- test LIKE INCLUDING COMPRESSION. The GUC default_toast_compression
+-- has no effect, the compression method from the table being copied.
+CREATE TABLE cmdata2 (LIKE cmdata_zstd INCLUDING COMPRESSION);
+\d+ cmdata2
+DROP TABLE cmdata2;
+
+-- copy to existing table
+CREATE TABLE cmmove3(f1 text COMPRESSION pglz);
+INSERT INTO cmmove3 SELECT * FROM cmdata_pglz;
+INSERT INTO cmmove3 SELECT * FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove3;
+
+-- update using datum from different table with ZSTD data.
+CREATE TABLE cmmove2(f1 text COMPRESSION pglz);
+INSERT INTO cmmove2 VALUES (repeat('1234567890', 1004));
+SELECT pg_column_compression(f1) FROM cmmove2;
+UPDATE cmmove2 SET f1 = cmdata_zstd.f1 FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove2;
+
+-- test externally stored compressed data
+CREATE OR REPLACE FUNCTION large_val_zstd() RETURNS TEXT LANGUAGE SQL AS
+'select array_agg(fipshash(g::text))::text from generate_series(1, 256) g';
+CREATE TABLE cmdata2 (f1 text COMPRESSION zstd);
+INSERT INTO cmdata2 SELECT large_val_zstd() || repeat('a', 4000);
+SELECT pg_column_compression(f1) FROM cmdata2;
+SELECT SUBSTR(f1, 200, 5) FROM cmdata2;
+
+-- test pg_column_toast_chunk_id with zstd
+SELECT pg_column_toast_chunk_id(f1) IS NOT NULL AS has_toast_chunk FROM cmdata2;
+
+DROP TABLE cmdata2;
+DROP FUNCTION large_val_zstd;
+
+-- test compression with materialized view
+CREATE MATERIALIZED VIEW compressmv(x) AS SELECT * FROM cmdata_zstd;
+\d+ compressmv
+SELECT pg_column_compression(f1) FROM cmdata_zstd;
+SELECT pg_column_compression(x) FROM compressmv;
+
+-- test compression with partition
+CREATE TABLE cmpart(f1 text COMPRESSION zstd) PARTITION BY HASH(f1);
+CREATE TABLE cmpart1 PARTITION OF cmpart FOR VALUES WITH (MODULUS 2, REMAINDER 0);
+CREATE TABLE cmpart2(f1 text COMPRESSION pglz);
+
+ALTER TABLE cmpart ATTACH PARTITION cmpart2 FOR VALUES WITH (MODULUS 2, REMAINDER 1);
+INSERT INTO cmpart VALUES (repeat('123456789', 1004));
+INSERT INTO cmpart VALUES (repeat('123456789', 4004));
+SELECT pg_column_compression(f1) FROM cmpart1;
+SELECT pg_column_compression(f1) FROM cmpart2;
+
+-- test compression with inheritance
+CREATE TABLE cminh() INHERITS(cmdata_pglz, cmdata_zstd); -- error
+CREATE TABLE cminh(f1 TEXT COMPRESSION zstd) INHERITS(cmdata_pglz); -- error
+CREATE TABLE cmdata3(f1 text);
+CREATE TABLE cminh() INHERITS (cmdata_pglz, cmdata3);
+
+-- test default_toast_compression GUC
+SET default_toast_compression = 'zstd';
+
+-- test alter compression method
+ALTER TABLE cmdata_pglz ALTER COLUMN f1 SET COMPRESSION zstd;
+INSERT INTO cmdata_pglz VALUES (repeat('123456789', 4004));
+\d+ cmdata_pglz
+SELECT pg_column_compression(f1) FROM cmdata_pglz;
+ALTER TABLE cmdata_pglz ALTER COLUMN f1 SET COMPRESSION pglz;
+
+-- test alter compression method for materialized views
+ALTER MATERIALIZED VIEW compressmv ALTER COLUMN x SET COMPRESSION zstd;
+\d+ compressmv
+
+-- test alter compression method for partitioned tables
+ALTER TABLE cmpart1 ALTER COLUMN f1 SET COMPRESSION pglz;
+ALTER TABLE cmpart2 ALTER COLUMN f1 SET COMPRESSION zstd;
+
+-- new data should be compressed with the current compression method
+INSERT INTO cmpart VALUES (repeat('123456789', 1004));
+INSERT INTO cmpart VALUES (repeat('123456789', 4004));
+SELECT pg_column_compression(f1) FROM cmpart1;
+SELECT pg_column_compression(f1) FROM cmpart2;
+
+-- test expression index
+CREATE TABLE cmdata2 (f1 TEXT COMPRESSION pglz, f2 TEXT COMPRESSION zstd);
+CREATE UNIQUE INDEX idx1 ON cmdata2 ((f1 || f2));
+INSERT INTO cmdata2 VALUES((SELECT array_agg(fipshash(g::TEXT))::TEXT FROM
+generate_series(1, 50) g), VERSION());
+
+-- test cross-method operations (zstd <-> lz4 if available)
+-- This tests interaction between all three compression methods
+SELECT enumvals @> '{lz4}' AS has_lz4 FROM pg_settings WHERE
+ name = 'default_toast_compression' \gset
+\if :has_lz4
+CREATE TABLE cmdata_lz4(f1 TEXT COMPRESSION lz4);
+INSERT INTO cmdata_lz4 VALUES(repeat('1234567890', 1004));
+SELECT pg_column_compression(f1) FROM cmdata_lz4;
+
+-- copy from zstd to lz4 table
+CREATE TABLE cmmove4(f1 text COMPRESSION lz4);
+INSERT INTO cmmove4 SELECT * FROM cmdata_zstd;
+SELECT pg_column_compression(f1) FROM cmmove4;
+
+-- copy from lz4 to zstd table
+CREATE TABLE cmmove5(f1 text COMPRESSION zstd);
+INSERT INTO cmmove5 SELECT * FROM cmdata_lz4;
+SELECT pg_column_compression(f1) FROM cmmove5;
+\else
+\echo '*** skipping LZ4 cross-method tests (lz4 not supported) ***'
+\endif
+
+-- check data is ok
+SELECT length(f1) FROM cmdata_pglz;
+SELECT length(f1) FROM cmdata_zstd;
+\if :has_lz4
+SELECT length(f1) FROM cmdata_lz4;
+\endif
+SELECT length(f1) FROM cmmove1;
+SELECT length(f1) FROM cmmove2;
+SELECT length(f1) FROM cmmove3;
+\if :has_lz4
+SELECT length(f1) FROM cmmove4;
+SELECT length(f1) FROM cmmove5;
+\endif
+
+-- test parallel workers with ZSTD (if supported)
+CREATE TABLE parallel_zstd_test (id int, data text COMPRESSION zstd);
+INSERT INTO parallel_zstd_test SELECT i, repeat('x' || i::text, 3000) FROM generate_series(1, 100) i;
+SELECT count(*), avg(length(data)) FROM parallel_zstd_test;
+SELECT count(*), sum(length(substring(data, 1, 50))) FROM parallel_zstd_test;
+DROP TABLE parallel_zstd_test;
+
+-- test COPY with ZSTD compressed data
+CREATE TABLE copy_zstd_test (id int, data text COMPRESSION zstd);
+INSERT INTO copy_zstd_test VALUES (1, repeat('copydata', 2000));
+\copy copy_zstd_test TO '/tmp/zstd_copy_test.dat'
+TRUNCATE copy_zstd_test;
+\copy copy_zstd_test FROM '/tmp/zstd_copy_test.dat'
+SELECT id, length(data), pg_column_compression(data) FROM copy_zstd_test;
+DROP TABLE copy_zstd_test;
+
+\set HIDE_TOAST_COMPRESSION true
--
2.39.3 (Apple Git-146)
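
For readers following the varatt.h hunk above, here is a minimal standalone sketch of the tag dispatch it introduces. It is not the patch's code: the `demo_` names are hypothetical, the struct is only a stand-in for `varatt_external` (with `Oid` modelled as `uint32_t`), and the tag values are copied from the hunk. The point it illustrates is that both on-disk tags share the same 16-byte pointer payload, so the tag byte alone distinguishes a ZSTD pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values copied from the varatt.h hunk in this patch. */
typedef enum demo_vartag
{
	DEMO_VARTAG_INDIRECT = 1,
	DEMO_VARTAG_EXPANDED_RO = 2,
	DEMO_VARTAG_EXPANDED_RW = 3,
	DEMO_VARTAG_ONDISK = 18,
	DEMO_VARTAG_ONDISK_ZSTD = 19
} demo_vartag;

/* Stand-in for varatt_external: four 4-byte fields, 16 bytes in total. */
typedef struct demo_varatt_external
{
	int32_t		va_rawsize;		/* original data size */
	uint32_t	va_extinfo;		/* external size (plus method bits for pglz/lz4) */
	uint32_t	va_valueid;		/* an Oid in PostgreSQL */
	uint32_t	va_toastrelid;	/* an Oid in PostgreSQL */
} demo_varatt_external;

/* The payload must stay at 16 bytes regardless of the tag. */
static_assert(sizeof(demo_varatt_external) == 16,
			  "TOAST pointer payload must stay 16 bytes");

/* Hypothetical helper: is this an on-disk TOAST pointer of either kind? */
static bool
demo_tag_is_ondisk_any(demo_vartag tag)
{
	return tag == DEMO_VARTAG_ONDISK || tag == DEMO_VARTAG_ONDISK_ZSTD;
}

/* Payload size for on-disk tags; 0 for tags this sketch does not model. */
static size_t
demo_ondisk_payload_size(demo_vartag tag)
{
	return demo_tag_is_ondisk_any(tag) ? sizeof(demo_varatt_external) : 0;
}
```

Callers that previously tested only for `VARTAG_ONDISK` (as in proto.c, pgoutput.c and varlena.c above) would need the combined check, which is what `demo_tag_is_ondisk_any()` stands for here.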
[application/octet-stream] backwards_compatibility_test.sql (13.8K, 5-backwards_compatibility_test.sql)
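
As a companion to the toast_internals.h hunk, the following sketch shows how the `tcinfo` word packs a size and a method ID, assuming the usual 30-bit split (`VARLENA_EXTSIZE_BITS` is 30 in varatt.h). The `demo_` names are hypothetical and this is only an illustration of the bit layout, not the macro from the patch. Per the comment in the hunk, zstd data is written with `TOAST_INVALID_COMPRESSION_ID` (2) in this field, and the ID 3 is rejected.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as VARLENA_EXTSIZE_BITS / VARLENA_EXTSIZE_MASK in varatt.h. */
#define DEMO_EXTSIZE_BITS 30
#define DEMO_EXTSIZE_MASK ((1U << DEMO_EXTSIZE_BITS) - 1)

/* IDs copied from the ToastCompressionId enum in this patch. */
enum
{
	DEMO_PGLZ_ID = 0,
	DEMO_LZ4_ID = 1,
	DEMO_INVALID_ID = 2,
	DEMO_ZSTD_ID = 3			/* introspection only, never stored in tcinfo */
};

/* Pack a size into the low 30 bits and a method ID into the top 2 bits. */
static uint32_t
demo_pack_tcinfo(uint32_t len, uint32_t cmid)
{
	assert(len > 0 && len <= DEMO_EXTSIZE_MASK);
	assert(cmid != DEMO_ZSTD_ID);	/* mirrors the extra Assert in the patch */
	return len | (cmid << DEMO_EXTSIZE_BITS);
}

/* Recover the size from the low 30 bits. */
static uint32_t
demo_tcinfo_size(uint32_t tcinfo)
{
	return tcinfo & DEMO_EXTSIZE_MASK;
}

/* Recover the method ID from the top 2 bits. */
static uint32_t
demo_tcinfo_method(uint32_t tcinfo)
{
	return tcinfo >> DEMO_EXTSIZE_BITS;
}
```

For example, `demo_pack_tcinfo(10040, DEMO_INVALID_ID)` sets bit 31 and leaves the size readable through `demo_tcinfo_size()`, so a reader of the header sees method ID 2 and has to rely on the vartag to learn that the data is zstd.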