From: Murtuza Zabuawala
Subject: Re: [PATCH] Add zstd compression for TOAST using extended header format
Date: Tue, 16 Dec 2025 11:26:30 +0530
To: Dharin Shah
Cc: pgadmin-hackers@lists.postgresql.org

Hello,

You may want to consider sending the patch to the pgsql-hackers mailing list.



Murtuza Zabuawala
enterprisedb.com


On 16 Dec 2025, at 12:46 AM, Dharin Shah <dharinshah95@gmail.com> wrote:

Hello PG Hackers,

I want to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].

A bit of background: in the 2022 thread [3], the overall suggestion was to have something extensible for the TOAST header,

i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte
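
To make the 2-bit scheme concrete, here is a minimal sketch of packing and unpacking the method code. It assumes the code sits in the top 2 bits of a 32-bit word whose low 30 bits hold the external size; the enum and function names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: low 30 bits = external size, top 2 bits = method code. */
#define EXTSIZE_BITS 30
#define EXTSIZE_MASK ((UINT32_C(1) << EXTSIZE_BITS) - 1)

static_assert(EXTSIZE_BITS + 2 == 32, "size and method code must fill 32 bits");

/* Hypothetical names for the four 2-bit codes listed above. */
enum cmid {
    CMID_PGLZ = 0,      /* 00 */
    CMID_LZ4 = 1,       /* 01 */
    CMID_RESERVED = 2,  /* 10 */
    CMID_EXTENDED = 3   /* 11: the real method lives in an extra type byte */
};

/* Combine an external size and a 2-bit method code into one word. */
static inline uint32_t pack_extinfo(uint32_t extsize, enum cmid method)
{
    return (extsize & EXTSIZE_MASK) | ((uint32_t) method << EXTSIZE_BITS);
}

/* Extract the 2-bit method code from the top of the word. */
static inline enum cmid extinfo_cmid(uint32_t extinfo)
{
    return (enum cmid) (extinfo >> EXTSIZE_BITS);
}

/* Extract the 30-bit external size from the bottom of the word. */
static inline uint32_t extinfo_size(uint32_t extinfo)
{
    return extinfo & EXTSIZE_MASK;
}
```

A reader that sees CMID_EXTENDED would then look at the additional type byte instead of treating the 2-bit code as the final answer.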

This patch implements that idea.
The new header format:

  struct varatt_external_extended {
      int32   va_rawsize;     /* same as legacy */
      uint32  va_extinfo;     /* cmid=3 signals extended format */
      uint8   va_flags;       /* feature flags */
      uint8   va_data[3];     /* va_data[0] = compression method */
      Oid     va_valueid;     /* same as legacy */
      Oid     va_toastrelid;  /* same as legacy */
  };
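
As a quick sanity check on the sizes, the following standalone sketch places the struct next to a 16-byte legacy layout and asserts both sizes at compile time. The typedefs are stand-ins for the PostgreSQL ones, and the name varatt_external_legacy and the helper extended_method are hypothetical, used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs so the sketch compiles on its own. */
typedef int32_t  int32;
typedef uint32_t uint32;
typedef uint8_t  uint8;
typedef uint32_t Oid;

/* Hypothetical name for the existing 16-byte pointer layout. */
struct varatt_external_legacy {
    int32   va_rawsize;
    uint32  va_extinfo;
    Oid     va_valueid;
    Oid     va_toastrelid;
};

/* Layout as given in the message. */
struct varatt_external_extended {
    int32   va_rawsize;     /* same as legacy */
    uint32  va_extinfo;     /* cmid=3 signals extended format */
    uint8   va_flags;       /* feature flags */
    uint8   va_data[3];     /* va_data[0] = compression method */
    Oid     va_valueid;     /* same as legacy */
    Oid     va_toastrelid;  /* same as legacy */
};

/* The flags byte plus three data bytes fill exactly one 4-byte slot. */
static_assert(sizeof(struct varatt_external_legacy) == 16, "legacy pointer is 16 bytes");
static_assert(sizeof(struct varatt_external_extended) == 20, "extended pointer is 20 bytes");

/* Hypothetical helper: read the compression method byte. */
static inline uint8 extended_method(const struct varatt_external_extended *p)
{
    return p->va_data[0];
}
```

Because va_flags and va_data[3] together occupy four bytes, the two Oid fields stay 4-byte aligned and the compiler needs no padding.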

A few notes:

- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (it defaults to off for compatibility; enable it if you want consistency).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.
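
The rules in these notes can be sketched as two small functions: one that picks the pointer format when writing, and one that maps a tag to the size to read. This is only an illustration under stated assumptions; the tag values, enum names and function names are hypothetical, and the message does not give the patch's actual vartag numbers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tags; the real vartag values are not stated in the message. */
enum ptr_tag {
    TAG_ONDISK_LEGACY = 1,    /* 16-byte pointer */
    TAG_ONDISK_EXTENDED = 2   /* 20-byte pointer */
};

/* Hypothetical method ids for external TOAST data. */
enum method { METHOD_PGLZ, METHOD_LZ4, METHOD_ZSTD };

/* Write side: zstd always needs the extended format; pglz/lz4 use it
 * only when the use_extended_toast_header setting is on. */
static inline enum ptr_tag choose_tag(enum method m, bool use_extended_toast_header)
{
    if (m == METHOD_ZSTD)
        return TAG_ONDISK_EXTENDED;
    return use_extended_toast_header ? TAG_ONDISK_EXTENDED : TAG_ONDISK_LEGACY;
}

/* Read side: the tag alone decides how many pointer bytes to read. */
static inline size_t pointer_payload_size(enum ptr_tag tag)
{
    switch (tag) {
        case TAG_ONDISK_LEGACY:
            return 16;
        case TAG_ONDISK_EXTENDED:
            return 20;
    }
    return 0;   /* unknown tag */
}
```

Keeping the read path keyed on the tag alone is what lets existing 16-byte pointers remain readable regardless of how the setting is configured later.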

The 4 extra bytes per pointer are negligible for typical TOAST data sizes, and they give us room to grow.

Regards,
Dharin
<zstd-toast-compression-external.patch>
