Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3CEs-000zur-3B for pgsql-hackers@arkaria.postgresql.org; Thu, 19 Mar 2026 12:12:35 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w3CEr-000NbZ-1W for pgsql-hackers@arkaria.postgresql.org; Thu, 19 Mar 2026 12:12:33 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w3CEq-000NbQ-3B for pgsql-hackers@lists.postgresql.org; Thu, 19 Mar 2026 12:12:33 +0000 Received: from mail-qt1-x836.google.com ([2607:f8b0:4864:20::836]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w3CEn-00000000YKU-1bF1 for pgsql-hackers@postgresql.org; Thu, 19 Mar 2026 12:12:31 +0000 Received: by mail-qt1-x836.google.com with SMTP id d75a77b69052e-5094ba0af1aso9665981cf.0 for ; Thu, 19 Mar 2026 05:12:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773922350; cv=none; d=google.com; s=arc-20240605; b=R0CpJDlkqfdLE5l4HwkRDphxTUHkvgt2eeB0jHZ3ldl7VfWAqL4/5MqmO+pIuKwhcx D30ES8UPSUqPejgoxDmYw7sxkXgUowU482RamIhJtS+e6KYtJsHGkFAZJ/VvHBP9qSpa E54dDiAlaJLkRvF50qj3EdnkJI5UeitI+K6Q8Bd1DpCJipsmzeUr+5/+tNPeRVvM/xbl rYAgb+0vBOn8C5nlBpQbPOPbo3r7NRG1gA4whxhwQ+QsQ1qMmaopHG7tCP42ebud7Na1 atwSsArQ6phRv3qN54kcIAGTNksSOlY7af/0jECOAgtB24Flcx2JsPnD6z7yz7eO2Ddi kUiw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=KZiArpREDsg7Ys+/nTyYSzddD352RtiiRMYzvXEnIAE=; fh=EZUn7Ezlb+zp3Ldot09NKsBo1uFoK70H4pr7CE4erB0=; b=F4QA1U66frlnKnsErDV8fCovh1pwXkmdRzr/ornt3lVTOfDX6cIkRo6kFgRq83rhPD FPdbo8j95Xao0X1SmuzWTCtBcMojtUGjdINfQjeeb79RjmuNc9msWSLH54CMUE9ZVtMi dJf/TrW9Hs2+J50kxBIrbN04fzWD9xdVthZDcQBvIBAvdReMSK7m7J8TJQ9li5fXHmVN vlfpISrgpCdb9rBwTMrKmdt4ybnnJJJiccUVJFd5C1nTQ88RPqj7/311pUjYkNjLXgVs 482JRd+i1s09+d/cQ7INenIAnIot4vUHANpexeZtHslimYmTkp9yINIVJjOATkkUSf5U h6jw==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773922350; x=1774527150; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KZiArpREDsg7Ys+/nTyYSzddD352RtiiRMYzvXEnIAE=; b=H6s1KLwR6eg1rs/qaBnbbIv+gE5K7T29xbDwqeZD/VOmCVi0YHsx0NaWwzWSFK2kO9 EhBuCKAGw+JnZOg209ue5tDlDhfTXkAi3M6jGAe9Gma/gmUkb7ZO8T6pZDU3mcrc7r3a 2V+LhxJ//6JjyGqc/YRuKOvB3Y7+DF4acvFlDGKx/rjmBbG/DN2lH+a0iyDhPVakrLYu ZDNhwWIVDY02HdtBAtR/v9OwXFzB4rsFvPB+HYw+CJBw5TXHo4vhXdwFDjwduDxS3tWr rLl6/45APIsey4rfPhZB7w9odVfxQfMSIEYid3ZnvUm9fxe5vTxY5gsu+NlxV7xXYr8w pWiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773922350; x=1774527150; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=KZiArpREDsg7Ys+/nTyYSzddD352RtiiRMYzvXEnIAE=; b=CgCZ8QZE13i+L3Q+RXu+PJaaaYD7bjJ8k+fmRVBAbrEXszNMs0jIT+P6rVIi2ff7/v QXWY2XjDp/581wwvCsSMjosxPcSS6RNksr/VK9BvKCrUtELG5SpK04CyGZ3qD/QmZfTx ja1qAQsAEF0hTHDj4o7oMNc0N5zHq6/eRF24LE7zy6sxjvFTjHuPA7HwJbiSSVjGinkf GoVASXBRA2t56v91oCcBIf6LzCWijDOAzLpRH3ZEtm3lN8lPu27CDtfGqKw7drN0BO5f iFbAFQmttgqedSLsEhSk+sw4TavTDvS/q9omJ3Z5LEGSp0ORKD3WNJuTGYsdHEmDjvgk 810g== X-Forwarded-Encrypted: i=1; AJvYcCWXIfJTUqfhjPTapUik60VFEeBnyuj6T5Ig9Mqab3yz7uDKSnGM0YSC8pU1Ap/W3rAkzXtDNfukIJktjKH1@postgresql.org X-Gm-Message-State: AOJu0Yx6K6KpF/aZjkIz6JkCBivYH51g7Du9m2laiPKwSbBrNbO629BW J/jL9y8wZ14uhLCgDT4LwGfbmF6BCO3JdKmgxn0PmL05yuHbIeMQ9q95hiSPb4eSvKl9vRsNn7C 0zvW8mFT4zvhyoiGRw77FYtBpANcE16I= X-Gm-Gg: ATEYQzy0ZC8fpIXzfhSRQU9eqwA62HSNk5m7toEzHTFrXAbYk4acUvWWrB/Wio4ojqF geDCkpI+6S2IYZ6uE0Iqo0hnzcouvs1oOBBiuU1lxyZNI/waVtgMCVGiW+XGz/kMDtcEQGG5Qud m2d1JkGV4YChIDcF9T2rCEsX87bnk8gBI+ZWVIqD/sFIQYhXb1Cf6LdFkNscdMwNUQORrTOhJr7 hRLDqjbOoGsItU4tBJ/Zzeg937QKxPqE0bEqtkiCOQiCJKA97lwGTbXDlk4/SnG+M7MI6ezq4jo 6hYaK7U= X-Received: by 2002:ac8:57d2:0:b0:509:4b11:6cf4 with SMTP id d75a77b69052e-50b14745e07mr96116891cf.5.1773922349911; Thu, 19 Mar 2026 05:12:29 -0700 (PDT) MIME-Version: 1.0 References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <574624399.175025.1761290201491@mail.yahoo.com> <953203149.383019.1761345585325@mail.yahoo.com> <6F76FA61-E2DC-44EF-9504-889D9BDB4EBD@yandex-team.ru> <1154454839.957923.1761604611424@mail.yahoo.com> <87ldkv8cog.fsf@wibble.ilmari.org> <87ikfz7zcu.fsf@wibble.ilmari.org> <9CB824CC-70DC-4165-AC6A-9664F47209EE@yandex-team.ru> In-Reply-To: From: Chengxi Sun Date: Thu, 19 Mar 2026 20:12:17 +0800 X-Gm-Features: AaiRm51eXha4DVA0rb-OQhfDGhF78yX3Fr2UFyzj97rFVJavGWIYx-JiQtLIcgQ Message-ID: Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions To: Aleksander Alekseev Cc: Masahiko Sawada , pgsql-hackers , Andrey Borodin , =?UTF-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= Content-Type: multipart/alternative; boundary="000000000000869261064d5f7b68" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000869261064d5f7b68 Content-Type: text/plain; charset="UTF-8" I have a concern with base32hex_decode(). It only checks where the first = appears, but it does not validate the final group length or the required amount of padding. Because of that, some invalid inputs are accepted silently. For example: postgres=# SET bytea_output = hex; SET postgres=# SELECT '0' AS input, decode('0', 'base32hex'); input | decode -------+-------- 0 | \x (1 row) postgres=# SELECT '000' AS input , decode('000', 'base32hex'); input | decode -------+-------- 000 | \x00 (1 row) postgres=# SELECT '24=' as input , decode('24=', 'base32hex'); input | decode -------+-------- 24= | \x11 (1 row) These looks good, but if we verify that with python: % python3 - <<'PY' import base64 tests = [ "24", "24======", "0", "000", "24=", ] for s in tests: try: out = base64.b32hexdecode(s, casefold=True) print(f"{s!r} -> OK {out.hex()}") except Exception as e: print(f"{s!r} -> ERROR: {e}") PY The outputs are: '24' -> ERROR: Incorrect padding '24======' -> OK 11 '0' -> ERROR: Incorrect padding '000' -> ERROR: Incorrect padding '24=' -> ERROR: Incorrect padding I might be missing some context here, so I wanted to ask: is this behavior intentional, or would it make sense to enforce stricter validation for Base32hex input? Best regards, Chengxi Sun --000000000000869261064d5f7b68 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I have a concern with base32hex_decode(). It only checks where the first= =3D appears,=C2=A0
but it does not validate the final group length or the required amou= nt of padding.=C2=A0
Because of that, some invalid inputs are accepted silently.<= br>
For example:

postgres=3D# SET bytea_output =3D hex;SET
postgres=3D# SELECT '0' AS input, decode('0', '= base32hex');
=C2=A0input | decode
-= ------+--------
=C2=A00 =C2=A0 =C2=A0 | \x
(1 row)

post= gres=3D# SELECT '000' AS input , decode('000', 'base32h= ex');
=C2=A0input | decode
-------+--------
=C2=A0000 =C2=A0 |= \x00
(1 row)

postgres=3D# SELECT '24=3D' as input , deco= de('24=3D', 'base32hex');
=C2=A0input | decode
-------+--------
=C2=A024=3D =C2=A0 | \x11
(1= row)

These looks good, but if we verify that with python:
% python3 - <<'PY'
import b= ase64

tests =3D [
=C2=A0 =C2=A0 "24",
=C2=A0 =C2=A0 = "24=3D=3D=3D=3D=3D=3D",
=C2=A0 =C2=A0 "0",
=C2=A0= =C2=A0 "000",
=C2=A0 =C2=A0 "24=3D",
]

fo= r s in tests:
=C2=A0 =C2=A0 try:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 out =3D = base64.b32hexdecode(s, casefold=3DTrue)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 prin= t(f"{s!r} -> OK {out.hex()}")
=C2=A0 =C2=A0 except Exceptio= n as e:
=C2=A0 =C2=A0 =C2=A0 =C2=A0 print(f"{s!r} -> ERROR: {e}&= quot;)
PY

The outputs are:
'24' -> ERROR: Incorrect padding
'24=3D= =3D=3D=3D=3D=3D' -> OK 11
'0' -> ERROR: Incorrect padd= ing
'000' -> ERROR: Incorrect padding
'24=3D' ->= ; ERROR: Incorrect padding

I might be missing some context here, so I want= ed to ask: is this behavior intentional,=C2=A0
or would it make sense to enforce stricter validation for Base32hex inp= ut?=C2=A0=C2=A0

Best regards,

Chengxi Sun
--000000000000869261064d5f7b68--