Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions

From: Chengxi Sun <[email protected]>
To: Aleksander Alekseev <[email protected]>
Cc: Masahiko Sawada <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Andrey Borodin <[email protected]>
Cc: Dagfinn Ilmari Mannsåker <[email protected]>
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Date: Thu, 19 Mar 2026 20:12:17 +0800
Message-ID: <CAMvSjCRxFgKC3JfOMSr358zGu166niRY2UqaTS_=oQcyiBArmQ@mail.gmail.com>
In-Reply-To: <CAJ7c6TOJ-aJh=dnPSwtzWsHnANi=SVZQ3bYuZLtS2w8xxVngnQ@mail.gmail.com>
References: <[email protected]>
	<CAJ7c6TOramr1UTLcyB128LWMqita1Y7=arq3KHaU=qikf5yKOQ@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAD21AoCxy_3VW2_z5Rxc-tFsuqeyGsA-F_kD-tx6XXBC56nTCg@mail.gmail.com>
	<[email protected]>
	<CAD21AoAXQcZ2mMkxX6NPdFpdC-D3AhE--qyH9Se3XTrDX6x-bg@mail.gmail.com>
	<[email protected]>
	<CAD21AoCzEDdwpyPwA0d-QmCRe5rMz3m160SJgxMwKke85e8n0w@mail.gmail.com>
	<[email protected]>
	<CAD21AoCKvYCccndY4CRcwk1bOuWxJionOidEtKQWoJTqS7wL1g@mail.gmail.com>
	<[email protected]>
	<CAGECzQTb9pg2Qw0XmOEKaJivLJ8kdGq3Cq38OqSgwiFVD=9f8A@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAD21AoCf-0LdE=Js3ViXmKfSATJOjSNym1+80O+=-Y3X6LDMyA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAJ7c6TNCGT2eZ8JttAZKUhVv3jJFTHvjFw6typzSGkM5_vePow@mail.gmail.com>
	<CAJ7c6TMZOvVcbEZ-KH8e4mDuQ6ZTfgV-BJoLX_aTd1MtjScyqA@mail.gmail.com>
	<CAD21AoCcikUgU5coyvM7PLJg5M-75LzQhm-Uxmqn_7M=DHkj+w@mail.gmail.com>
	<CAJ7c6TPu7NGrSj-V2-ns0GTNGx9cmoKQgzt-p2MpJepiGjQnYw@mail.gmail.com>
	<CAD21AoC2Ooy=W=4YjQ_GO5nEYZ2tsn43=TkLf7ysE-96ygohKg@mail.gmail.com>
	<CAJ7c6TOJ-aJh=dnPSwtzWsHnANi=SVZQ3bYuZLtS2w8xxVngnQ@mail.gmail.com>

I have a concern with base32hex_decode(). It only checks where the first
'=' appears; it does not validate the length of the final group or the
amount of padding that length requires. As a result, some invalid inputs
are silently accepted.

For example:

postgres=# SET bytea_output = hex;
SET
postgres=# SELECT '0' AS input, decode('0', 'base32hex');
 input | decode
-------+--------
 0     | \x
(1 row)

postgres=# SELECT '000' AS input , decode('000', 'base32hex');
 input | decode
-------+--------
 000   | \x00
(1 row)

postgres=# SELECT '24=' as input , decode('24=', 'base32hex');
 input | decode
-------+--------
 24=   | \x11
(1 row)

These look fine at first glance, but if we check the same kind of inputs
with Python:
% python3 - <<'PY'
import base64

tests = [
    "24",
    "24======",
    "0",
    "000",
    "24=",
]

for s in tests:
    try:
        out = base64.b32hexdecode(s, casefold=True)
        print(f"{s!r} -> OK {out.hex()}")
    except Exception as e:
        print(f"{s!r} -> ERROR: {e}")
PY

The output is:
'24' -> ERROR: Incorrect padding
'24======' -> OK 11
'0' -> ERROR: Incorrect padding
'000' -> ERROR: Incorrect padding
'24=' -> ERROR: Incorrect padding
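
To make the rule concrete, here is a rough Python sketch of the check I
have in mind. It is for illustration only and is not the patch's code:
the helper name is_strict_base32hex() is made up, and it assumes padding
is mandatory, as Python's decoder does. Per RFC 4648, the final 8-character
group may carry 2, 4, 5, or 7 data characters (for 1, 2, 3, or 4 bytes),
followed by exactly enough '=' to fill the group. The sketch does not
check that unused trailing bits are zero.

```python
# Hypothetical helper, for illustration only; not taken from the patch.
# Accepts both upper- and lower-case digits of the base32hex alphabet.
ALPHABET = set("0123456789ABCDEFGHIJKLMNOPQRSTUVabcdefghijklmnopqrstuv")

# Data characters allowed in a partial final group (RFC 4648):
# 2 chars -> 1 byte, 4 -> 2 bytes, 5 -> 3 bytes, 7 -> 4 bytes.
VALID_TAIL_LENGTHS = {2, 4, 5, 7}


def is_strict_base32hex(s: str) -> bool:
    """Return True if s is well-formed, fully padded base32hex."""
    if len(s) % 8 != 0:
        return False              # padded input is whole 8-char groups
    data = s.rstrip("=")
    pad = len(s) - len(data)
    if any(c not in ALPHABET for c in data):
        return False              # also rejects '=' before the end
    tail = len(data) % 8
    if tail == 0:
        return pad == 0           # full groups take no padding
    return tail in VALID_TAIL_LENGTHS and pad == 8 - tail


for s in ["24", "24======", "0", "000", "24="]:
    print(f"{s!r} -> {'OK' if is_strict_base32hex(s) else 'rejected'}")
```

With this check only '24======' from the list above is accepted, which
matches the Python results. Whether unpadded input such as '24' should
also be accepted is a separate design choice.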

I might be missing some context here, so I wanted to ask: is this
behavior intentional, or would it make sense to enforce stricter
validation for base32hex input?

Best regards,

Chengxi Sun
