public inbox for [email protected]
help / color / mirror / Atom feedFrom: Masahiko Sawada <[email protected]>
To: Andrey Borodin <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Aleksander Alekseev <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Dagfinn Ilmari Mannsåker <[email protected]>
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Date: Thu, 26 Mar 2026 10:30:21 -0700
Message-ID: <CAD21AoB2Qe0VYWw_84WH6Zuip1DMNNhLpUu5b+r77CUS3yfmZg@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<CAD21AoCxy_3VW2_z5Rxc-tFsuqeyGsA-F_kD-tx6XXBC56nTCg@mail.gmail.com>
<[email protected]>
<CAD21AoAXQcZ2mMkxX6NPdFpdC-D3AhE--qyH9Se3XTrDX6x-bg@mail.gmail.com>
<[email protected]>
<CAD21AoCzEDdwpyPwA0d-QmCRe5rMz3m160SJgxMwKke85e8n0w@mail.gmail.com>
<[email protected]>
<CAD21AoCKvYCccndY4CRcwk1bOuWxJionOidEtKQWoJTqS7wL1g@mail.gmail.com>
<[email protected]>
<CAGECzQTb9pg2Qw0XmOEKaJivLJ8kdGq3Cq38OqSgwiFVD=9f8A@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAD21AoCf-0LdE=Js3ViXmKfSATJOjSNym1+80O+=-Y3X6LDMyA@mail.gmail.com>
<[email protected]>
<[email protected]>
<CAJ7c6TNCGT2eZ8JttAZKUhVv3jJFTHvjFw6typzSGkM5_vePow@mail.gmail.com>
<CAD21AoCcikUgU5coyvM7PLJg5M-75LzQhm-Uxmqn_7M=DHkj+w@mail.gmail.com>
<CAJ7c6TPu7NGrSj-V2-ns0GTNGx9cmoKQgzt-p2MpJepiGjQnYw@mail.gmail.com>
<CAD21AoC2Ooy=W=4YjQ_GO5nEYZ2tsn43=TkLf7ysE-96ygohKg@mail.gmail.com>
<[email protected]>
<CAD21AoA7UMWG8YO2Edf1UWSdtaof6gOnF98bJGoz1z1JBzozuQ@mail.gmail.com>
<CAJ7c6TMd4ebu7oha2C9QZ_aKNE5-keEJ=bE_k_-vrMuw2p1LNg@mail.gmail.com>
<CAJ7c6TNzQES3EhzoftNUEg=mfi69SYEQ=Fh-=rr84b7zHECHeg@mail.gmail.com>
<CAD21AoAj+KOc74wOr04oGo9ySYmf_TYkMi5fByLF80cqyZmB=g@mail.gmail.com>
<[email protected]>
<CAJ7c6TN7fjWDWSVmhQuNCPYLVL9cs9ifc4D_Hj6wUQd2Lw2oNQ@mail.gmail.com>
<CAD21AoAMST1V1OE=yvg3vQn=p6DDJBAaaFDx39z=P-+Z9iXwOQ@mail.gmail.com>
<[email protected]>
<[email protected]>
On Thu, Mar 26, 2026 at 1:32 AM Andrey Borodin <[email protected]> wrote:
>
>
>
> > On 26 Mar 2026, at 04:40, Tom Lane <[email protected]> wrote:
> >
> > I wonder whether this discovery puts enough of a hole in the
> > value-proposition for base32hex that we should just revert
> > this patch altogether. "It works except in some locales"
> > isn't a very appetizing prospect, so the whole idea is starting
> > to feel more like a foot-gun than a widely-useful feature.
>
> To be precise, this discovery cast shadows on argument "[base32hex is ]lexicographically sortable format that preserves temporal ordering for UUIDv7". And, actually, any UUID. But I do not think it invalidates the argument completely.
>
> It's taken from RFC[0], actually, that states:
> One property with this alphabet, which the base64 and base32
> alphabets lack, is that encoded data maintains its sort order when
> the encoded data is compared bit-wise.
>
>
> RFC does not give any other benefits.
> Personally, I like that it's compact, visually better than base64, and RFC-compliant.
> And IMO argument "base32hex is lexicographically sortable format that preserves ordering for UUID in C locale" is still very strong.
> Though, there's a little footy shooty in last 3 words.
Yeah, I still find that base32hex is useful.
As I mentioned in another email, I think we should make a note the
fact that "base32hex is lexicographically sortable format that
preserves ordering for UUID in C locale" in the documentation. I've
attached the patch. Feedback is very welcome.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
[text/x-patch] v1-0001-doc-Add-note-about-collation-requirements-for-bas.patch (1.6K, 2-v1-0001-doc-Add-note-about-collation-requirements-for-bas.patch)
download | inline diff:
From 515c666b60f7f81f6b2a004ebfb91b358188470c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Thu, 26 Mar 2026 10:17:23 -0700
Subject: [PATCH v1] doc: Add note about collation requirements for base32hex
sortability.
While fixing the base32hex UUID sortability test in commit 89210037a0a,
it turned out that the expected lexicographical order is only maintained
under the C collation (or an equivalent byte-wise collation).
Since this is not just a testing quirk but could be a real trap users
might fall into when sorting encoded data in their databases, we added
a note to the documentation to make this requirement explicitly clear.
Reviewed-by:
Discussion: https://postgr.es/m/CAD21AoAwX1D6baSGuQXm0mzPXPWB07kgaoaaahjNHHenbdY24A@mail.gmail.com
---
doc/src/sgml/func/func-binarystring.sgml | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/doc/src/sgml/func/func-binarystring.sgml b/doc/src/sgml/func/func-binarystring.sgml
index 0aaf9bc68f1..9f731d7bca0 100644
--- a/doc/src/sgml/func/func-binarystring.sgml
+++ b/doc/src/sgml/func/func-binarystring.sgml
@@ -790,6 +790,14 @@
produces a 26-character string compared to the standard 36-character
UUID representation.
</para>
+ <note>
+ <para>
+ To maintain the lexicographical sort order of the encoded data,
+ ensure that the text is sorted using the C collation
+ (e.g., using <literal>COLLATE "C"</literal>). Natural language
+ collations may sort characters differently and break the ordering.
+ </para>
+ </note>
</listitem>
</varlistentry>
--
2.53.0
view thread (63+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
In-Reply-To: <CAD21AoB2Qe0VYWw_84WH6Zuip1DMNNhLpUu5b+r77CUS3yfmZg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox