public inbox for [email protected]  
help / color / mirror / Atom feed
From: Masahiko Sawada <[email protected]>
To: Tom Lane <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Aleksander Alekseev <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Chao Li <[email protected]>
Cc: Dagfinn Ilmari Mannsåker <[email protected]>
Cc: Andrey Borodin <[email protected]>
Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
Date: Wed, 25 Mar 2026 20:13:55 -0700
Message-ID: <CAD21AoCd9e5YvVWQY5JyEzK6tGLtJJWgw1+zgGpAwTOnJ7X1GQ@mail.gmail.com> (raw)
In-Reply-To: <CAD21AoC=92zAEmaGt+3U1jSD77jgdQT_HXUwPPtZEze2sJ=vWA@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<CAJ7c6TNCGT2eZ8JttAZKUhVv3jJFTHvjFw6typzSGkM5_vePow@mail.gmail.com>
	<CAD21AoCcikUgU5coyvM7PLJg5M-75LzQhm-Uxmqn_7M=DHkj+w@mail.gmail.com>
	<CAJ7c6TPu7NGrSj-V2-ns0GTNGx9cmoKQgzt-p2MpJepiGjQnYw@mail.gmail.com>
	<CAD21AoC2Ooy=W=4YjQ_GO5nEYZ2tsn43=TkLf7ysE-96ygohKg@mail.gmail.com>
	<[email protected]>
	<CAD21AoA7UMWG8YO2Edf1UWSdtaof6gOnF98bJGoz1z1JBzozuQ@mail.gmail.com>
	<CAJ7c6TMd4ebu7oha2C9QZ_aKNE5-keEJ=bE_k_-vrMuw2p1LNg@mail.gmail.com>
	<CAJ7c6TNzQES3EhzoftNUEg=mfi69SYEQ=Fh-=rr84b7zHECHeg@mail.gmail.com>
	<CAD21AoAj+KOc74wOr04oGo9ySYmf_TYkMi5fByLF80cqyZmB=g@mail.gmail.com>
	<[email protected]>
	<CAJ7c6TN7fjWDWSVmhQuNCPYLVL9cs9ifc4D_Hj6wUQd2Lw2oNQ@mail.gmail.com>
	<CAD21AoAMST1V1OE=yvg3vQn=p6DDJBAaaFDx39z=P-+Z9iXwOQ@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAD21AoAwX1D6baSGuQXm0mzPXPWB07kgaoaaahjNHHenbdY24A@mail.gmail.com>
	<CAD21AoC=92zAEmaGt+3U1jSD77jgdQT_HXUwPPtZEze2sJ=vWA@mail.gmail.com>

On Wed, Mar 25, 2026 at 6:21 PM Masahiko Sawada <[email protected]> wrote:
>
> On Wed, Mar 25, 2026 at 6:09 PM Masahiko Sawada <[email protected]> wrote:
> >
> > On Wed, Mar 25, 2026 at 5:35 PM Tom Lane <[email protected]> wrote:
> > >
> > > Tomas Vondra <[email protected]> writes:
> > > > On 3/26/26 00:40, Tom Lane wrote:
> > > >> I believe what's happening there is that in cs_CZ locale,
> > > >> "V" doesn't follow simple ASCII sort ordering.
> > >
> > > > With cs_CZ all letters sort *before* numbers, while in en_US it's the
> > > > other way around. V is not special in any way.
> > >
> > > Ah, sorry, I should have researched a bit instead of relying on
> > > fading memory.  The quirk I was thinking of is that in cs_CZ,
> > > "ch" sorts after "h":
> > >
> > > u8=# select 'h' < 'ch'::text collate "en_US";
> > >  ?column?
> > > ----------
> > >  f
> > > (1 row)
> > >
> > > u8=# select 'h' < 'ch'::text collate "cs_CZ";
> > >  ?column?
> > > ----------
> > >  t
> > > (1 row)
> > >
> > > Regular hex encoding isn't bitten by that because it doesn't
> > > use 'h' in the text form ... but this base32hex thingie does.
> > >
> > > However, your point is also correct:
> > >
> > > u8=# select '0' < 'C'::text ;
> > >  ?column?
> > > ----------
> > >  t
> > > (1 row)
> > >
> > > u8=# select '0' < 'C'::text collate "cs_CZ";
> > >  ?column?
> > > ----------
> > >  f
> > > (1 row)
> > >
> > > and that breaks "text ordering matches numeric ordering"
> > > for both traditional hex and base32hex.  So maybe this
> > > is not as big a deal as I first thought.  We need a fix
> > > for the new test though.  Probably adding COLLATE "C"
> > > would be enough.
> >
> > Thank you for the report and the analysis.
> >
> > I've reproduced the issue with "cs_CZ" collation and adding COLLATE
> > "C" to the query resolves it. It seems also a good idea to add a note
> > in the documentation too as users might face the same issue. For
> > example,
> >
> > To maintain the lexicographical sort order of the encoded data, ensure
> > that the text is sorted using the C collation (e.g., using COLLATE
> > "C"). Natural language collations may sort characters differently and
> > break the ordering.
> >
>
> Attached the patch doing the above idea.

Pushed the fix without the documentation changes to make the buildfarm
animals happy first.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





view thread (63+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  In-Reply-To: <CAD21AoCd9e5YvVWQY5JyEzK6tGLtJJWgw1+zgGpAwTOnJ7X1GQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox