Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCC3P-00AIdr-UU for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 07:17:39 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vCC3O-00EZ3a-QP for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 07:17:37 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCC3N-00EZ3S-Ly for pgsql-hackers@lists.postgresql.org; Fri, 24 Oct 2025 07:17:37 +0000 Received: from sonic311-23.consmr.mail.gq1.yahoo.com ([98.137.65.204]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vCC3J-003wrg-06 for pgsql-hackers@postgresql.org; Fri, 24 Oct 2025 07:17:36 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com.au; s=s2048; t=1761290248; bh=QLgQk1JSdsQobHT15M76JAg8NIDKkeLJCTK9luXv6V4=; h=Date:From:To:Cc:In-Reply-To:References:Subject:From:Subject:Reply-To; b=uko6oGupwh3YXyi+oEqNGDpa3bbmBU/Zx8aQP3hMqlSwOv7WAOIKD2mWhMP1T99m3KuD1UotofpN2CPqqvdyo+eLjSq7Zn7YYEI8KWgX+R5JMqlxYt4QuHRvFtBqLr9+FqIMUx+txlcRL16JwVpPtMNWIlee8cbn2IQV5NQknrXR7QkJMLbDljtLuCL6Zmli0WlhbhQUnXfHlNmFD2QRYzJw7VKv7c9oDSwAUUtsx8PM6d9ggQ9zmjGXY7pY/uVELiLTbunLrw1AEjd/OPkPU3lHfgKkee9bfdXww1I3W3S31lunur56Weazo2n2Mi3bxDqHtP0JI+8/LZ8hozD+pg== X-SONIC-DKIM-SIGN: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1761290248; bh=G03N7Z5kN8c7oLCFz6SJ+88klpdbXyqy/miBLDgUJj5=; h=X-Sonic-MF:Date:From:To:Subject:From:Subject; b=eRBBwb0I6LnRsj3LFmuRCC2SW+7przChZ7BPRYLyG2+5fo9FAbtYU/5fD7xDxDhNhuWFxH727RKiZj2WHaIfkS+VSdicIkV/VOeWEFmTNbbXPBDvjFy8QOZkZ3L7tIoNHF+Z4XDsOEsvuqvMStC/7HzClk3NvN4X+nGvKrJEPt3Ixat2w8zrt2WSUDVuhhH5Wmrpqp3h/AMXnjlPPFlASELhXAtN6SdOOeyUZqeY8jj3tW+7OHeAaCrMJ6hF+v6ZF+8w8HkE4Aj8EcQ/jP2nsVNZY26JgRQLJiENxo1sZl7EtSPKD5wnhF57oE9zEzW7nyvgNG89EYpMmqlaoyYn/A== X-YMail-OSG: zPEsoE4VM1ngixK3B_tFJAI2cKvZKgHCSHcZRxaEqHkmMWdH4Jwh5NAuUFKIL7d ydUWZ9tDMVU6WqGb5jKPGlv7TdFOc8nvYwc8D6WDhiyw7WQ8aK7edNMlMrq6jysCRVbV4M0HItUH NLdujs8Vq.IlnhtHMYqObvcfJmNEM2HHYxU.bZOKtGPiEf66mk.XNc7UmAM3K34pkH9FETK_NARC GlIXgKGqlgwhVC.DmFf0AYqAinQxLJscluFdWQv3tBEATuOZsYUMBh8P4bI4Wf0EaCwFQ653gWiY vXpVH.8mSw.c7zgDgYD9_.SJrKTLvflsTQVQMFe6bQDXWKSRhMd_QTpnuu.r5_GFoySBm_ZbPAJ1 PntkDue3O4sUFcVzoTxRKWI45y1zgFUQjKDbTBKB68LRa725TGcb9wXO.gB3Aiz3yjVC4LQfUBQ_ B1MqA7nXfnxDxUtak6aHUVgLtr2Fk3jvaqVFlGgIg3lQNIneC0TRiOyy7k7P6.y1Kt3JsKP1iiNW bEM3Gfz5pTdRoyNavo6brl13Nlf.BHpaVbNRSXUnEehIYwKqD0CZkhfpa1_Ns3r8EMgtdZT5ldDz 05X_2euzBsZeq65uat_VKeTdnWj.kWWCZ2xikNxVZsFDMzkE0Oftgje3YIOCvM8TzZUZ5DN.lh8a esmwRfGOo6e0s8Ttj9i79Axno1_maJ.atMgeuolbpt9n52AG0wuGAkitEd4r57T9hFxr_tYL6KRj 4F9EUHTfyCAs_hUqQDzL1n__EkZmwV_1Ly4mQbqIyhY3Bi3MtQqWfYePwtqc_raTrpPMszmMRBi8 vC0GhcvBOVaVnsfJgzGHAu_uQp4eiqVTDM_t3szjwd8N97LLBpBsDf9BtuP.RiP9u0f_UOV8v8jh ngj9gwNyUkx4edG00caKVXeu90HXPUM_oE.jyEWnt9uKXVPupHDnE01HIrrLO4G5QhTiGr3zymxb dSyxa2CBfriKPHWTPU4gJQhFFw6x03.myuqWoWO_z_qb3KcOep6qR_kIL3dsh6kkXb19wBdxP679 o79BzeI3HNIzTpic.D9JACQI5tl656yYH0vjvwuamcfHb_Xk9QtAfCOwuTpRVF0IeYirwpNj5HUW EFiOat2LOeKWYF18kDX43G63Mkqwf5U44u.ydqT9fEoUrDeVcVy8TJmeMPxFe8o00ivyYgnXWlRb pVDvnPjd.6hV6LbT.MIwf.lnBrpwjOdHhAsjqP8cLpLD0iRaxiA25tiG.ik9y3hDdp6CBmF4PhKD sHu1SchDRmoBHdDhA62LsOsYBCVBQc2yQLH9l_eE94FB512zQ.3m1XOz8p6rYNoOOsjaTxBmlfeu b612p.5k0CTmEIAWd0RBYg_6hTtK_X0edSXpNIXS6emEsWeTjTsNdJXSTH42BaWE.5PgbYtMzzEM oop4xYAnDBTgPxqJhk3etCSciE57wxtvozmLXZ1j6nxzt2yXTgDfZadrTaEqbsQ8h3ztai63alWT VRtDT0NT1zHqu7I1m.FocbUEx0Av8puU5FOeayV4XR47DCewUrvd7DUaChKu4jDq58bkZ4eUixRD QK.PSjBo3CynuR1oFys1p.BRfiZvo6rrqxDTaWqeQLsqAVpLgR3D_W10M1LgQ5qnr_ue9JqSz.uw l2C7QpkGlr8fxIVzL_ibfku4ElVd7ZhjUyMK5z88mqJi.Nn9_5ANiGHsHjjVP_ZC8cNtWDoT47RW 8WDyT3Z0OiW95TzREVwMJ27DSQbEcgFudJ_xAR05ri5wjEonKJPgsQculV.gsuJnmGL65CF8UWBh I_MFipjboLUcYo9Dcr5_b5urASfPl9IGZXoDcK8vG92QHGE2ns2jtDvTdADKO.IKz2CBp3lkERdb hghiVU3w2wVr0qEuAqzQAcqAc2AQSvjj2nKaI8rEVfi2MP.IFmkVOLOytbywX6RPYkhRV0EDyIzM o57agK1f5T5Q0BoHhtUS37imix2YM4a11PVewz7pjncO3.ZfatATVB_KgLfwWiARpemRs6IcVoof hm4kRbw85U35yokqd6mNwO8sPRiQAGiMLfhDgFUZxZpXRNk.N_FG.1zN.u8bq2woqNFiaJXcb5h2 IrmBMqEYSKfgWNDKXORljrYm9WjEjysJpF1Nu6u1SpAKRwww2ul6Jl9IyuNgu4v4U_9YR3mTuHWN LBf47UMKjHklYO7BvVo5GvQURS5PwiHJAjlVmGBOA8tNP0xwDfgDmxfP.9HA7lRYeL4oMxPp6N0h W8PbGVx.q_qxnM4ewkZDO.dAl.Iysr0g.gAjFoOHlbBfzMPdyPuiyx6kB4F2sIFI5yT7UA_BJAc4 A3814sg-- X-Sonic-MF: X-Sonic-ID: 944caab0-7d6f-41c7-b84a-e9d81556b8cf Received: from sonic.gate.mail.ne1.yahoo.com by sonic311.consmr.mail.gq1.yahoo.com with HTTP; Fri, 24 Oct 2025 07:17:28 +0000 Date: Fri, 24 Oct 2025 07:16:41 +0000 (UTC) From: Sergey Prokhorenko To: Masahiko Sawada Cc: Andrey Borodin , pgsql-hackers Message-ID: <574624399.175025.1761290201491@mail.yahoo.com> In-Reply-To: References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_175024_785795272.1761290201488" X-Mailer: WebService/1.1.24652 YMailNorrin Content-Length: 22245 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk ------=_Part_175024_785795272.1761290201488 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Thu, Oct 23, 2025 at 3:46=E2=80=AFPM Sergey Prokhorenko wrote: > > > Given that what uuid_to_base32hex() actually does is encoding the > input UUID,=C2=A0 I find that it could be confusing if we have a similar > function other than encode() function. Also, we could end up > introducing as many encoding and decoding functions dedicated for UUID > as we want to support encoding methods, bloating the functions. > > > So as the first step, +1 for supporting base32hex for encode() and > decode() functions and supporting the UUID <-> bytea conversion. I > believe it would cover most use cases and the cost of UUID <-> bytea > conversion is negligible. > > > Regards, > > > -- > > Masahiko Sawada > > Amazon Web Services:=C2=A0https://aws.amazon.com > > > Masahiko, > > I see you're in favor of base32hex encoding. That's great! > > Your arguments make sense, and I generally support enhancing the standard= encode() and decode() functions to handle base32hex. It seems like the rig= ht approach from a developer experience standpoint. > > However, I'm unclear about some implementation aspects. Why add conversio= ns between UUID and bytea data types? Wouldn't that require creating dedica= ted UUID <-> bytea conversion functions? Instead, could we implement encode= () as polymorphic to handle UUID type inputs directly? For decode(), we'd n= eed=C2=A0 some way (a parameter?) to specify the UUID output type instead o= f bytea. Another option would be automatic type casting when inserting byte= a data into UUID columns. Neither an extra parameter nor additional type ca= sting seems ideal to me, though I don't have better alternatives. While we can implement something like decode(uuid, text), I don't think we can implement decode() in the way you proposed unless I'm missing something. I think the conversion support between UUID and bytea is useful in general, not limited to encode()/decode() support. And users would be able to create wrapper functions if they don't want to add casting for every encode() and decode() calls. For example, create function uuid_to_base32(uuid) returns text language sql immutable st= rict begin atomic =C2=A0 =C2=A0 select encode($1::bytea, 'base32hex'); end; Since such functions are inlineable, the different between executing encode(uuid_data::bytea, 'base32hex') and encode(uuid_data, 'base32hex') would only be the conversion; one palloc and one memcpy. > But actually, for a short UUID text encoding to succeed, it's more import= ant that it becomes the single, de facto standard. We should avoid supporti= ng multiple encodings, just as the authors and contributors of RFC 9562 did= :=C2=A0https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/dis= cussions/17#discussioncomment-10614817=C2=A0 =C2=A0 Therefore, whenever pos= sible, encode() and decode() should support just one UUID text encoding, na= mely base32hex. I guess it's ultimately the developer's choice, no? For example, if they are using multiple databases (or data processing platforms) in their system and 'hex' is the only encoding that all components can encode and decode, they might choose 'hex' encoding. Regards, -- Masahiko Sawada Amazon Web Services:=C2=A0https://aws.amazon.com ____________________________________________ Masahiko, Developers will still be able to use the long canonical 'hex' UUID format f= or compatibility. But the short format is not a developer choice, but a con= vention. We mustn't allow a situation where 25% of systems use base32hex, 2= 5% use Crocksford's Base32, 25% use base36, and 25% even use erroneously so= rted base64. That's a very real nightmare. You, too, have every reason not = to want to increase the number of built-in functions in PostgreSQL. But here is a solution that I hope will satisfy everyone: encode('019535d9-3df7-79fb-b466-=E2=80=8Bfa907fa17f9e', 'uuid_to_base32hex'= ) -> 06AJBM9TUTSVND36VA87V8BVJOdecode('06AJBM9TUTSVND36VA87V8BVJO', 'base32= hex_to_uuid') -> 019535d9-3df7-79fb-b466-=E2=80=8Bfa907fa17f9e I don't see any real business need for UUID <-> bytea conversions. Best regards,Sergey Prokhorenko =20 ------=_Part_175024_785795272.1761290201488 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
On Thu, Oct 23, 2025 at 3:46=E2= =80=AFPM Sergey Prokhorenko
&l= t;sergeyprokhorenko@yahoo.com.au>= ; wrote:
>
> > Given that what uuid_to_base32hex()= actually does is encoding the
> input UUID,  I find that it could be confusing if we have a simi= lar
> function other than e= ncode() function. Also, we could end up
> introducing as many encoding and decoding functions dedicate= d for UUID
> as we want to = support encoding methods, bloating the functions.
>
> = > So as the first step, +1 for supporting base32hex for encode() and

> decode() functions and suppo= rting the UUID <-> bytea conversion. I
> believe it would cover most use cases and the cost of U= UID <-> bytea
> conve= rsion is negligible.
>
> > Regards,
>
> > --
> > M= asahiko Sawada
> > Amazo= n Web Services: https://aws.amazon.com
>
>
> Masahiko,

>
> I see you're in favor of base32hex encoding. That's = great!
>
> Your arguments make sense, and I generally = support enhancing the standard encode() and decode() functions to handle ba= se32hex. It seems like the right approach from a developer experience stand= point.
>
> However, I'm unclear about some implementat= ion aspects. Why add conversions between UUID and bytea data types? Wouldn'= t that require creating dedicated UUID <-> bytea conversion functions= ? Instead, could we implement encode() as polymorphic to handle UUID type i= nputs directly? For decode(), we'd need  some way (a parameter?) to sp= ecify the UUID output type instead of bytea. Another option would be automa= tic type casting when inserting bytea data into UUID columns. Neither an ex= tra parameter nor additional type casting seems ideal to me, though I don't= have better alternatives.
While we can implement something like d= ecode(uuid, text), I don't
thi= nk we can implement decode() in the way you proposed unless I'm
missing something.

I think the= conversion support between UUID and bytea is useful in
general, not limited to encode()/decode() support= . And users would be
able to c= reate wrapper functions if they don't want to add casting for
every encode() and decode() calls. For exam= ple,

create function uuid_to_base32(uuid) returns text language s= ql immutable strict
begin atom= ic
    select encode= ($1::bytea, 'base32hex');
end;=

Since such functions are inlineable, the different between execut= ing
encode(uuid_data::bytea, '= base32hex') and encode(uuid_data,
'base32hex') would only be the conversion; one palloc and one memcpy.

> But actually, for a short UUID text encoding to succeed, it's = more important that it becomes the single, de facto standard. We should avo= id supporting multiple encodings, just as the authors and contributors of R= FC 9562 did: https://github.com/uuid6/ne= w-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-1061= 4817    Therefore, whenever possible, encode() and decode() s= hould support just one UUID text encoding, namely base32hex.

I gue= ss it's ultimately the developer's choice, no? For example, if

they are using multiple databases (or data= processing platforms) in
thei= r system and 'hex' is the only encoding that all components can
encode and decode, they might choose 'hex= ' encoding.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

_________________________________= ___________


Masahiko,

Develop= ers will still be able to use the long canonical 'hex' UUID format for comp= atibility. But the short format is not a developer choice, but a convention= . We mustn't allow a situation where 25% of systems use base32hex, 25% use = Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted ba= se64. That's a very real nightmare. You, too, have every reason not to want= to increase the number of built-in functions in PostgreSQL.

=
But here is a solution that I hope will satisfy everyone:=

encode('019535d9-3df7-79fb-b466-=E2=80= =8Bfa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJO
decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 0195= 35d9-3df7-79fb-b466-=E2=80=8Bfa907fa17f9e

I don't = see any real business need for UUID <-> bytea conversions.

Best regards,
Sergey Prokhorenko







<= div id=3D"ydpaf8c6c9cyahoo_quoted_2196900714" class=3D"ydpaf8c6c9cyahoo_quo= ted">


------=_Part_175024_785795272.1761290201488--