Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDriN-000DXC-3h for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 21:58:51 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vDriK-00FOHM-5B for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 21:58:47 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDriI-00FOHE-SI for pgsql-hackers@lists.postgresql.org; Tue, 28 Oct 2025 21:58:46 +0000 Received: from sonic318-20.consmr.mail.gq1.yahoo.com ([98.137.70.146]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vDriD-004loP-2r for pgsql-hackers@postgresql.org; Tue, 28 Oct 2025 21:58:45 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com.au; s=s2048; t=1761688717; bh=ztZUeUCThOWXMJbX58MtJR/nEyneLiBZRzrHBVcOZTU=; h=Date:From:To:Cc:In-Reply-To:References:Subject:From:Subject:Reply-To; b=pJC1Ud+plt0FKiURbuGAErlnVrPNEikiz/Kl7qwLpg/uzHsCAu7VpOoNFu1eA0QIrL0pGsRWe/p+wFEooic1ylls45dIjVo/IYvumM2V8azW7ifft09vG85AOgzTi0FU3x6WbpBb21BzqiXOZPj2EZlbtSJXH7fTk0yA6+aZtiZD3g+WjalreZ2cBMpukw5kvcXlDTdav/8exc9yC9zQhwXXEOPa+OxVhdrOcshe603O9DGqc6G4i55zSgU5WdraRjlHqhzM4Bq345dNQQzFli6Gber2Adt7958tA1C3i+FX+alF9deQv7DNhu076657igW7ZRqUMidKjZ7DRmUvew== X-SONIC-DKIM-SIGN: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1761688717; bh=SNAIqMcTL1ODeOnNAEN954gmn5wRz8ftopSPN8mTlid=; h=X-Sonic-MF:Date:From:To:Subject:From:Subject; b=bTETRYbPOnXEsutgw3gnmDPwDO3D3xsscXrvKPmuC3duvvc4KBZZmHJ5O5pL03ochr/J0Z9A3FiV4yAUwylV1QRasqAqX/f1ayqZHRjRJpy20052lD4rAUAHLHslwEY4d+eKT/XxaxQFVcqugF/07cnCx+uzlw/rxcdgAN6gmegeLHogu3FPPkPo/FU9gIrs00uRrmKTvuub4UWUC5kc6IcWvp535+jFO/4iaJWjmPM9rWTqd6YLSTTA2xF49WQCFP2RNhOcM8mP9sRXB8CJ8qW3+RNmNN7Cvq7aiJD1LZy45GLzl7kxPKtWe3iFhVhyxyTbq8PufsnteNnTkPyoOw== X-YMail-OSG: qVulF80VM1m8qRXwZaqhd8DL3bwyAqU2rsQr2JumyfqiB2aXZgi.RkJ2.YpiQ.t hNfsM6i8hjEHhu_aEWlmPClP.ZagKVVA1bvGnhu2.bucQNM7NWWUdmhBXvpjcty29f50N3_ETZWt Hg9DypwFC71JTVMAiPP_YVY6zEJoHgxB6xRdmq6hS49AO1aWlN5g033UsWyTz7VyELQQy4cZHCAH hkC.d9Wgibi4V35WPo54JglSwNUndcdYzDoEvxiR3hw5XzuM.S70rSf8KQ6tnYq.N9YLDSmNU3ls 0Es15dFoOjrzqqJTfwfhjDQQX22OFy5_K5S06mw2q63imVVFslj5Uvqaf7rQdxG7XNXW3312b.G3 3IqM51itBq9ygP59kRcy6pIIJDuneLlRIzrv66IMigJ.8rB_q5MgGhqTpBx4BTJEA9kH6q03RJ_V VnE7K0UqgO1.CZ_w2KIaJyQTq2WYOBSVPRzIin3apIxblKfghxKFJwaPIt7b_R0iZDO5Muwf5rW9 c5lBNCPYz5fLndcvUPboFqSaGlr_7BU6rv4J_fgX9.isL4PEDZ6C_hswGOyUP7fHbuhDPtQPAbtb yn0CLU2qIfhACOWbrZvX0mujBrHn7TnowmBAdJCahIFljPMUY.0RYbIzJKIVb9qqd8R3cigZBHY8 FuYFxp5SlNaX0A2YdxCDZAx4KwLOBhiRgRmg.hfdSJ3OILJiX1xU2hL3NulUr3V_DsTyC6HzqvTc rDrkqlpWtBUghhv00ISUC45c97EhI3UtpV7bZwGPG5nvn9ExI1MdfgWvLmwrBQvqeuSqmA4AdTAT cuMV40sbL4Huph6Y6YPRLrr15MRxcwOHXNggGQUZp9lSgAtmne1wVPzBmG6zHlYRdIaaWjf48hKo pplhkkiprg19VfUCIDjP8.HJ4Ymr47bFdFLp4kgpu5Btz6eqVOpP63jUib53zYdHHcRsGJAtsz_. jyd_rctqERW8cfwkCJHv_L.xkPObpXRRTtuuGxcCGp6Jhq16J7hRrbYiKdVLKtioKCj9QDdeSOhU eHnvcTAGNLHz.Lo8HOZSslpWtHgIFtSSx2f8k98v6O2l2mt.rXZlKWiP2u1v4MFhEvkF.gZ8tFSF mjPeNr19PzVb4acNKfC5Ye8g5H9VCw559NJmPQYDV2AGAWhKUTEkpvyz2Gq5umfgfSrCaESD6jV4 dFq8tnblPZGtiHi.03YNRGdbJ.oAWUpKlHCB3AeIFT1NJUjEseQAiw8xZtCL.qbmrrDT5YiZE5W_ XH4onJzZhz9c4AmgkOAhsQBJ7OCTPNwT21mXXu6j8Jmr3eNrB5wisDXMcGCGH1BicBwCS94veJTQ H6joh6L1ptfJZ9qdMvRyBQfDq5Gzrdf3LPJXDZ72_nPoIY4CJ4jteSDL.waC9jPMNv928O_47x2p KmyWgcLFVVMSoVTVhNiw0p_NcTgCMfyTBlQ_4V87bLOhitecrYbUmAhGD7v62xN.A3cbxuhXg9zO YyGAMhQmP5.o19135O34icgvnF6yyvO_Dh28teuOPnMt_TwIHVUoZb15gYQE5jh8CLP0Lo5azZIq 4PRxhCs4LJQ.RdsamS_SMGPSbw.kZYDcAtrUCie6gPkIiHu9KRY29Y61PrN7DEQ3W9VL9p8c.87j nb4eB97wsu6X5Ba8Q8InkViA0RJlyftkh0Rob_LTZtov_YS6_KFPqpSrzdUU6qDtV8pbzumTHhbG 9zWOcwprK_4752ui1XgjHaflVDCxpac6ljdPwixXNA_8jYFOqhKnho8uhhsGVh.aEeEYL1IjeuU5 Pff9oV36JOHT.Mio7lI1RtDfInf9J.gCl6c.cAxZdk9XE.yux86AfXFfC4ZdHpAyQ_FaTrXz5tbw OguiTPtTxmphtYDRL9d4IQ46yWCo0zWQMMspR0AC7NhXEGur09eXDyhPkSBT5gJTWUcYfVJeJaFR yL2TRPVI9fnrz6SvDbSOPzMwouf88yiXbXdgLaf1pzpeTIN7kT1US._hYcCcLp0SPqPTknfXjeZD CEt5O4kkmDheK9NwzTDPPvTdyCQeI.rMAx7KPJuCz8xl1et5RrVD0aK0cNll6Pgf0iJeK_2hxgK. FW0Dae3WGjWPs_WS.A5upHqZRTdn_OCYjbpFd9vVs0_M_ALkmAiMnVc2oXl3MlW3cxdoYGTyBbmD YX9lgr7vwDbzibAPSJRfG.o7JenBjOoWJT_hom_SjYFmrLs.W2jL07bJwj0TcEKdyqk1ttEKGrt_ IlURHQ2vuJ0RqkJ7eHPd7vT0w_s.O5Q429QV4mQQlL_7yvh7s7UsLzD674wWgLtsfxNaDbyMV2Tv rox6CoDkBbQ_qCp7pM.6rx2f3LGklz3k- X-Sonic-MF: X-Sonic-ID: fdb0a166-1b5a-4662-91f6-fcc6706b030a Received: from sonic.gate.mail.ne1.yahoo.com by sonic318.consmr.mail.gq1.yahoo.com with HTTP; Tue, 28 Oct 2025 21:58:37 +0000 Date: Tue, 28 Oct 2025 21:56:32 +0000 (UTC) From: Sergey Prokhorenko To: Jelte Fennema-Nio Cc: Andrey Borodin , Masahiko Sawada , pgsql-hackers Message-ID: <969840022.1271501.1761688592366@mail.yahoo.com> In-Reply-To: References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <574624399.175025.1761290201491@mail.yahoo.com> <953203149.383019.1761345585325@mail.yahoo.com> <6F76FA61-E2DC-44EF-9504-889D9BDB4EBD@yandex-team.ru> <1154454839.957923.1761604611424@mail.yahoo.com> Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_1271500_211866713.1761688592364" X-Mailer: WebService/1.1.24652 YMailNorrin Content-Length: 11643 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk ------=_Part_1271500_211866713.1761688592364 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable First of all, I'm definitely a proponent of being able to encode UUIDsusing= base32hex in Postgres. On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko wrote: > I wanted to highlight an important discussion among the authors and contr= ibutors of RFC 9562 regarding UUID text encoding: > > https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussi= ons/17#discussioncomment-10614817 I think a very important thing to note here is that this is a github discussion, not an officially accepted RFC. I think if it was an officially accepted RFC on how to encode UUIDs then you would have a lot less pushback here. Right now your emails mostly read like you want to push your preferential format, while essentially disallowing other encodings. While base32hex seems like a good choice for UUIDv7 I see no reason to give it preferential treatment at this point in time. crockford base32 seems just as valid. And e.g. base64url[1] seems totally fine for UUID versions that have no inherent ordering like UUIDv4. And if someone comes up with a base64urlhex format you could have even shorter bit still sortable UUIDs at the expense of legibility. The main reason why a specific encoding should receive preferential treatment in Postgres, would be if it was standardized, as that would help with interoperability. At this point in time there's no such standard (not even a draft), so forcing an explicit encoding will actually reduce interoperability, because people already encode their UUIDs in various different forms. > but the discussion established that base32hex is the existing standard fo= rmat already defined in RFC 4648, Section 7, specifically designed for sort= -preserving encoding. You even reach a similar conclusion here: not choosing crockford base32, purely because it does not have an official RFC. > This context is crucial because it underscores that the uuid type, as a f= irst-class concept, deserves its own standardized text encoding. It already has! The standard text encoding is defined in RFC 4122. That's why postgres displays it as such when encoding to text. > Regarding the proposal to couple UUID encoding with the bytea type throug= h encode()/decode() functions: I understand the appeal of reusing existing = infrastructure, but this creates a conceptual mismatch. UUID is a distinct = semantic type in PostgreSQL, not merely binary data. The bytea type has exi= sted for decades without base32hex encoding, and that's worked fine, becaus= e bytea represents arbitrary binary data, not universally unique identifier= s with specific structural properties and needs. I think by far the first step is to make the encoding of UUIDs in different formats possible in Postgres. The way to do so with the least API impact (and thus as you noticed, least pushback), would be to add base32hex to the list of encoding formats in the encode/decode functions. Then combining that with UUID <-> bytea casting (which also seems totally reasonable functionality to me), would give you the functionality (but not the defaults you want). In a follow up patch I would personally be fine making the API to encode UUIDs a bit more friendly. In particular, adding an overload to the encode function that takes a UUID instead of a bytea seems reasonable to me, i.e. encode(id uuid, format text) -> text I'm currently less convinced about a decode_uuid function though. I think some perf argument (including some benchmarks) would need to be made to convince me of its usefulness. Because purely from an API friendliness lens, I feel like decode('...', 'base32hex)::uuid and decode_uuid('...', 'base32hex') rank basically the same. Once/if an accepted RFC actually defines a default shorter encoding for UUIDs we could I would definitely be in favor of adding a decode_uuid function with the default encoding configured as a default argument. As well as adding the default argument to the uuid encode overload function. ____________________________________________________ Hi=C2=A0Jelte, Here's a project for a compact, sortable text encoding Base64UUID=C2=A0spec= ifically for UUIDs, the idea for which you submitted ("base64urlhex"): https://github.com/sergeyprokhorenko/Base64UUID Where's the best place to discuss this: in this thread or starting a new on= e? Best regards,Sergey Prokhorenko =20 ------=_Part_1271500_211866713.1761688592364 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
First= of all, I'm definitely a proponent of being able to encode UUIDs
using base32hex in Postgres.

On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko
<sergeyprokhorenko@yahoo.com.au> wrot= e:
> I wanted to highlight an important discussion amo= ng the authors and contributors of RFC 9562 regarding UUID text encoding:>
> https://git= hub.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discus= sioncomment-10614817

I think a ver= y important thing to note here is that this is a github
d= iscussion, not an officially accepted RFC. I think if it was an
officially accepted RFC on how to encode UUIDs then you would have a=
lot less pushback here. Right now your emails mostly rea= d like you
want to push your preferential format, while e= ssentially disallowing
other encodings. While base32hex s= eems like a good choice for UUIDv7 I
see no reason to giv= e it preferential treatment at this point in time.
crockf= ord base32 seems just as valid. And e.g. base64url[1] seems
totally fine for UUID versions that have no inherent ordering like
UUIDv4. And if someone comes up with a base64urlhex format you= could
have even shorter bit still sortable UUIDs at the = expense of
legibility.

The main reason why a specific encoding should receive preferential
treatment in Postgres, would be if it was standardized, as th= at would
help with interoperability. At this point in tim= e there's no such
standard (not even a draft), so forcing= an explicit encoding will
actually reduce interoperabili= ty, because people already encode their
UUIDs in various = different forms.

> but the discussi= on established that base32hex is the existing standard format already defin= ed in RFC 4648, Section 7, specifically designed for sort-preserving encodi= ng.

You even reach a similar conclusio= n here: not choosing crockford
base32, purely because it = does not have an official RFC.

> Th= is context is crucial because it underscores that the uuid type, as a first= -class concept, deserves its own standardized text encoding.

It already has! The standard text encoding is define= d in RFC 4122.
That's why postgres displays it as such wh= en encoding to text.


> Regarding the prop= osal to couple UUID encoding with the bytea type through encode()/decode() = functions: I understand the appeal of reusing existing infrastructure, but = this creates a conceptual mismatch. UUID is a distinct semantic type in Pos= tgreSQL, not merely binary data. The bytea type has existed for decades wit= hout base32hex encoding, and that's worked fine, because bytea represents a= rbitrary binary data, not universally unique identifiers with specific stru= ctural properties and needs.


I t= hink by far the first step is to make the encoding of UUIDs in
different formats possible in Postgres. The way to do so with the
least API impact (and thus as you noticed, least pushback), = would be
to add base32hex to the list of encoding formats= in the encode/decode
functions. Then combining that with= UUID <-> bytea casting (which also
seems totally r= easonable functionality to me), would give you the
functi= onality (but not the defaults you want).

In a follow up patch I would personally be fine making the API to
encode UUIDs a bit more friendly. In particular, adding an over= load to
the encode function that takes a UUID instead of = a bytea seems
reasonable to me, i.e. encode(id uuid, form= at text) -> text

I'm currently less= convinced about a decode_uuid function though. I
think s= ome perf argument (including some benchmarks) would need to be
made to convince me of its usefulness. Because purely from an API
friendliness lens, I feel like decode('...', 'base32hex)::uu= id and
decode_uuid('...', 'base32hex') rank basically the= same.

Once/if an accepted RFC actuall= y defines a default shorter encoding
for UUIDs we could I= would definitely be in favor of adding a
decode_uuid fun= ction with the default encoding configured as a default
a= rgument. As well as adding the default argument to the uuid encode
overload function.

_______________________________________= _____________


Hi Jelte,

Here's a project for a compact, sortable tex= t encoding Base64UUID specifically for UUIDs, the idea fo= r which you submitted ("base64urlhex"):
h= ttps://github.com/sergeyprokhorenko/Base64UUID

Where's the best place to discuss this: in this thread or starting a new o= ne?


Best regards,
Sergey Prokhorenko



------=_Part_1271500_211866713.1761688592364--