Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDkhp-00F2Wy-8p for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 14:29:49 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vDkho-00DJBt-4Q for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 14:29:47 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDkhn-00DJBk-2K for pgsql-hackers@lists.postgresql.org; Tue, 28 Oct 2025 14:29:46 +0000 Received: from sonic315-54.consmr.mail.gq1.yahoo.com ([98.137.65.30]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vDkhi-004i9c-0c for pgsql-hackers@postgresql.org; Tue, 28 Oct 2025 14:29:45 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com.au; s=s2048; t=1761661778; bh=AMh2u0j/JKGYETmwIddJuSjbRsEcKwHvpbe5ENhhgZI=; h=Date:From:To:Cc:In-Reply-To:References:Subject:From:Subject:Reply-To; b=VFvtDEXgsIsE/VObut6y8eqqs+ahZGAhU2uQ3MxiS3j/DSDJyJSaIeXTvxu80/F50Kh+kodcByvU/2iq6sDOZISHzD13pvLmzLGqzl4CBRkLiftZAWcEO7JGT/pVhpV+gD54jzgBxpzLJe0bUK80Ulck+p6jCOPUp4BLuS2x6JtUFrfdLrK+ouwocEDHDbjXXJ3zAtcr5GngBJRTsqlZki04swDFmtu63TQouKCP+/4EX1+KI29IE+gA4Yn2P1u0BDDTC4kZfkyOaE9mJoSBHwQq5XMizXPLVd5GaYeLxQCLnbrqJp1S2iVDlHSGzJ5mXMPlHv4vxpch7iKqep9SoQ== X-SONIC-DKIM-SIGN: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1761661778; bh=Q1IR1RNSjaKnLgpmh/CaU7AzIapIt1Vn1OfAVmtGmhV=; h=X-Sonic-MF:Date:From:To:Subject:From:Subject; b=qaqQfx04JSoXns1/0fN8nyhcQR+yTyq3kTdp1qF+p9N1FImQvPoWL8oO0P9dkVR48wf6CA6GJg6yPDH6iK/imOqDUI9yhCEuCyDq8aYsomYmAijl1PEfQWKkr0BPr+mosGP7qZ8wiw3Ir7krmkWsmh5DcL/aYiXTRxH47fyR9GQtIu0y/g393h2cbvjRQVqTbKxztH9maV85T9SYiYK+Rh4mBhS/x9XZi2VxzZ3ddPy7I/Pm3H6j5EUypItDam94H/YLz8sOsfcINKrUvxz5ecu1zuuI04nPwPYuuV9mHAmNvupkHBuB7SC6ooq7Tcxb70crThmzWhTYSWOJscWT9g== X-YMail-OSG: MT.Q7B8VM1kimDkLOhBnrQg7esCElC3nxVV6opXH52m2.uJbUk._bj2UTUsYoj5 eTW_.ctSGxMH4IbZBG_.o8VYCfxszSAfirDZkrBCJHdpttr7KV9raEQzzgb3XEd_1n1nUEs2xos8 yLijyDLWananQP4U7GDE3VHM9jKgUdQUc085K_69FTdHvQHteKZvd8M5BEcsmFQpWioLryDNOua6 ZpzpaD9XPrsmyTYqHVdmJROgISKk.7K7Gf3S_cP8ndmwYywwVJlmJw5_G9EZkn2GCpKwG6LrXiFq SUZuwKx4wlgo_cHL.upCIR1fYP.D.vbg04Bi1MPU.enq6mATtLXMUWAGMv1Ul8o_8Wo2HNdjm2yY F5kgjPHJp4irp5XQF.8580ht4.PngwWSvxSS6Ls.tk6YV7BcPegEyxzBQ8NT9Y3yJl0yvr0VmUJY 3LDo8EPopKtaYOwjRrr_8tYjVKDC.KkLNfbmri16Cj_tk_Uz6QgucToFPZxbhFYlbFowSBkM9pAg Av72Zsk7HYRgD7Z5nrwq.83HEsQeQqc31GrilADorA34OXuhKAvMn2jAC0Hb4tMjXTxez0bSm.Su ZZ0z7jiX4lYDIX94oIcLqJgOvf69c8F4zhnPmLZpOtEjROB7zhz77r2I848hP3cbsXK0soIc7U.U _b4FB6C3scnZ4w__pfkTxFv_FXPrzwGWKG9KHtGduTT.eLZAaH6gRK5lfJLy3ReWUQ0xAP4ySPyo Y8SPhK1i7I_BR6JyVnLNLl1SVdW3Fy05F4FbxsUELNKw_oSJxt2hM7W5_mkqG4Cn2qrXx5KT.24P 2d4Pmp7P4or9N.x4ylvsHC2zNvbsjVrpZ939Ac_6JV1BzWMqTmB6NkzW19GFRaDQXgedds5nP3mc KB83C3bryeeKoe2gEZpwXIcaelni4XgsNVztsrkUQbMp39Ddj2xZRqGlD9AD_23R6sjX7xVK6676 dP1m4MIxaqG7jULXkbmqM6r2FpUfG0CwdmM67doZ_UdeWLk8xS5GiIz3HPf6ZoB6RTnW_wuKnNZR vfeH9dMVXaauhS7IYWv86brWaTeUE88eRZCaKBV2ult_UcU8xsC19p4K8nPCjF8iYGvFgn_sxMWo X.8n0nGirhGdRLXyFItECuxUqCN49HtC_gXoDmL8U5fyR3dbqWwmAVdO3dunxjaq0INt9hU.yDE2 7qoluLJi70NwYD3OGtWs1a3JLoto3aacYNeL8.pKVKAQO1IDa.JsXOG_V6hFd170klf87qAT2l6Y g.WS8R0xsEIvc0b1vIZ6riPcm0LQjpFYTVChFi5atFpY0wclzbM4yxqa2lUMeu6.rDOFKg3nv3cw F4_hPSDv8x3QMFfUe9YgaC7gBdBc6QZQq02pVc6Q2vSJknLAggTwxt2DW_nbUfIaGocB1Qp8oAJh PH6gYjqCdTACqw1egu7NPTtAx_Ub5SmZWeclHkksKjtve8K0grECvvSVfIbvEoevw1F4Uvr.cUkR o99vq7H4G.9UqCL7QuoObml1PVlgCFOS_xipDFqMyZyHDQYFtCmYTU7TcHT6nRsufVGF3xq6_8il ZIHr.ztAR1To_IjLyQlSP1RlYvIAJKheeIWgsveTUSDSfzTeF9TZG8GzMxs2j4jeXw7PGKWmxcA5 6o_2a8R2wAp2ohIzIUc85XpnoKBQkbet_eVT3JE1rsZ2xKmsflDVx3iBB.RpBhXsCNv1ADW25o20 Q5fSPuAN3KumtzhcQkgVUlsjl3yfKKcTsKEWQY0VxgI2QnfBb__tluYZlV3.0JrrUKW.KdNsmNRL EI9VKkGDewWxTGyW5alWxA7pujxGiW3lCyUm.juxHNyt4Qpuf9qTSMDmDDnk6Qo7YUv3GFT7gLP6 OGBeGlTlUl2j029Uk0S6ZZWP2b1y7iopxGA0ZmR0UgaZYdfLH22tt7sO4PliKtt50ZnenHGLQVOb xasFO6ay86lcFsooqWkaJZs_hP5BzpFyeGEOVqUbkoUzlZa_L6arN_KbAn1GnDZT5dZtSHul6wod YOKG8dmYq8j0fTnm.oSNuIDyiYUmKvinGryi_Nrg43eqNltbVpilgW2EIFrT7C3Qbs291T.gud1c Hvs7yGAifw5IbJGJz7vm84arodSM5Ip0DiAqE2kEOijCyG7nbaT_snj.CZh27ZSeaGLzDgJ3VXDi zHryhqFMxa.ezskAR70Uhi3UydUG6bUC_GQenPngiPEljdi4MyIVWKbSoyt0WKe6Bc_59bNI19RD oOpbAP_WrBU0yzD5sBf8XdGF6FIGLFqfsDIFtDIwuZKXiUIx4H_tBXAi4.tjYhAHHnLcWbwqAKRp qBvkFZTSg4zA_IZdpIkb0WDoctR0NrPuy595zTPfl X-Sonic-MF: X-Sonic-ID: d402c823-ee02-426e-86ff-f32bbfc1af68 Received: from sonic.gate.mail.ne1.yahoo.com by sonic315.consmr.mail.gq1.yahoo.com with HTTP; Tue, 28 Oct 2025 14:29:38 +0000 Date: Tue, 28 Oct 2025 14:28:54 +0000 (UTC) From: Sergey Prokhorenko To: Jelte Fennema-Nio Cc: Andrey Borodin , Masahiko Sawada , pgsql-hackers Message-ID: <1543828736.1122782.1761661734535@mail.yahoo.com> In-Reply-To: References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <574624399.175025.1761290201491@mail.yahoo.com> <953203149.383019.1761345585325@mail.yahoo.com> <6F76FA61-E2DC-44EF-9504-889D9BDB4EBD@yandex-team.ru> <1154454839.957923.1761604611424@mail.yahoo.com> Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_1122781_393993468.1761661734533" X-Mailer: WebService/1.1.24652 YMailNorrin Content-Length: 13714 List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk ------=_Part_1122781_393993468.1761661734533 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit First of all, I'm definitely a proponent of being able to encode UUIDsusing base32hex in Postgres. On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko wrote: > I wanted to highlight an important discussion among the authors and contributors of RFC 9562 regarding UUID text encoding: > > https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10614817 I think a very important thing to note here is that this is a github discussion, not an officially accepted RFC. I think if it was an officially accepted RFC on how to encode UUIDs then you would have a lot less pushback here. Right now your emails mostly read like you want to push your preferential format, while essentially disallowing other encodings. While base32hex seems like a good choice for UUIDv7 I see no reason to give it preferential treatment at this point in time. crockford base32 seems just as valid. And e.g. base64url[1] seems totally fine for UUID versions that have no inherent ordering like UUIDv4. And if someone comes up with a base64urlhex format you could have even shorter bit still sortable UUIDs at the expense of legibility. The main reason why a specific encoding should receive preferential treatment in Postgres, would be if it was standardized, as that would help with interoperability. At this point in time there's no such standard (not even a draft), so forcing an explicit encoding will actually reduce interoperability, because people already encode their UUIDs in various different forms. > but the discussion established that base32hex is the existing standard format already defined in RFC 4648, Section 7, specifically designed for sort-preserving encoding. You even reach a similar conclusion here: not choosing crockford base32, purely because it does not have an official RFC. > This context is crucial because it underscores that the uuid type, as a first-class concept, deserves its own standardized text encoding. It already has! The standard text encoding is defined in RFC 4122. That's why postgres displays it as such when encoding to text. > Regarding the proposal to couple UUID encoding with the bytea type through encode()/decode() functions: I understand the appeal of reusing existing infrastructure, but this creates a conceptual mismatch. UUID is a distinct semantic type in PostgreSQL, not merely binary data. The bytea type has existed for decades without base32hex encoding, and that's worked fine, because bytea represents arbitrary binary data, not universally unique identifiers with specific structural properties and needs. I think by far the first step is to make the encoding of UUIDs in different formats possible in Postgres. The way to do so with the least API impact (and thus as you noticed, least pushback), would be to add base32hex to the list of encoding formats in the encode/decode functions. Then combining that with UUID <-> bytea casting (which also seems totally reasonable functionality to me), would give you the functionality (but not the defaults you want). In a follow up patch I would personally be fine making the API to encode UUIDs a bit more friendly. In particular, adding an overload to the encode function that takes a UUID instead of a bytea seems reasonable to me, i.e. encode(id uuid, format text) -> text I'm currently less convinced about a decode_uuid function though. I think some perf argument (including some benchmarks) would need to be made to convince me of its usefulness. Because purely from an API friendliness lens, I feel like decode('...', 'base32hex)::uuid and decode_uuid('...', 'base32hex') rank basically the same. Once/if an accepted RFC actually defines a default shorter encoding for UUIDs we could I would definitely be in favor of adding a decode_uuid function with the default encoding configured as a default argument. As well as adding the default argument to the uuid encode overload function. ______________________________________________________________________________________________ Hi Jelte, I agree with your points. I believe we should put the discussion about compact UUID text encoding in PostgreSQL on hold for now. None of the proposed solutions has sufficient unconditional support from the participants. It makes sense to pause this discussion for more in-depth exploration to try and reach a consensus. Jelte, I particularly liked your idea of a new, dedicated, standardized encoding for UUIDs, base64urlhex, and dedicated encoding/decoding functions for this encoding in PostgreSQL. I will try to develop such an encoding and submit it for discussion. I suggest calling it base64uuid. My current attempt to establish base32hex as a de facto standard (even prior to an RFC) was unsuccessful. However, I remain convinced, like the authors of RFC 9562, that there should be only one standard compact encoding for UUIDs. Therefore, we must continue efforts to standardize such an encoding. As for Crockford's Base32, it was rejected because of a lack of support in standard programming language libraries. Otherwise, it's just as good as base32hex. Best regards,Sergey Prokhorenko ------=_Part_1122781_393993468.1761661734533 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
First= of all, I'm definitely a proponent of being able to encode UUIDs
usi= ng base32hex in Postgres.

On Mon, 27 O= ct 2025 at 23:37, Sergey Prokhorenko
<sergeyprokhorenko@yahoo.com.au> wrote:
&g= t; I wanted to highlight an important discussion among the authors and cont= ributors of RFC 9562 regarding UUID text encoding:
>> https://github.com/uuid6/new-uuid-= encoding-techniques-ietf-draft/discussions/17#discussioncomment-10614817

I think a very important thing to no= te here is that this is a github
discussion, not an offic= ially accepted RFC. I think if it was an
officially accep= ted RFC on how to encode UUIDs then you would have a
lot = less pushback here. Right now your emails mostly read like you
want to push your preferential format, while essentially disallowing<= br clear=3D"none">other encodings. While base32hex seems like a good choice= for UUIDv7 I
see no reason to give it preferential treat= ment at this point in time.
crockford base32 seems just a= s valid. And e.g. base64url[1] seems
totally fine for UUI= D versions that have no inherent ordering like
UUIDv4. An= d if someone comes up with a base64urlhex format you could
have even shorter bit still sortable UUIDs at the expense of
legibility.

The main reason why = a specific encoding should receive preferential
treatment= in Postgres, would be if it was standardized, as that would
help with interoperability. At this point in time there's no such
standard (not even a draft), so forcing an explicit encoding w= ill
actually reduce interoperability, because people alre= ady encode their
UUIDs in various different forms.

> but the discussion established that bas= e32hex is the existing standard format already defined in RFC 4648, Section= 7, specifically designed for sort-preserving encoding.
<= br clear=3D"none">You even reach a similar conclusion here: not choosing cr= ockford
base32, purely because it does not have an offici= al RFC.

> This context is crucial b= ecause it underscores that the uuid type, as a first-class concept, deserve= s its own standardized text encoding.

= It already has! The standard text encoding is defined in RFC 4122.
That's why postgres displays it as such when encoding to text.

> Regarding the proposal to couple UUID enc= oding with the bytea type through encode()/decode() functions: I understand= the appeal of reusing existing infrastructure, but this creates a conceptu= al mismatch. UUID is a distinct semantic type in PostgreSQL, not merely bin= ary data. The bytea type has existed for decades without base32hex encoding= , and that's worked fine, because bytea represents arbitrary binary data, n= ot universally unique identifiers with specific structural properties and n= eeds.


I think by far the first s= tep is to make the encoding of UUIDs in
different formats= possible in Postgres. The way to do so with the
least AP= I impact (and thus as you noticed, least pushback), would be
to add base32hex to the list of encoding formats in the encode/decodefunctions. Then combining that with UUID <-> bytea c= asting (which also
seems totally reasonable functionality= to me), would give you the
functionality (but not the de= faults you want).

In a follow up patch= I would personally be fine making the API to
encode UUID= s a bit more friendly. In particular, adding an overload to
the encode function that takes a UUID instead of a bytea seems
reasonable to me, i.e. encode(id uuid, format text) -> text
I'm currently less convinced about a dec= ode_uuid function though. I
think some perf argument (inc= luding some benchmarks) would need to be
made to convince= me of its usefulness. Because purely from an API
friendl= iness lens, I feel like decode('...', 'base32hex)::uuid and
decode_uuid('...', 'base32hex') rank basically the same.

Once/if an accepted RFC actually defines a default sh= orter encoding
for UUIDs we could I would definitely be i= n favor of adding a
decode_uuid function with the default= encoding configured as a default
argument. As well as ad= ding the default argument to the uuid encode
overload fun= ction.
=
_______________________________________________
__________________________________= _____________


Hi Jelte,

I agree with your points.
<= div>
I believe we should put the discussion about compact UUI= D text encoding in PostgreSQL on hold for now. None of the proposed solutio= ns has sufficient unconditional support from the participants. It makes sen= se to pause this discussion for more in-depth exploration to try and reach = a consensus.

Jelte, I particularly liked your idea= of a new, dedicated, standardized encoding for UUIDs, base64urlhex, and de= dicated encoding/decoding functions for this encoding in PostgreSQL. I will= try to develop such an encoding and submit it for discussion. I suggest ca= lling it base64uuid.

My current attempt to establi= sh base32hex as a de facto standard (even prior to an RFC) was unsuccessful= . However, I remain convinced, like the authors of RFC 9562, that there sho= uld be only one standard compact encoding for UUIDs. Therefore, we must con= tinue efforts to standardize such an encoding.

As = for Crockford's Base32, it was rejected because of a lack of support in sta= ndard programming language libraries. Otherwise, it's just as good as base3= 2hex.


Best regards,
Serge= y Prokhorenko







------=_Part_1122781_393993468.1761661734533--