public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
37+ messages / 7 participants
[nested] [flat]

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
@ 2025-10-23 09:55 Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 15:08 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  0 siblings, 2 replies; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-23 09:55 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Sergey Prokhorenko <[email protected]>

Hi Sergey,

> I'm writing to propose adding two new built-in functions to PostgreSQL that provide compact UUID encoding using the base32hex format.

Firstly, cc:'ing a few dozens of people is not the best way to get
attention to your patch. Please don't do this.

Secondly, in order to propose a patch please use `git format-patch`
and send it as an attachment. Then register it on the nearest open
commitfest [1].

The interface you are proposing is ugly and is not composable. The
right way of doing this IMO would be:

1. Implement uuid -> bytea and bytea -> uuid casting
2. Implement encode(bytea, 'base32') and decode(text, 'base32')

So the overall interface should be like this:

SELECT encode(uuidv7() :: bytea, 'base32');

The value of converting uuid to base32 is not obvious though, so I
would recommend explaining it in more detail. Consider starting a new
thread for each separate patch.

[1]: https://commitfest.postgresql.org/
-- 
Best regards,
Aleksander Alekseev





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-23 13:06 ` Andrey Borodin <[email protected]>
  2025-10-23 14:10   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  1 sibling, 2 replies; 37+ messages in thread

From: Andrey Borodin @ 2025-10-23 13:06 UTC (permalink / raw)
  To: Aleksander Alekseev <[email protected]>; +Cc: pgsql-hackers; Sergey Prokhorenko <[email protected]>

Hello!

> On 23 Oct 2025, at 14:55, Aleksander Alekseev <[email protected]> wrote:
> 
> Secondly, in order to propose a patch please use `git format-patch`
> and send it as an attachment. Then register it on the nearest open
> commitfest [1].

I think it's not about review yet, but more of a discussing viability and general approach.
The code itself is trivial in this case.

My first reaction was very skeptical too. Yes, this representation (28V4APV8JC9D792M89J185Q000) seems more developer-friendly than default (123e4567-e89b-12d3-a456-426614174000). But why should we bother with propagating one data format over another?

Yet, this format is RFC-blessed. It makes sense to consider providing an alternative to unfriendly format.

> The interface you are proposing is ugly and is not composable. The
> right way of doing this IMO would be:
> 
> 1. Implement uuid -> bytea and bytea -> uuid casting
> 2. Implement encode(bytea, 'base32') and decode(text, 'base32')
> 
> So the overall interface should be like this:
> 
> SELECT encode(uuidv7() :: bytea, 'base32');

That's an excellent feedback! Would such conversion be idiomatic for Postgres users?
Are there any other alternative approaches?

> The value of converting uuid to base32 is not obvious though, so I
> would recommend explaining it in more detail.

Yes, and maybe some examples of other systems that adopted this format would be handy too. Sergey, can you, please, extend reasoning why this particular format is prominent? RFC 4648 describes a bunch of formats.

> Consider starting a new
> thread for each separate patch.

I think this thread is fine for discussing.

Thank you!


Best regards, Andrey Borodin.




^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-23 14:10   ` Jelte Fennema-Nio <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-23 14:10 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Aleksander Alekseev <[email protected]>; pgsql-hackers; Sergey Prokhorenko <[email protected]>

On Thu, 23 Oct 2025 at 15:07, Andrey Borodin <[email protected]> wrote:
> > SELECT encode(uuidv7() :: bytea, 'base32');
>
> That's an excellent feedback! Would such conversion be idiomatic for Postgres users?
> Are there any other alternative approaches?

Agreed that extending the encode function is the way to go. An example
of that is the recently added support for base64url:
https://git.postgresql.org/cgit/postgresql.git/commit/?h=REL_18_0&id=e1d917182c1953b16b32a39ed2f...

> > The value of converting uuid to base32 is not obvious though, so I
> > would recommend explaining it in more detail.
>
> Yes, and maybe some examples of other systems that adopted this format would be handy too. Sergey, can you, please, extend reasoning why this particular format is prominent? RFC 4648 describes a bunch of formats.

I've definitely used base32 to encode uuids myself. The primary
benefit being shorter strings, while still being able to spell them
out by voice to people without having to specify whether a letter is
upper or lowercase.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-23 17:34   ` Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  1 sibling, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-23 17:34 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: pgsql-hackers

>> The value of converting uuid to base32 is not obvious though, so I>> would recommend explaining it in more detail.

> Yes, and maybe some examples of other systems that adopted this format would be handy too.

DNSSEC (https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions)
many encoders and decoders
> Sergey, can you, please, extend reasoning why this particular format is prominent? RFC 4648 describes a bunch of formats.

> Best regards, Andrey Borodin.


Base32hex:1. Preserves sort order (unlike base64)2. Compact3. Standardized and therefore implemented consistently everywhere4. Implemented in many programming languages' standard libraries5. Does not require specifying character case during dictation6. Has simple and high-performance encoding and decoding algorithms (necessary for system integration using JSON)
The only compact text encoding eliminates the problem of incompatibility. The authors and contributors of RFC 9562 were categorically against having multiple encodings for UUIDs. They wanted to have only one compact, sort-order-preserving text encoding. For compatibility, they added the canonical UUID format. Due to time constraints, the compact encoding was not included in RFC 9562.
In databases, UUIDs should preferably be stored in binary format (the UUID type in PostgreSQL) according to RFC 9562.
Intermediate formats (bytea) reduce performance, which is the very reason we even abandoned the more compact base36 encoding.




  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-23 21:00     ` Masahiko Sawada <[email protected]>
  2025-10-23 22:45       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  0 siblings, 2 replies; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-23 21:00 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

On Thu, Oct 23, 2025 at 10:34 AM Sergey Prokhorenko
<[email protected]> wrote:
>
> >> The value of converting uuid to base32 is not obvious though, so I
> >> would recommend explaining it in more detail.
>
> > Yes, and maybe some examples of other systems that adopted this format would be handy too.
>
> DNSSEC (https://en.wikipedia.org/wiki/Domain_Name_System_Security_Extensions)
> many encoders and decoders
>
> > Sergey, can you, please, extend reasoning why this particular format is prominent? RFC 4648 describes a bunch of formats.
>
>
> > Best regards, Andrey Borodin.
>
>
> Base32hex:
> 1. Preserves sort order (unlike base64)
> 2. Compact
> 3. Standardized and therefore implemented consistently everywhere
> 4. Implemented in many programming languages' standard libraries
> 5. Does not require specifying character case during dictation
> 6. Has simple and high-performance encoding and decoding algorithms (necessary for system integration using JSON)
>
> The only compact text encoding eliminates the problem of incompatibility. The authors and contributors of RFC 9562 were categorically against having multiple encodings for UUIDs. They wanted to have only one compact, sort-order-preserving text encoding. For compatibility, they added the canonical UUID format. Due to time constraints, the compact encoding was not included in RFC 9562.
>
> In databases, UUIDs should preferably be stored in binary format (the UUID type in PostgreSQL) according to RFC 9562.
>
> Intermediate formats (bytea) reduce performance, which is the very reason we even abandoned the more compact base36 encoding.

Given that what uuid_to_base32hex() actually does is encoding the
input UUID,  I find that it could be confusing if we have a similar
function other than encode() function. Also, we could end up
introducing as many encoding and decoding functions dedicated for UUID
as we want to support encoding methods, bloating the functions.

So as the first step, +1 for supporting base32hex for encode() and
decode() functions and supporting the UUID <-> bytea conversion. I
believe it would cover most use cases and the cost of UUID <-> bytea
conversion is negligible.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-23 22:45       ` Sergey Prokhorenko <[email protected]>
  2025-10-24 04:57         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  1 sibling, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-23 22:45 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

> Given that what uuid_to_base32hex() actually does is encoding theinput UUID,  I find that it could be confusing if we have a similar
function other than encode() function. Also, we could end up
introducing as many encoding and decoding functions dedicated for UUID
as we want to support encoding methods, bloating the functions.

> So as the first step, +1 for supporting base32hex for encode() and
decode() functions and supporting the UUID <-> bytea conversion. I
believe it would cover most use cases and the cost of UUID <-> bytea
conversion is negligible.

> Regards,

> -- 
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com


Masahiko,
I see you're in favor of base32hex encoding. That's great!
Your arguments make sense, and I generally support enhancing the standard encode() and decode() functions to handle base32hex. It seems like the right approach from a developer experience standpoint.
However, I'm unclear about some implementation aspects. Why add conversions between UUID and bytea data types? Wouldn't that require creating dedicated UUID <-> bytea conversion functions? Instead, could we implement encode() as polymorphic to handle UUID type inputs directly? For decode(), we'd need  some way (a parameter?) to specify the UUID output type instead of bytea. Another option would be automatic type casting when inserting bytea data into UUID columns. Neither an extra parameter nor additional type casting seems ideal to me, though I don't have better alternatives. But actually, for a short UUID text encoding to succeed, it's more important that it becomes the single, de facto standard. We should avoid supporting multiple encodings, just as the authors and contributors of RFC 9562 did: https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...  Therefore, whenever possible, encode() and decode() should support just one UUID text encoding, namely base32hex.
Best regards,Sergey Prokhorenko



  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-23 22:45       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-24 04:57         ` Masahiko Sawada <[email protected]>
  0 siblings, 0 replies; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-24 04:57 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

On Thu, Oct 23, 2025 at 3:46 PM Sergey Prokhorenko
<[email protected]> wrote:
>
> > Given that what uuid_to_base32hex() actually does is encoding the
> input UUID,  I find that it could be confusing if we have a similar
> function other than encode() function. Also, we could end up
> introducing as many encoding and decoding functions dedicated for UUID
> as we want to support encoding methods, bloating the functions.
>
> > So as the first step, +1 for supporting base32hex for encode() and
> decode() functions and supporting the UUID <-> bytea conversion. I
> believe it would cover most use cases and the cost of UUID <-> bytea
> conversion is negligible.
>
> > Regards,
>
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
>
>
> Masahiko,
>
> I see you're in favor of base32hex encoding. That's great!
>
> Your arguments make sense, and I generally support enhancing the standard encode() and decode() functions to handle base32hex. It seems like the right approach from a developer experience standpoint.
>
> However, I'm unclear about some implementation aspects. Why add conversions between UUID and bytea data types? Wouldn't that require creating dedicated UUID <-> bytea conversion functions? Instead, could we implement encode() as polymorphic to handle UUID type inputs directly? For decode(), we'd need  some way (a parameter?) to specify the UUID output type instead of bytea. Another option would be automatic type casting when inserting bytea data into UUID columns. Neither an extra parameter nor additional type casting seems ideal to me, though I don't have better alternatives.

While we can implement something like decode(uuid, text), I don't
think we can implement decode() in the way you proposed unless I'm
missing something.

I think the conversion support between UUID and bytea is useful in
general, not limited to encode()/decode() support. And users would be
able to create wrapper functions if they don't want to add casting for
every encode() and decode() calls. For example,

create function uuid_to_base32(uuid) returns text language sql immutable strict
begin atomic
    select encode($1::bytea, 'base32hex');
end;

Since such functions are inlineable, the different between executing
encode(uuid_data::bytea, 'base32hex') and encode(uuid_data,
'base32hex') would only be the conversion; one palloc and one memcpy.

> But actually, for a short UUID text encoding to succeed, it's more important that it becomes the single, de facto standard. We should avoid supporting multiple encodings, just as the authors and contributors of RFC 9562 did: https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...    Therefore, whenever possible, encode() and decode() should support just one UUID text encoding, namely base32hex.

I guess it's ultimately the developer's choice, no? For example, if
they are using multiple databases (or data processing platforms) in
their system and 'hex' is the only encoding that all components can
encode and decode, they might choose 'hex' encoding.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-24 07:16       ` Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  1 sibling, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-24 07:16 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

On Thu, Oct 23, 2025 at 3:46 PM Sergey Prokhorenko
<[email protected]> wrote:
>
> > Given that what uuid_to_base32hex() actually does is encoding the
> input UUID,  I find that it could be confusing if we have a similar
> function other than encode() function. Also, we could end up
> introducing as many encoding and decoding functions dedicated for UUID
> as we want to support encoding methods, bloating the functions.
>
> > So as the first step, +1 for supporting base32hex for encode() and
> decode() functions and supporting the UUID <-> bytea conversion. I
> believe it would cover most use cases and the cost of UUID <-> bytea
> conversion is negligible.
>
> > Regards,
>
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
>
>
> Masahiko,
>
> I see you're in favor of base32hex encoding. That's great!
>
> Your arguments make sense, and I generally support enhancing the standard encode() and decode() functions to handle base32hex. It seems like the right approach from a developer experience standpoint.
>
> However, I'm unclear about some implementation aspects. Why add conversions between UUID and bytea data types? Wouldn't that require creating dedicated UUID <-> bytea conversion functions? Instead, could we implement encode() as polymorphic to handle UUID type inputs directly? For decode(), we'd need  some way (a parameter?) to specify the UUID output type instead of bytea. Another option would be automatic type casting when inserting bytea data into UUID columns. Neither an extra parameter nor additional type casting seems ideal to me, though I don't have better alternatives.

While we can implement something like decode(uuid, text), I don't
think we can implement decode() in the way you proposed unless I'm
missing something.

I think the conversion support between UUID and bytea is useful in
general, not limited to encode()/decode() support. And users would be
able to create wrapper functions if they don't want to add casting for
every encode() and decode() calls. For example,

create function uuid_to_base32(uuid) returns text language sql immutable strict
begin atomic
    select encode($1::bytea, 'base32hex');
end;

Since such functions are inlineable, the different between executing
encode(uuid_data::bytea, 'base32hex') and encode(uuid_data,
'base32hex') would only be the conversion; one palloc and one memcpy.

> But actually, for a short UUID text encoding to succeed, it's more important that it becomes the single, de facto standard. We should avoid supporting multiple encodings, just as the authors and contributors of RFC 9562 did: https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...   Therefore, whenever possible, encode() and decode() should support just one UUID text encoding, namely base32hex.

I guess it's ultimately the developer's choice, no? For example, if
they are using multiple databases (or data processing platforms) in
their system and 'hex' is the only encoding that all components can
encode and decode, they might choose 'hex' encoding.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

____________________________________________


Masahiko,
Developers will still be able to use the long canonical 'hex' UUID format for compatibility. But the short format is not a developer choice, but a convention. We mustn't allow a situation where 25% of systems use base32hex, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted base64. That's a very real nightmare. You, too, have every reason not to want to increase the number of built-in functions in PostgreSQL.
 But here is a solution that I hope will satisfy everyone:

encode('019535d9-3df7-79fb-b466-​fa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJOdecode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3df7-79fb-b466-​fa907fa17f9e
I don't see any real business need for UUID <-> bytea conversions.
Best regards,Sergey Prokhorenko









  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-24 18:23         ` Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-24 18:23 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

On Fri, Oct 24, 2025 at 12:17 AM Sergey Prokhorenko
<[email protected]> wrote:
>
>
> Masahiko,
>
> Developers will still be able to use the long canonical 'hex' UUID format for compatibility. But the short format is not a developer choice, but a convention. We mustn't allow a situation where 25% of systems use base32hex, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted base64. That's a very real nightmare. You, too, have every reason not to want to increase the number of built-in functions in PostgreSQL.
>
> But here is a solution that I hope will satisfy everyone:
>
> encode('019535d9-3df7-79fb-b466-fa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJO

Does it mean the first argument is uuid type data and when
'uuid_to_base32hex' is specified as the format the function requires a
uuid data at the first argument? I could not understand the difference
between specifying 'based32hex' and 'uuid_to_base32hex' when encoding
UUID data with base32hex encoding.

> decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3df7-79fb-b466-fa907fa17f9e

Suppose that the decode() takes text data at the first argument and
returns UUID data, the function signature would be decode(text, text)
-> uuid. But we cannot create two functions that have the same name
and the same argument types.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-24 22:39           ` Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-24 22:39 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers



    On Friday 24 October 2025 at 09:24:15 pm GMT+3, Masahiko Sawada <[email protected]> wrote:  
 
 On Fri, Oct 24, 2025 at 12:17 AM Sergey Prokhorenko
<[email protected]> wrote:
>
>
> Masahiko,
>
> Developers will still be able to use the long canonical 'hex' UUID format for compatibility. But the short format is not a developer choice, but a convention. We mustn't allow a situation where 25% of systems use base32hex, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted base64. That's a very real nightmare. You, too, have every reason not to want to increase the number of built-in functions in PostgreSQL.
>
> But here is a solution that I hope will satisfy everyone:
>
> encode('019535d9-3df7-79fb-b466-fa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJO

> Does it mean the first argument is uuid type data and when
> 'uuid_to_base32hex' is specified as the format the function requires a
> uuid data at the first argument?

Yes, that's right.PostgreSQL will automatically cast the string '019535d9-3df7-79fb-b466-fa907fa17f9e' to the uuid type, since the format is correct.


> I could not understand the difference
> between specifying 'based32hex' and 'uuid_to_base32hex' when encoding
> UUID data with base32hex encoding.


1. Specifying 'based32hex' in encode function means the first parameter is of bytea type as usual. 


But specifying 'uuid_to_base32hex' means the first parameter is of uuid type.


2. The encode function does not yet support format based32hex. Therefore, it is not known whether padding ===== should be used.
But padding ===== is not used when specifying 'uuid_to_base32hex'.


> decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3df7-79fb-b466-fa907fa17f9e

> Suppose that the decode() takes text data at the first argument and
> returns UUID data, the function signature would be decode(text, text)
> -> uuid. But we cannot create two functions that have the same name
> and the same argument types.
Yes, you're right. This is a problem that can't be solved without composite return values. We clearly took the wrong approach by coupling UUID conversion with encode/decode functions, which only apply to bytea. UUID and bytea are fundamentally different data types. Meanwhile, PostgreSQL has over 30 other type conversion functions that deal with other data types. For example, array_to_string, string_to_array, jsonb_to_record, to_char, to_timestamp, and to_hex. In this situation, the best solution would be to revert to the original uuid_to_base32hex() and base32hex_to_uuid() functions rather than deal with type incompatibility issues.


> Regards,

> -- 
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com


  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-24 23:31             ` Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-24 23:31 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; pgsql-hackers

On Fri, Oct 24, 2025 at 3:42 PM Sergey Prokhorenko
<[email protected]> wrote:
>
>
>
> On Friday 24 October 2025 at 09:24:15 pm GMT+3, Masahiko Sawada <[email protected]> wrote:
>
>
> On Fri, Oct 24, 2025 at 12:17 AM Sergey Prokhorenko
> <[email protected]> wrote:
> >
> >
> > Masahiko,
> >
> > Developers will still be able to use the long canonical 'hex' UUID format for compatibility. But the short format is not a developer choice, but a convention. We mustn't allow a situation where 25% of systems use base32hex, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneously sorted base64. That's a very real nightmare. You, too, have every reason not to want to increase the number of built-in functions in PostgreSQL.
> >
> > But here is a solution that I hope will satisfy everyone:
> >
> > encode('019535d9-3df7-79fb-b466-fa907fa17f9e', 'uuid_to_base32hex') -> 06AJBM9TUTSVND36VA87V8BVJO
>
> > Does it mean the first argument is uuid type data and when
> > 'uuid_to_base32hex' is specified as the format the function requires a
> > uuid data at the first argument?
>
> Yes, that's right.
> PostgreSQL will automatically cast the string '019535d9-3df7-79fb-b466-fa907fa17f9e' to the uuid type, since the format is correct.
>
>
> > I could not understand the difference
> > between specifying 'based32hex' and 'uuid_to_base32hex' when encoding
> > UUID data with base32hex encoding.
>
>
> 1. Specifying 'based32hex' in encode function means the first parameter is of bytea type as usual.
>
>
> But specifying 'uuid_to_base32hex' means the first parameter is of uuid type.
>
>
> 2. The encode function does not yet support format based32hex. Therefore, it is not known whether padding ===== should be used.
> But padding ===== is not used when specifying 'uuid_to_base32hex'.
>
>
> > decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3df7-79fb-b466-fa907fa17f9e
>
> > Suppose that the decode() takes text data at the first argument and
> > returns UUID data, the function signature would be decode(text, text)
> > -> uuid. But we cannot create two functions that have the same name
> > and the same argument types.
>
> Yes, you're right. This is a problem that can't be solved without composite return values. We clearly took the wrong approach by coupling UUID conversion with encode/decode functions, which only apply to bytea. UUID and bytea are fundamentally different data types. Meanwhile, PostgreSQL has over 30 other type conversion functions that deal with other data types. For example, array_to_string, string_to_array, jsonb_to_record, to_char, to_timestamp, and to_hex. In this situation, the best solution would be to revert to the original uuid_to_base32hex() and base32hex_to_uuid() functions rather than deal with type incompatibility issues.
>

I think that type conversions and data encodings serve different
purposes. Type conversions express semantic transformations between
data types (e.g., text -> timestamp, jsonb -> record), while encodings
are simply representations of binary data as text. For the latter,
PostgreSQL already provides a well-defined abstraction through
encode()/decode(). Mixing encoding logic with type-specific
conversions would blur that boundary.

Also, if we start adding dedicated functions for each supported
encoding (uuid_to_base32hex, uuid_to_hex etc.), the number of
functions could easily multiply. That’s exactly what encode() and
decode() were designed to avoid.

While I agree that base32hex should be the recommended, I'm really not
sure it's a good design that PostgreSQL core should enforce it as the
only built-in method. It seems better to me to provide flexible
primitives, encode()/decode() plus UUID <-> bytea casts, and document
base32hex as the canonical convention (if necessary). Or providing
'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
text) -> uuid' might make sense too, but I'm not sure. I'd like to
hear opinions from other hackers too.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-25 18:07               ` Andrey Borodin <[email protected]>
  2025-10-25 19:15                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-27 10:23                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  0 siblings, 3 replies; 37+ messages in thread

From: Andrey Borodin @ 2025-10-25 18:07 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; pgsql-hackers



> On 25 Oct 2025, at 04:31, Masahiko Sawada <[email protected]> wrote:
> 
> Or providing
> 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> text) -> uuid' might make sense too, but I'm not sure.

I like the idea, so I drafted a prototype for discussion.
Though I do not see what else methods should be provided along with added one...


Best regards, Andrey Borodin.


Attachments:

  [application/octet-stream] v2-0001-Add-uuid_encode-and-uuid_decode-functions.patch (17.8K, 2-v2-0001-Add-uuid_encode-and-uuid_decode-functions.patch)
  download | inline diff:
From 3fb70e9e6e0a3faa5b0e467ee460aae3223e0bef Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Thu, 9 Oct 2025 18:13:11 +0500
Subject: [PATCH v2] Add uuid_encode() and uuid_decode() functions

Add functions for encoding UUIDs to alternative text formats,
following the encode()/decode() pattern. Initial support includes
base32hex format (RFC 4648) producing 26-character unpadded strings.
Decoding accepts both padded and unpadded input and is case-insensitive.

The extensible design allows easy addition of more formats.
---
 doc/src/sgml/func/func-uuid.sgml   |  75 ++++++++
 src/backend/utils/adt/uuid.c       | 265 +++++++++++++++++++++++++++++
 src/include/catalog/pg_proc.dat    |   6 +
 src/test/regress/expected/uuid.out |  98 +++++++++++
 src/test/regress/sql/uuid.sql      |  34 ++++
 5 files changed, 478 insertions(+)

diff --git a/doc/src/sgml/func/func-uuid.sgml b/doc/src/sgml/func/func-uuid.sgml
index 2638e2bf855..93d11b8341b 100644
--- a/doc/src/sgml/func/func-uuid.sgml
+++ b/doc/src/sgml/func/func-uuid.sgml
@@ -26,6 +26,14 @@
    <primary>uuid_extract_version</primary>
   </indexterm>
 
+  <indexterm>
+   <primary>uuid_to_base32hex</primary>
+  </indexterm>
+
+  <indexterm>
+   <primary>base32hex_to_uuid</primary>
+  </indexterm>
+
   <para>
    <xref linkend="func_uuid_gen_table"/> shows the <productname>PostgreSQL</productname>
    functions that can be used to generate UUIDs.
@@ -167,6 +175,73 @@
    </tgroup>
   </table>
 
+  <para>
+   <xref linkend="func_uuid_conversion_table"/> shows the <productname>PostgreSQL</productname>
+   functions that can be used to convert UUIDs to and from alternative text representations.
+  </para>
+
+  <table id="func_uuid_conversion_table">
+   <title><acronym>UUID</acronym> Conversion Functions</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+        Function
+       </para>
+       <para>
+        Description
+       </para>
+       <para>
+        Example(s)
+       </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+        <function>uuid_encode</function>
+        ( <parameter>value</parameter> <type>uuid</type>,
+        <parameter>format</parameter> <type>text</type> )
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Converts a UUID to an encoded text string using the specified format.
+        Currently supported formats:
+        <literal>base32hex</literal> - Encodes the UUID as a 26-character
+        base32hex string (0-9, A-V) without padding, as defined in
+        <ulink url="https://datatracker.ietf.org/doc/html/rfc4648">RFC 4648</ulink>.
+        This produces a more compact representation than the standard UUID format
+        and is case-insensitive, making it suitable for URLs and identifiers.
+       </para>
+       <para>
+        <literal>uuid_encode('123e4567-e89b-12d3-a456-&zwsp;426614174000'::uuid, 'base32hex')</literal>
+        <returnvalue>28V4APV8JC9D792M89J185Q000</returnvalue>
+       </para></entry>
+     </row>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+        <function>uuid_decode</function>
+        ( <parameter>string</parameter> <type>text</type>,
+        <parameter>format</parameter> <type>text</type> )
+        <returnvalue>uuid</returnvalue>
+       </para>
+       <para>
+        Converts an encoded text string to a UUID using the specified format.
+        For <literal>base32hex</literal> format, the input can be either
+        26 characters (unpadded) or 32 characters (padded with <literal>=</literal>),
+        and must contain only valid base32hex characters
+        (0-9, A-V, case-insensitive).
+       </para>
+       <para>
+        <literal>uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex')</literal>
+        <returnvalue>123e4567-e89b-12d3-a456-426614174000</returnvalue>
+       </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
   <para>
    <productname>PostgreSQL</productname> also provides the usual comparison
    operators shown in <xref linkend="functions-comparison-op-table"/> for
diff --git a/src/backend/utils/adt/uuid.c b/src/backend/utils/adt/uuid.c
index e5f27ff892b..f4e34d6f795 100644
--- a/src/backend/utils/adt/uuid.c
+++ b/src/backend/utils/adt/uuid.c
@@ -20,6 +20,7 @@
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "port/pg_bswap.h"
+#include "utils/builtins.h"
 #include "utils/fmgrprotos.h"
 #include "utils/guc.h"
 #include "utils/skipsupport.h"
@@ -777,3 +778,267 @@ uuid_extract_version(PG_FUNCTION_ARGS)
 
 	PG_RETURN_UINT16(version);
 }
+
+/*
+ * UUID encoding conversion API.
+ */
+struct uuid_encoding
+{
+	uint64		(*encode_len) (void);
+	uint64		(*decode_len) (void);
+	uint64		(*encode) (const pg_uuid_t *uuid, char *res);
+	uint64		(*decode) (const char *data, size_t dlen, pg_uuid_t *res);
+};
+
+static const struct uuid_encoding *uuid_find_encoding(const char *name);
+
+/*
+ * BASE32HEX encoding for UUID
+ *
+ * Base32hex encoding uses 32 characters (0-9, A-V) and represents 5 bits per
+ * character. For a 128-bit UUID (16 bytes), we need 26 characters
+ * (128 bits / 5 bits per char = 25.6, rounded up to 26).
+ * As defined in RFC 4648.
+ */
+static const char base32hex_chars[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";
+
+static uint64
+base32hex_encode_len(void)
+{
+	/* 128 bits / 5 bits per character = 25.6, rounded up to 26 */
+	return 26;
+}
+
+static uint64
+base32hex_decode_len(void)
+{
+	/* Always 16 bytes for UUID */
+	return UUID_LEN;
+}
+
+static uint64
+base32hex_encode(const pg_uuid_t *uuid, char *res)
+{
+	int			i;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	int			output_pos = 0;
+
+	for (i = 0; i < UUID_LEN; i++)
+	{
+		/* Add 8 bits to the buffer */
+		bits_buffer = (bits_buffer << 8) | uuid->data[i];
+		bits_in_buffer += 8;
+
+		/* Extract 5-bit chunks while we have enough bits */
+		while (bits_in_buffer >= 5)
+		{
+			bits_in_buffer -= 5;
+			/* Extract top 5 bits */
+			res[output_pos++] = base32hex_chars[(bits_buffer >> bits_in_buffer) & 0x1F];
+			/* Clear the extracted bits by masking */
+			bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+		}
+	}
+
+	/* Handle remaining bits (128 % 5 = 3, so we have 3 bits left) */
+	if (bits_in_buffer > 0)
+	{
+		res[output_pos++] = base32hex_chars[(bits_buffer << (5 - bits_in_buffer)) & 0x1F];
+	}
+
+	return output_pos;
+}
+
+static uint64
+base32hex_decode(const char *src, size_t srclen, pg_uuid_t *uuid)
+{
+	int			i;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	int			output_pos = 0;
+	size_t		decode_len = srclen;
+
+	/*
+	 * Accept both unpadded (26 chars) and padded (32 chars) input.
+	 * RFC 4648 specifies padding to make length a multiple of 8.
+	 */
+	if (srclen == 32)
+	{
+		/* Verify padding: should be exactly 6 '=' characters at the end */
+		for (i = 26; i < 32; i++)
+		{
+			if (src[i] != '=')
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("invalid base32hex padding for UUID"),
+						 errdetail("Expected '=' padding characters at position %d.", i)));
+		}
+		decode_len = 26;	/* Only decode the first 26 characters */
+	}
+	else if (srclen != 26)
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid base32hex length for UUID"),
+				 errdetail("Expected 26 or 32 characters, got %zu.", srclen)));
+	}
+
+	for (i = 0; i < (int) decode_len; i++)
+	{
+		unsigned char c = src[i];
+		int			val;
+
+		/* Decode base32hex character (0-9, A-V, case-insensitive) */
+		if (c >= '0' && c <= '9')
+			val = c - '0';
+		else if (c >= 'A' && c <= 'V')
+			val = c - 'A' + 10;
+		else if (c >= 'a' && c <= 'v')
+			val = c - 'a' + 10;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("invalid base32hex digit: \"%c\"", c)));
+
+		/* Add 5 bits to buffer */
+		bits_buffer = (bits_buffer << 5) | val;
+		bits_in_buffer += 5;
+
+		/* Extract 8-bit bytes when we have enough bits */
+		while (bits_in_buffer >= 8)
+		{
+			bits_in_buffer -= 8;
+			if (output_pos < UUID_LEN)
+			{
+				uuid->data[output_pos++] = (unsigned char) (bits_buffer >> bits_in_buffer);
+				/* Clear the extracted bits */
+				bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+			}
+		}
+	}
+
+	/* Verify we got exactly 16 bytes */
+	if (output_pos != UUID_LEN)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid base32hex data for UUID"),
+				 errdetail("Decoded to %d bytes instead of %d.", output_pos, UUID_LEN)));
+
+	/* Verify no extra bits remain (should be exactly 2 padding bits, all zeros) */
+	if (bits_in_buffer != 2 || (bits_buffer & ((1ULL << bits_in_buffer) - 1)) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid base32hex padding for UUID")));
+
+	return UUID_LEN;
+}
+
+/*
+ * Encoding lookup table
+ */
+static const struct
+{
+	const char *name;
+	struct uuid_encoding enc;
+}			uuid_enclist[] =
+{
+	{
+		"base32hex",
+		{
+			base32hex_encode_len, base32hex_decode_len, base32hex_encode, base32hex_decode
+		}
+	},
+	{
+		NULL,
+		{
+			NULL, NULL, NULL, NULL
+		}
+	}
+};
+
+static const struct uuid_encoding *
+uuid_find_encoding(const char *name)
+{
+	int			i;
+
+	for (i = 0; uuid_enclist[i].name; i++)
+		if (pg_strcasecmp(uuid_enclist[i].name, name) == 0)
+			return &uuid_enclist[i].enc;
+
+	return NULL;
+}
+
+/*
+ * uuid_encode - encode UUID to text using specified format
+ */
+Datum
+uuid_encode(PG_FUNCTION_ARGS)
+{
+	pg_uuid_t  *uuid = PG_GETARG_UUID_P(0);
+	Datum		name = PG_GETARG_DATUM(1);
+	text	   *result;
+	char	   *namebuf;
+	uint64		resultlen;
+	uint64		res;
+	const struct uuid_encoding *enc;
+
+	namebuf = TextDatumGetCString(name);
+
+	enc = uuid_find_encoding(namebuf);
+	if (enc == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("unrecognized encoding: \"%s\"", namebuf)));
+
+	resultlen = enc->encode_len();
+
+	result = (text *) palloc(VARHDRSZ + resultlen);
+
+	res = enc->encode(uuid, VARDATA(result));
+
+	/* Make this FATAL 'cause we've trodden on memory ... */
+	if (res > resultlen)
+		elog(FATAL, "overflow - encode estimate too small");
+
+	SET_VARSIZE(result, VARHDRSZ + res);
+
+	PG_RETURN_TEXT_P(result);
+}
+
+/*
+ * uuid_decode - decode text to UUID using specified format
+ */
+Datum
+uuid_decode(PG_FUNCTION_ARGS)
+{
+	text	   *data = PG_GETARG_TEXT_PP(0);
+	Datum		name = PG_GETARG_DATUM(1);
+	pg_uuid_t  *result;
+	char	   *namebuf;
+	char	   *dataptr;
+	size_t		datalen;
+	uint64		res;
+	const struct uuid_encoding *enc;
+
+	namebuf = TextDatumGetCString(name);
+
+	enc = uuid_find_encoding(namebuf);
+	if (enc == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("unrecognized encoding: \"%s\"", namebuf)));
+
+	dataptr = VARDATA_ANY(data);
+	datalen = VARSIZE_ANY_EXHDR(data);
+
+	result = (pg_uuid_t *) palloc(sizeof(pg_uuid_t));
+
+	res = enc->decode(dataptr, datalen, result);
+
+	/* Make this FATAL 'cause we've trodden on memory ... */
+	if (res > UUID_LEN)
+		elog(FATAL, "overflow - decode estimate too small");
+
+	PG_RETURN_UUID_P(result);
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b51d2b17379..9e9d15437d2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -9549,6 +9549,12 @@
 { oid => '6343', descr => 'extract version from RFC 9562 UUID',
   proname => 'uuid_extract_version', proleakproof => 't', prorettype => 'int2',
   proargtypes => 'uuid', prosrc => 'uuid_extract_version' },
+{ oid => '8083', descr => 'encode UUID to text using specified format',
+  proname => 'uuid_encode', prorettype => 'text',
+  proargtypes => 'uuid text', prosrc => 'uuid_encode' },
+{ oid => '8084', descr => 'decode text to UUID using specified format',
+  proname => 'uuid_decode', prorettype => 'uuid',
+  proargtypes => 'text text', prosrc => 'uuid_decode' },
 
 # pg_lsn
 { oid => '3229', descr => 'I/O',
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index 95392003b86..8c9ecae44c3 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -305,5 +305,103 @@ SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
  
 (1 row)
 
+-- UUID encoding/decoding functions
+-- test uuid_encode with base32hex
+SELECT uuid_encode('00000000-0000-0000-0000-000000000000'::uuid, 'base32hex');
+        uuid_encode         
+----------------------------
+ 00000000000000000000000000
+(1 row)
+
+SELECT uuid_encode('11111111-1111-1111-1111-111111111111'::uuid, 'base32hex');
+        uuid_encode         
+----------------------------
+ 248H248H248H248H248H248H24
+(1 row)
+
+SELECT uuid_encode('ffffffff-ffff-ffff-ffff-ffffffffffff'::uuid, 'base32hex');
+        uuid_encode         
+----------------------------
+ VVVVVVVVVVVVVVVVVVVVVVVVVS
+(1 row)
+
+SELECT uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'base32hex');
+        uuid_encode         
+----------------------------
+ 28V4APV8JC9D792M89J185Q000
+(1 row)
+
+-- test uuid_decode with base32hex
+SELECT uuid_decode('00000000000000000000000000', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test round-trip conversions
+SELECT uuid_decode(uuid_encode('00000000-0000-0000-0000-000000000000'::uuid, 'base32hex'), 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+SELECT uuid_encode(uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex'), 'base32hex');
+        uuid_encode         
+----------------------------
+ 28V4APV8JC9D792M89J185Q000
+(1 row)
+
+SELECT uuid_decode(uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'base32hex'), 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test case insensitivity
+SELECT uuid_decode('28v4apv8jc9d792m89j185q000', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test RFC 4648 padding (32 chars with 6 '=' signs)
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000======', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+SELECT uuid_decode('00000000000000000000000000======', 'base32hex');
+             uuid_decode              
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+-- test error cases for base32hex
+SELECT uuid_decode('short', 'base32hex');  -- too short
+ERROR:  invalid base32hex length for UUID
+DETAIL:  Expected 26 or 32 characters, got 5.
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000X', 'base32hex');  -- too long
+ERROR:  invalid base32hex length for UUID
+DETAIL:  Expected 26 or 32 characters, got 27.
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q00W', 'base32hex');  -- invalid character (W)
+ERROR:  invalid base32hex digit: "W"
+-- test invalid encoding
+SELECT uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'invalid');
+ERROR:  unrecognized encoding: "invalid"
+SELECT uuid_decode('00000000000000000000000000', 'invalid');
+ERROR:  unrecognized encoding: "invalid"
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
diff --git a/src/test/regress/sql/uuid.sql b/src/test/regress/sql/uuid.sql
index 465153a0341..b0385a07672 100644
--- a/src/test/regress/sql/uuid.sql
+++ b/src/test/regress/sql/uuid.sql
@@ -146,6 +146,40 @@ SELECT uuid_extract_timestamp('017F22E2-79B0-7CC3-98C4-DC0C0C07398F') = 'Tuesday
 SELECT uuid_extract_timestamp(gen_random_uuid());  -- null
 SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
 
+-- UUID encoding/decoding functions
+
+-- test uuid_encode with base32hex
+SELECT uuid_encode('00000000-0000-0000-0000-000000000000'::uuid, 'base32hex');
+SELECT uuid_encode('11111111-1111-1111-1111-111111111111'::uuid, 'base32hex');
+SELECT uuid_encode('ffffffff-ffff-ffff-ffff-ffffffffffff'::uuid, 'base32hex');
+SELECT uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'base32hex');
+
+-- test uuid_decode with base32hex
+SELECT uuid_decode('00000000000000000000000000', 'base32hex');
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex');
+
+-- test round-trip conversions
+SELECT uuid_decode(uuid_encode('00000000-0000-0000-0000-000000000000'::uuid, 'base32hex'), 'base32hex');
+SELECT uuid_encode(uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex'), 'base32hex');
+SELECT uuid_decode(uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'base32hex'), 'base32hex');
+
+-- test case insensitivity
+SELECT uuid_decode('28v4apv8jc9d792m89j185q000', 'base32hex');
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000', 'base32hex');
+
+-- test RFC 4648 padding (32 chars with 6 '=' signs)
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000======', 'base32hex');
+SELECT uuid_decode('00000000000000000000000000======', 'base32hex');
+
+-- test error cases for base32hex
+SELECT uuid_decode('short', 'base32hex');  -- too short
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q000X', 'base32hex');  -- too long
+SELECT uuid_decode('28V4APV8JC9D792M89J185Q00W', 'base32hex');  -- invalid character (W)
+
+-- test invalid encoding
+SELECT uuid_encode('123e4567-e89b-12d3-a456-426614174000'::uuid, 'invalid');
+SELECT uuid_decode('00000000000000000000000000', 'invalid');
+
 
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.39.5 (Apple Git-154)



^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-25 19:15                 ` Sergey Prokhorenko <[email protected]>
  2 siblings, 0 replies; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-25 19:15 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; Andrey Borodin <[email protected]>; +Cc: pgsql-hackers

> On 25 Oct 2025, at 04:31, Masahiko Sawada <[email protected]> wrote:> 
> Or providing
> 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> text) -> uuid' might make sense too, but I'm not sure.
> On Saturday 25 October 2025 at 09:07:39 pm GMT+3, Andrey Borodin <[email protected]> wrote:
> I like the idea, so I drafted a prototype for discussion.
> Though I do not see what else methods should be provided along with added one...


> Best regards, Andrey Borodin.

If base32hex becomes the default string representation for UUIDs in PostgreSQL, then the canonical UUID string representation may be added into these functions for backward compatibility.
Best regards,
Sergey Prokhorenko







  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-27 10:23                 ` Aleksander Alekseev <[email protected]>
  2025-10-27 14:03                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2 siblings, 1 reply; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-27 10:23 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; Sergey Prokhorenko <[email protected]>

Hi,

> > Or providing
> > 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> > text) -> uuid' might make sense too, but I'm not sure.
>
> I like the idea, so I drafted a prototype for discussion.
> Though I do not see what else methods should be provided along with added one...

I see no reason why we should forbid the use of base32 encoding with
bytea. Or have different functions for this e.g. uuid_encode() and
encode(). To me it looks like a poor API design.

-- 
Best regards,
Aleksander Alekseev





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 10:23                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-27 14:03                   ` Sergey Prokhorenko <[email protected]>
  2025-10-27 14:37                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-27 14:03 UTC (permalink / raw)
  To: pgsql-hackers; Aleksander Alekseev <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>

Hi,
> > Or providing
> > 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> > text) -> uuid' might make sense too, but I'm not sure.
>
> I like the idea, so I drafted a prototype for discussion.
> Though I do not see what else methods should be provided along with added one...

I see no reason why we should forbid the use of base32 encoding with
bytea. Or have different functions for this e.g. uuid_encode() and
encode(). To me it looks like a poor API design.

-- 
Best regards,
Aleksander Alekseev

____________________________________________

It seems that bytea is your personal interest, since you continue to impose your bytea when a better solution has already been found with uuid_encode() and uuid_decode().


The bytea proposal has a lot of drawbacks:

1. It requires unnecessary casting in addition to encoding/decoding. This complicates the interface and creates unnecessary cognitive load on developers. It also creates additional CPU load, although perhaps only a small amount.

2. The encoding function encourages developers to use the slightly more compact base64 encoding (see https://www.postgresql.org/docs/current/functions-binarystring.html), which doesn't preserve sort order, isn't URL-safe, is case-sensitive, and requires specifying the case of letters when dictating. This also creates a serious problem of incompatibility between UUID encodings.

3. The hashing functions used by bytea create a temptation to implement popular, idiotic ideas for hashing UUIDs to obscure their creation date and to hide internal keys from clients.

4. Other various functions for bytea allow the construction of Frankenstein identifiers that compete with UUIDv7, which could negatively impact the reputation of UUIDs.

The bytea type has nothing in common with the uuid type other than the binary encoding. Therefore, the bytea <-> uuid cast can only encourage abuse and errors, creating the illusion of unlimited developer power.


The bytea proposal has no merit whatsoever. It's the worst, most insafe, and most harmful design, undermining efforts to widely adopt UUIDv7 and improve PostgreSQL.



  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 10:23                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-27 14:03                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-27 14:37                     ` Aleksander Alekseev <[email protected]>
  2025-10-27 15:33                       ` Отв.: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-27 14:37 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>

Hi Sergey,

> It seems that bytea is your personal interest, since you continue to impose your bytea when a better solution has already been found with uuid_encode() and uuid_decode().

In the previous messages Masahiko Sawada wrote:

> Or providing
> 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> text) -> uuid' might make sense too, but I'm not sure. I'd like to
> hear opinions from other hackers too.

I merely shared my personal opinion on why I think this is a bad idea.
Let's see what other people think.

-- 
Best regards,
Aleksander Alekseev





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Отв.: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 10:23                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-27 14:03                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-27 14:37                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-27 15:33                       ` Sergey Prokhorenko <[email protected]>
  0 siblings, 0 replies; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-27 15:33 UTC (permalink / raw)
  To: Aleksander Alekseev <[email protected]>; pgsql-hackers; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>



Hi Sergey,

> It seems that bytea is your personal interest, since you continue to impose your bytea when a better solution has already been found with uuid_encode() and uuid_decode().

In the previous messages Masahiko Sawada wrote:

> Or providing
> 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> text) -> uuid' might make sense too, but I'm not sure. I'd like to
> hear opinions from other hackers too.

I merely shared my personal opinion on why I think this is a bad idea.
Let's see what other people think.

-- 
Best regards,
Aleksander Alekseev



_____________________
You didn't give any arguments in favor of your opinion





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-27 20:09                 ` Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2 siblings, 1 reply; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-27 20:09 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; pgsql-hackers

On Sat, Oct 25, 2025 at 11:07 AM Andrey Borodin <[email protected]> wrote:
>
>
>
> > On 25 Oct 2025, at 04:31, Masahiko Sawada <[email protected]> wrote:
> >
> > Or providing
> > 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> > text) -> uuid' might make sense too, but I'm not sure.
>
> I like the idea, so I drafted a prototype for discussion.
> Though I do not see what else methods should be provided along with added one...

Thank you for drafting the patch! But I find it potentially confusing
to have different encoding methods for bytea and UUID types. I don't
see a compelling reason why the core should support base32hex
exclusively for the UUID data type, nor why base32hex should be the
only encoding method that the core provides for UUIDs (while we can
use it by default).

If we implement uuid_encode() and uuid_decode(), we might end up
creating similar encoding and decoding functions for other data types
as well, which doesn't seem like the best approach. I still believe
that extending the existing encode() and decode() functions is a
better starting point.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-27 22:36                   ` Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-27 22:36 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; +Cc: pgsql-hackers


 
 
 On Sat, Oct 25, 2025 at 11:07 AM Andrey Borodin <[email protected]> wrote:
>
>
>
> > On 25 Oct 2025, at 04:31, Masahiko Sawada <[email protected]> wrote:
> >
> > Or providing
> > 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format
> > text) -> uuid' might make sense too, but I'm not sure.
>
> I like the idea, so I drafted a prototype for discussion.
> Though I do not see what else methods should be provided along with added one...

Thank you for drafting the patch! But I find it potentially confusing
to have different encoding methods for bytea and UUID types. I don't
see a compelling reason why the core should support base32hex
exclusively for the UUID data type, nor why base32hex should be the
only encoding method that the core provides for UUIDs (while we can
use it by default).

If we implement uuid_encode() and uuid_decode(), we might end up
creating similar encoding and decoding functions for other data types
as well, which doesn't seem like the best approach. I still believe
that extending the existing encode() and decode() functions is a
better starting point.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
________________________________________________
 


Masahiko,
 I wanted to highlight an important discussion among the authors and contributors of RFC 9562 regarding UUID text encoding:
https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...
 The RFC 9562 authors and contributors reached consensus that standardizing an alternate short text format for UUIDs is important. While the community debated between base32hex (RFC 4648) and Crockford's Base32, both were recognized for preserving lexicographical sort order, a critical property for database primary keys and URL-safe identifiers. Time constraints prevented inclusion in RFC 9562, but the discussion established that base32hex is the existing standard format already defined in RFC 4648, Section 7, specifically designed for sort-preserving encoding.

This context is crucial because it underscores that the uuid type, as a first-class concept, deserves its own standardized text encoding.

Regarding the proposal to couple UUID encoding with the bytea type through encode()/decode() functions: I understand the appeal of reusing existing infrastructure, but this creates a conceptual mismatch. UUID is a distinct semantic type in PostgreSQL, not merely binary data. The bytea type has existed for decades without base32hex encoding, and that's worked fine, because bytea represents arbitrary binary data, not universally unique identifiers with specific structural properties and needs.
Consider PostgreSQL's own design philosophy. The documentation states:
"9.5. Binary String Functions and Operators  This section describes functions and operators for examining and manipulating binary strings, that is values of type bytea. Many of these are equivalent, in purpose and syntax, to the text-string functions described in the previous section."
 PostgreSQL maintains parallel function sets for text strings and bytea precisely because they serve different purposes, despite the implementation overhead. The uuid type deserves the same treatment: it's not just another binary blob, but a type with specific semantics (uniqueness, version bits, variant encoding) and use cases (distributed identifiers, sortable keys, URL-safe representations).
Why should uuid be treated as a second-class citizen and forced through bytea conversion, when text and bytea each have their own dedicated function families?
You've been very careful in your previous arguments to separate data type conversion from encoding/decoding operations. I appreciate that rigor. However, the current proposal to route UUID encoding through bytea contradicts that principle. It merges two fundamentally different data types for convenience rather than correctness.
 If someone wants to add base32hex encoding/decoding to bytea for general binary data operations, that's a worthwhile but separate discussion. The uuid type, however, needs native base32hex support to fulfill its role as a first-class PostgreSQL type with a standardized compact text representation, as recommended by the RFC 9562 community.

I would value your thoughts on these arguments.

Best regards,
Sergey Prokhorenko




















  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-28 09:30                     ` Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 14:28                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 21:56                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  0 siblings, 3 replies; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-28 09:30 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

First of all, I'm definitely a proponent of being able to encode UUIDs
using base32hex in Postgres.

On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko
<[email protected]> wrote:
> I wanted to highlight an important discussion among the authors and contributors of RFC 9562 regarding UUID text encoding:
>
> https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...

I think a very important thing to note here is that this is a github
discussion, not an officially accepted RFC. I think if it was an
officially accepted RFC on how to encode UUIDs then you would have a
lot less pushback here. Right now your emails mostly read like you
want to push your preferential format, while essentially disallowing
other encodings. While base32hex seems like a good choice for UUIDv7 I
see no reason to give it preferential treatment at this point in time.
crockford base32 seems just as valid. And e.g. base64url[1] seems
totally fine for UUID versions that have no inherent ordering like
UUIDv4. And if someone comes up with a base64urlhex format you could
have even shorter bit still sortable UUIDs at the expense of
legibility.

The main reason why a specific encoding should receive preferential
treatment in Postgres, would be if it was standardized, as that would
help with interoperability. At this point in time there's no such
standard (not even a draft), so forcing an explicit encoding will
actually reduce interoperability, because people already encode their
UUIDs in various different forms.

> but the discussion established that base32hex is the existing standard format already defined in RFC 4648, Section 7, specifically designed for sort-preserving encoding.

You even reach a similar conclusion here: not choosing crockford
base32, purely because it does not have an official RFC.

> This context is crucial because it underscores that the uuid type, as a first-class concept, deserves its own standardized text encoding.

It already has! The standard text encoding is defined in RFC 4122.
That's why postgres displays it as such when encoding to text.

> Regarding the proposal to couple UUID encoding with the bytea type through encode()/decode() functions: I understand the appeal of reusing existing infrastructure, but this creates a conceptual mismatch. UUID is a distinct semantic type in PostgreSQL, not merely binary data. The bytea type has existed for decades without base32hex encoding, and that's worked fine, because bytea represents arbitrary binary data, not universally unique identifiers with specific structural properties and needs.

I think by far the first step is to make the encoding of UUIDs in
different formats possible in Postgres. The way to do so with the
least API impact (and thus as you noticed, least pushback), would be
to add base32hex to the list of encoding formats in the encode/decode
functions. Then combining that with UUID <-> bytea casting (which also
seems totally reasonable functionality to me), would give you the
functionality (but not the defaults you want).

In a follow up patch I would personally be fine making the API to
encode UUIDs a bit more friendly. In particular, adding an overload to
the encode function that takes a UUID instead of a bytea seems
reasonable to me, i.e. encode(id uuid, format text) -> text

I'm currently less convinced about a decode_uuid function though. I
think some perf argument (including some benchmarks) would need to be
made to convince me of its usefulness. Because purely from an API
friendliness lens, I feel like decode('...', 'base32hex)::uuid and
decode_uuid('...', 'base32hex') rank basically the same.

Once/if an accepted RFC actually defines a default shorter encoding
for UUIDs we could I would definitely be in favor of adding a
decode_uuid function with the default encoding configured as a default
argument. As well as adding the default argument to the uuid encode
overload function.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
@ 2025-10-28 11:53                       ` Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 12:24                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2 siblings, 2 replies; 37+ messages in thread

From: Dagfinn Ilmari Mannsåker @ 2025-10-28 11:53 UTC (permalink / raw)
  To: Jelte Fennema-Nio <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

Jelte Fennema-Nio <[email protected]> writes:

> First of all, I'm definitely a proponent of being able to encode UUIDs
> using base32hex in Postgres.
>
> On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko
> <[email protected]> wrote:
>>
>> Regarding the proposal to couple UUID encoding with the bytea type
>> through encode()/decode() functions: I understand the appeal of
>> reusing existing infrastructure, but this creates a conceptual
>> mismatch. UUID is a distinct semantic type in PostgreSQL, not merely
>> binary data. The bytea type has existed for decades without base32hex
>> encoding, and that's worked fine, because bytea represents arbitrary
>> binary data, not universally unique identifiers with specific
>> structural properties and needs.
>
> I think by far the first step is to make the encoding of UUIDs in
> different formats possible in Postgres. The way to do so with the
> least API impact (and thus as you noticed, least pushback), would be
> to add base32hex to the list of encoding formats in the encode/decode
> functions. Then combining that with UUID <-> bytea casting (which also
> seems totally reasonable functionality to me), would give you the
> functionality (but not the defaults you want).

+1 for adding casts.

> In a follow up patch I would personally be fine making the API to
> encode UUIDs a bit more friendly. In particular, adding an overload to
> the encode function that takes a UUID instead of a bytea seems
> reasonable to me, i.e. encode(id uuid, format text) -> text

Yeah, this works for encode, but not decode, since the argment types are
the same (text, text) in both cases.

> I'm currently less convinced about a decode_uuid function though. I
> think some perf argument (including some benchmarks) would need to be
> made to convince me of its usefulness. Because purely from an API
> friendliness lens, I feel like decode('...', 'base32hex)::uuid and
> decode_uuid('...', 'base32hex') rank basically the same.

Agreed.

> Once/if an accepted RFC actually defines a default shorter encoding
> for UUIDs we could I would definitely be in favor of adding a
> decode_uuid function with the default encoding configured as a default
> argument. As well as adding the default argument to the uuid encode
> overload function.

If it's just one new form, do we need a separate decode function?  Could
we not just make uuid_in() accept both forms (they're easily
distinguishable by length), like bytea_in accepts both the old escape
format and the new hex format?

And if the new format becomes the standard and want to change the
default output format, we would need a GUC like bytea_output anyway, to
let users control when to make the change.

- ilmari





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
@ 2025-10-28 12:24                         ` Jelte Fennema-Nio <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-28 12:24 UTC (permalink / raw)
  To: Dagfinn Ilmari Mannsåker <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

On Tue, 28 Oct 2025 at 12:53, Dagfinn Ilmari Mannsåker
<[email protected]> wrote:
> If it's just one new form, do we need a separate decode function?  Could
> we not just make uuid_in() accept both forms (they're easily
> distinguishable by length), like bytea_in accepts both the old escape
> format and the new hex format?
>
> And if the new format becomes the standard and want to change the
> default output format, we would need a GUC like bytea_output anyway, to
> let users control when to make the change.

Agreed to both of those. This seems too far out to spend much time
discussing now.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
@ 2025-10-28 16:41                         ` Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 18:44                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-28 23:35                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  1 sibling, 2 replies; 37+ messages in thread

From: Dagfinn Ilmari Mannsåker @ 2025-10-28 16:41 UTC (permalink / raw)
  To: Jelte Fennema-Nio <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

Dagfinn Ilmari Mannsåker <[email protected]> writes:

> Jelte Fennema-Nio <[email protected]> writes:
>
>> Then combining that with UUID <-> bytea casting (which also
>> seems totally reasonable functionality to me), would give you the
>> functionality (but not the defaults you want).
>
> +1 for adding casts.

Here's a patch for that.  I'm not 100% confident about the error code
for invalid length, but that was the closest one I could find in
errcodes.txt.

I might do the base32hex encode/decode later, unless someone else beats
me to it.

- ilmari



^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
@ 2025-10-28 18:44                           ` Masahiko Sawada <[email protected]>
  2025-10-29 12:19                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  1 sibling, 1 reply; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-28 18:44 UTC (permalink / raw)
  To: Dagfinn Ilmari Mannsåker <[email protected]>; +Cc: Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; pgsql-hackers

On Tue, Oct 28, 2025 at 9:41 AM Dagfinn Ilmari Mannsåker
<[email protected]> wrote:
>
> Dagfinn Ilmari Mannsåker <[email protected]> writes:
>
> > Jelte Fennema-Nio <[email protected]> writes:
> >
> >> Then combining that with UUID <-> bytea casting (which also
> >> seems totally reasonable functionality to me), would give you the
> >> functionality (but not the defaults you want).
> >
> > +1 for adding casts.
>
> Here's a patch for that.  I'm not 100% confident about the error code
> for invalid length, but that was the closest one I could find in
> errcodes.txt.

Thank you for the patch. I'll review it.

> I might do the base32hex encode/decode later, unless someone else beats
> me to it.

Andrey has shared his patch for base32hex support before[1]. While it
needs to be updated, it seems to implement sufficient function.

Regards,

[1] https://www.postgresql.org/message-id/6F76FA61-E2DC-44EF-9504-889D9BDB4EBD%40yandex-team.ru

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 18:44                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
@ 2025-10-29 12:19                             ` Andrey Borodin <[email protected]>
  2025-10-30 19:10                               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-11-20 05:46                               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Илья Чердаков <[email protected]>
  0 siblings, 2 replies; 37+ messages in thread

From: Andrey Borodin @ 2025-10-29 12:19 UTC (permalink / raw)
  To: Masahiko Sawada <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; pgsql-hackers



> On 28 Oct 2025, at 22:44, Masahiko Sawada <[email protected]> wrote:
> 
> Andrey has shared his patch for base32hex support before[1]. While it
> needs to be updated, it seems to implement sufficient function.

I'd propose something like attached patch. It's on top of Ilmari's v2 patch with small suggestions as a step 2.

Thanks!


Best regards, Andrey Borodin.



Attachments:

  [application/octet-stream] v3-0001-Allow-explicit-casting-between-bytea-and-UUID.patch (4.6K, 2-v3-0001-Allow-explicit-casting-between-bytea-and-UUID.patch)
  download | inline diff:
From cf475ec8632720a629afaab264106a952c625b80 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <[email protected]>
Date: Tue, 28 Oct 2025 16:33:17 +0000
Subject: [PATCH v3 1/3] Allow explicit casting between bytea and UUID

This enables using encode() and decode() to convert UUIDs to and from
alternative formats, such as base64.
---
 src/backend/utils/adt/bytea.c      | 26 ++++++++++++++++++++++++++
 src/include/catalog/pg_cast.dat    |  6 ++++++
 src/include/catalog/pg_proc.dat    |  7 +++++++
 src/test/regress/expected/uuid.out | 15 +++++++++++++++
 src/test/regress/sql/uuid.sql      |  4 ++++
 5 files changed, 58 insertions(+)

diff --git a/src/backend/utils/adt/bytea.c b/src/backend/utils/adt/bytea.c
index 6e7b914c563..ff9e46f3015 100644
--- a/src/backend/utils/adt/bytea.c
+++ b/src/backend/utils/adt/bytea.c
@@ -26,6 +26,7 @@
 #include "utils/fmgrprotos.h"
 #include "utils/memutils.h"
 #include "utils/sortsupport.h"
+#include "utils/uuid.h"
 #include "utils/varlena.h"
 #include "varatt.h"
 
@@ -1112,3 +1113,28 @@ int8_bytea(PG_FUNCTION_ARGS)
 {
 	return int8send(fcinfo);
 }
+
+/* Cast bytea -> uuid */
+Datum
+bytea_uuid(PG_FUNCTION_ARGS)
+{
+	bytea	   *v = PG_GETARG_BYTEA_PP(0);
+	int			len = VARSIZE_ANY_EXHDR(v);
+	pg_uuid_t  *uuid;
+
+	if (len != UUID_LEN)
+		ereport(ERROR,
+				errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+				errmsg("invalid uuid length"));
+
+	uuid = (pg_uuid_t *) palloc(UUID_LEN);
+	memcpy(uuid->data, VARDATA_ANY(v), UUID_LEN);
+	PG_RETURN_POINTER(uuid);
+}
+
+/* Cast uuid -> bytea; can just use uuid_send() */
+Datum
+uuid_bytea(PG_FUNCTION_ARGS)
+{
+	return uuid_send(fcinfo);
+}
diff --git a/src/include/catalog/pg_cast.dat b/src/include/catalog/pg_cast.dat
index fbfd669587f..913c55ef869 100644
--- a/src/include/catalog/pg_cast.dat
+++ b/src/include/catalog/pg_cast.dat
@@ -348,6 +348,12 @@
 { castsource => 'bytea', casttarget => 'int8', castfunc => 'int8(bytea)',
   castcontext => 'e', castmethod => 'f' },
 
+# Allow explicit coercions between bytea and uuid type
+{ castsource => 'bytea', casttarget => 'uuid', castfunc => 'uuid(bytea)',
+  castcontext => 'e', castmethod => 'f' },
+{ castsource => 'uuid', casttarget => 'bytea', castfunc => 'bytea(uuid)',
+  castcontext => 'e', castmethod => 'f' },
+
 # Allow explicit coercions between int4 and "char"
 { castsource => 'char', casttarget => 'int4', castfunc => 'int4(char)',
   castcontext => 'e', castmethod => 'f' },
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index b51d2b17379..afcc9a3476a 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1199,6 +1199,13 @@
   proname => 'int8', prorettype => 'int8', proargtypes => 'bytea',
   prosrc => 'bytea_int8' },
 
+{ oid => '9880', descr => 'convert uuid to bytea',
+  proname => 'bytea', prorettype => 'bytea', proargtypes => 'uuid',
+  prosrc => 'uuid_bytea' },
+{ oid => '9881', descr => 'convert bytea to uuid',
+  proname => 'uuid', prorettype => 'uuid', proargtypes => 'bytea',
+  prosrc => 'bytea_uuid' },
+
 { oid => '449', descr => 'hash',
   proname => 'hashint2', prorettype => 'int4', proargtypes => 'int2',
   prosrc => 'hashint2' },
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index 95392003b86..4b635336606 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -305,5 +305,20 @@ SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
  
 (1 row)
 
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+               bytea                
+------------------------------------
+ \x5b35380a714349129b55f322699c6770
+(1 row)
+
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+                 uuid                 
+--------------------------------------
+ 019a2f85-9ced-7225-b99d-9c55044a2563
+(1 row)
+
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error
+ERROR:  invalid uuid length
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
diff --git a/src/test/regress/sql/uuid.sql b/src/test/regress/sql/uuid.sql
index 465153a0341..63520d0b640 100644
--- a/src/test/regress/sql/uuid.sql
+++ b/src/test/regress/sql/uuid.sql
@@ -146,6 +146,10 @@ SELECT uuid_extract_timestamp('017F22E2-79B0-7CC3-98C4-DC0C0C07398F') = 'Tuesday
 SELECT uuid_extract_timestamp(gen_random_uuid());  -- null
 SELECT uuid_extract_timestamp('11111111-1111-1111-1111-111111111111');  -- null
 
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error
 
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] v3-0003-Add-base32hex-encoding-support-to-encode-and-deco.patch (12.0K, 3-v3-0003-Add-base32hex-encoding-support-to-encode-and-deco.patch)
  download | inline diff:
From 76a13f46fb07903d67f980b552346ebe6b3a8fe2 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Wed, 29 Oct 2025 15:53:12 +0400
Subject: [PATCH v3 3/3] Add base32hex encoding support to encode() and
 decode()

Implement base32hex encoding/decoding per RFC 4648 Section 7 for
encode() and decode() functions. This encoding uses the extended hex
alphabet (0-9, A-V) which preserves sort order.

The encode() function produces unpadded output, while decode() accepts
both padded and unpadded input. Decoding is case-insensitive.

This is particularly useful for encoding UUIDs compactly:
    SELECT encode(uuid_value::bytea, 'base32hex');
produces a 26-character string compared to the standard 36-character
UUID representation.

Suggested-by: Sergey Prokhorenko<[email protected]>
---
 doc/src/sgml/func/func-binarystring.sgml |  25 +++++
 src/backend/utils/adt/encode.c           | 124 +++++++++++++++++++++++
 src/test/regress/expected/uuid.out       |  88 ++++++++++++++++
 src/test/regress/sql/uuid.sql            |  27 +++++
 4 files changed, 264 insertions(+)

diff --git a/doc/src/sgml/func/func-binarystring.sgml b/doc/src/sgml/func/func-binarystring.sgml
index dd7037811af..4659ad49787 100644
--- a/doc/src/sgml/func/func-binarystring.sgml
+++ b/doc/src/sgml/func/func-binarystring.sgml
@@ -729,6 +729,7 @@
        <parameter>format</parameter> values are:
        <link linkend="encode-format-base64"><literal>base64</literal></link>,
        <link linkend="encode-format-base64url"><literal>base64url</literal></link>,
+       <link linkend="encode-format-base32hex"><literal>base32hex</literal></link>,
        <link linkend="encode-format-escape"><literal>escape</literal></link>,
        <link linkend="encode-format-hex"><literal>hex</literal></link>.
       </para>
@@ -804,6 +805,30 @@
      </listitem>
     </varlistentry>
 
+    <varlistentry id="encode-format-base32hex">
+     <term>base32hex
+      <indexterm>
+       <primary>base32hex format</primary>
+      </indexterm></term>
+     <listitem>
+      <para>
+       The <literal>base32hex</literal> format is that of
+       <ulink url="https://datatracker.ietf.org/doc/html/rfc4648#section-7">
+       RFC 4648 Section 7</ulink>.  It uses the extended hex alphabet
+       (0-9, A-V) which preserves sort order when encoding binary data.
+       The <function>encode</function> function produces unpadded output,
+       while <function>decode</function> accepts both padded and unpadded
+       input. Decoding is case-insensitive and ignores whitespace characters.
+      </para>
+      <para>
+       This format is particularly useful for encoding UUIDs in a compact,
+       sortable format: <literal>encode(uuid_value::bytea, 'base32hex')</literal>
+       produces a 26-character string compared to the standard 36-character
+       UUID representation.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="encode-format-escape">
      <term>escape
      <indexterm>
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index aabe9913eee..c31ab60d4b7 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -821,6 +821,124 @@ esc_dec_len(const char *src, size_t srclen)
 	return len;
 }
 
+/*
+ * BASE32HEX
+ */
+
+static const char base32hex_table[] = "0123456789ABCDEFGHIJKLMNOPQRSTUV";
+
+static uint64
+base32hex_enc_len(const char *src, size_t srclen)
+{
+	/* 5 bits per base32hex character, so round up (srclen * 8 + 4) / 5 */
+	return ((uint64) srclen * 8 + 4) / 5;
+}
+
+static uint64
+base32hex_dec_len(const char *src, size_t srclen)
+{
+	/* Decode length is (srclen * 5) / 8, but we may have padding */
+	return ((uint64) srclen * 5) / 8;
+}
+
+static uint64
+base32hex_encode(const char *src, size_t srclen, char *dst)
+{
+	const unsigned char *data = (const unsigned char *) src;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	uint64		output_pos = 0;
+	size_t		i;
+
+	for (i = 0; i < srclen; i++)
+	{
+		/* Add 8 bits to the buffer */
+		bits_buffer = (bits_buffer << 8) | data[i];
+		bits_in_buffer += 8;
+
+		/* Extract 5-bit chunks while we have enough bits */
+		while (bits_in_buffer >= 5)
+		{
+			bits_in_buffer -= 5;
+			/* Extract top 5 bits */
+			dst[output_pos++] = base32hex_table[(bits_buffer >> bits_in_buffer) & 0x1F];
+			/* Clear the extracted bits by masking */
+			bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+		}
+	}
+
+	/* Handle remaining bits (if any) */
+	if (bits_in_buffer > 0)
+	{
+		dst[output_pos++] = base32hex_table[(bits_buffer << (5 - bits_in_buffer)) & 0x1F];
+	}
+
+	return output_pos;
+}
+
+static uint64
+base32hex_decode(const char *src, size_t srclen, char *dst)
+{
+	const unsigned char *data = (const unsigned char *) src;
+	uint64		bits_buffer = 0;
+	int			bits_in_buffer = 0;
+	uint64		output_pos = 0;
+	size_t		i;
+	size_t		decode_len = srclen;
+
+	/*
+	 * RFC 4648 allows padding with '=' to make the length a multiple of 8.
+	 * Count and skip trailing padding characters.
+	 */
+	while (decode_len > 0 && data[decode_len - 1] == '=')
+		decode_len--;
+
+	for (i = 0; i < decode_len; i++)
+	{
+		unsigned char c = data[i];
+		int			val;
+
+		/* Skip whitespace */
+		if (c == ' ' || c == '\t' || c == '\n' || c == '\r')
+			continue;
+
+		/* Decode base32hex character (0-9, A-V, case-insensitive) */
+		if (c >= '0' && c <= '9')
+			val = c - '0';
+		else if (c >= 'A' && c <= 'V')
+			val = c - 'A' + 10;
+		else if (c >= 'a' && c <= 'v')
+			val = c - 'a' + 10;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("invalid symbol \"%.*s\" found while decoding base32hex sequence",
+							pg_mblen((const char *) &c), (const char *) &c)));
+
+		/* Add 5 bits to buffer */
+		bits_buffer = (bits_buffer << 5) | val;
+		bits_in_buffer += 5;
+
+		/* Extract 8-bit bytes when we have enough bits */
+		while (bits_in_buffer >= 8)
+		{
+			bits_in_buffer -= 8;
+			dst[output_pos++] = (unsigned char) (bits_buffer >> bits_in_buffer);
+			/* Clear the extracted bits */
+			bits_buffer &= ((1ULL << bits_in_buffer) - 1);
+		}
+	}
+
+	/* Verify no extra bits remain (padding bits should be zero) */
+	if (bits_in_buffer > 0 && (bits_buffer & ((1ULL << bits_in_buffer) - 1)) != 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("invalid base32hex end sequence"),
+				 errhint("Input data has non-zero padding bits.")));
+
+	return output_pos;
+}
+
 /*
  * Common
  */
@@ -850,6 +968,12 @@ static const struct
 			pg_base64url_enc_len, pg_base64url_dec_len, pg_base64url_encode, pg_base64url_decode
 		}
 	},
+	{
+		"base32hex",
+		{
+			base32hex_enc_len, base32hex_dec_len, base32hex_encode, base32hex_decode
+		}
+	},
 	{
 		"escape",
 		{
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index 24486084aaf..86d21a29093 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -321,5 +321,93 @@ SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
 SELECT '\x1234567890abcdef'::bytea::uuid; -- error
 ERROR:  invalid length for UUID
 DETAIL:  Expected 16 bytes, got 8.
+-- base32hex encoding via encode/decode
+SELECT encode('00000000-0000-0000-0000-000000000000'::uuid::bytea, 'base32hex');
+           encode           
+----------------------------
+ 00000000000000000000000000
+(1 row)
+
+SELECT encode('11111111-1111-1111-1111-111111111111'::uuid::bytea, 'base32hex');
+           encode           
+----------------------------
+ 248H248H248H248H248H248H24
+(1 row)
+
+SELECT encode('ffffffff-ffff-ffff-ffff-ffffffffffff'::uuid::bytea, 'base32hex');
+           encode           
+----------------------------
+ VVVVVVVVVVVVVVVVVVVVVVVVVS
+(1 row)
+
+SELECT encode('123e4567-e89b-12d3-a456-426614174000'::uuid::bytea, 'base32hex');
+           encode           
+----------------------------
+ 28V4APV8JC9D792M89J185Q000
+(1 row)
+
+-- test decode with base32hex
+SELECT decode('00000000000000000000000000', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+SELECT decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test round-trip conversions
+SELECT decode(encode('00000000-0000-0000-0000-000000000000'::uuid::bytea, 'base32hex'), 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+SELECT encode(decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid::bytea, 'base32hex');
+           encode           
+----------------------------
+ 28V4APV8JC9D792M89J185Q000
+(1 row)
+
+SELECT decode(encode('123e4567-e89b-12d3-a456-426614174000'::uuid::bytea, 'base32hex'), 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test case insensitivity
+SELECT decode('28v4apv8jc9d792m89j185q000', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+SELECT decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+-- test RFC 4648 padding (32 chars with 6 '=' signs)
+SELECT decode('28V4APV8JC9D792M89J185Q000======', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 123e4567-e89b-12d3-a456-426614174000
+(1 row)
+
+SELECT decode('00000000000000000000000000======', 'base32hex')::uuid;
+                decode                
+--------------------------------------
+ 00000000-0000-0000-0000-000000000000
+(1 row)
+
+-- test error cases for base32hex
+SELECT decode('28V4APV8JC9D792M89J185Q00W', 'base32hex')::uuid;  -- invalid character W
+ERROR:  invalid symbol "W" found while decoding base32hex sequence
+SELECT decode('28V4APV8JC9D792M89J185Q00!', 'base32hex')::uuid;  -- invalid character !
+ERROR:  invalid symbol "!" found while decoding base32hex sequence
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
diff --git a/src/test/regress/sql/uuid.sql b/src/test/regress/sql/uuid.sql
index 63520d0b640..44e8fa8b243 100644
--- a/src/test/regress/sql/uuid.sql
+++ b/src/test/regress/sql/uuid.sql
@@ -151,5 +151,32 @@ SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
 SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
 SELECT '\x1234567890abcdef'::bytea::uuid; -- error
 
+-- base32hex encoding via encode/decode
+SELECT encode('00000000-0000-0000-0000-000000000000'::uuid::bytea, 'base32hex');
+SELECT encode('11111111-1111-1111-1111-111111111111'::uuid::bytea, 'base32hex');
+SELECT encode('ffffffff-ffff-ffff-ffff-ffffffffffff'::uuid::bytea, 'base32hex');
+SELECT encode('123e4567-e89b-12d3-a456-426614174000'::uuid::bytea, 'base32hex');
+
+-- test decode with base32hex
+SELECT decode('00000000000000000000000000', 'base32hex')::uuid;
+SELECT decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid;
+
+-- test round-trip conversions
+SELECT decode(encode('00000000-0000-0000-0000-000000000000'::uuid::bytea, 'base32hex'), 'base32hex')::uuid;
+SELECT encode(decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid::bytea, 'base32hex');
+SELECT decode(encode('123e4567-e89b-12d3-a456-426614174000'::uuid::bytea, 'base32hex'), 'base32hex')::uuid;
+
+-- test case insensitivity
+SELECT decode('28v4apv8jc9d792m89j185q000', 'base32hex')::uuid;
+SELECT decode('28V4APV8JC9D792M89J185Q000', 'base32hex')::uuid;
+
+-- test RFC 4648 padding (32 chars with 6 '=' signs)
+SELECT decode('28V4APV8JC9D792M89J185Q000======', 'base32hex')::uuid;
+SELECT decode('00000000000000000000000000======', 'base32hex')::uuid;
+
+-- test error cases for base32hex
+SELECT decode('28V4APV8JC9D792M89J185Q00W', 'base32hex')::uuid;  -- invalid character W
+SELECT decode('28V4APV8JC9D792M89J185Q00!', 'base32hex')::uuid;  -- invalid character !
+
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] v3-0002-Minor-suggestions-to-Ilmari-s-patch.patch (1.6K, 4-v3-0002-Minor-suggestions-to-Ilmari-s-patch.patch)
  download | inline diff:
From d73e7c1cd31f5036a847b8a6bc3823a01cac5a77 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <[email protected]>
Date: Wed, 29 Oct 2025 15:59:46 +0400
Subject: [PATCH v3 2/3] Minor suggestions to Ilmari's patch

---
 src/backend/utils/adt/bytea.c      | 9 +++++----
 src/test/regress/expected/uuid.out | 3 ++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/bytea.c b/src/backend/utils/adt/bytea.c
index ff9e46f3015..14e95031be1 100644
--- a/src/backend/utils/adt/bytea.c
+++ b/src/backend/utils/adt/bytea.c
@@ -1124,12 +1124,13 @@ bytea_uuid(PG_FUNCTION_ARGS)
 
 	if (len != UUID_LEN)
 		ereport(ERROR,
-				errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
-				errmsg("invalid uuid length"));
+				(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+				 errmsg("invalid length for UUID"),
+				 errdetail("Expected %d bytes, got %d.", UUID_LEN, len)));
 
-	uuid = (pg_uuid_t *) palloc(UUID_LEN);
+	uuid = (pg_uuid_t *) palloc(sizeof(pg_uuid_t));
 	memcpy(uuid->data, VARDATA_ANY(v), UUID_LEN);
-	PG_RETURN_POINTER(uuid);
+	PG_RETURN_UUID_P(uuid);
 }
 
 /* Cast uuid -> bytea; can just use uuid_send() */
diff --git a/src/test/regress/expected/uuid.out b/src/test/regress/expected/uuid.out
index 4b635336606..24486084aaf 100644
--- a/src/test/regress/expected/uuid.out
+++ b/src/test/regress/expected/uuid.out
@@ -319,6 +319,7 @@ SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
 (1 row)
 
 SELECT '\x1234567890abcdef'::bytea::uuid; -- error
-ERROR:  invalid uuid length
+ERROR:  invalid length for UUID
+DETAIL:  Expected 16 bytes, got 8.
 -- clean up
 DROP TABLE guid1, guid2, guid3 CASCADE;
-- 
2.39.5 (Apple Git-154)



^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 18:44                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-29 12:19                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-10-30 19:10                               ` Masahiko Sawada <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Masahiko Sawada @ 2025-10-30 19:10 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; pgsql-hackers

On Wed, Oct 29, 2025 at 5:19 AM Andrey Borodin <[email protected]> wrote:
>
>
>
> > On 28 Oct 2025, at 22:44, Masahiko Sawada <[email protected]> wrote:
> >
> > Andrey has shared his patch for base32hex support before[1]. While it
> > needs to be updated, it seems to implement sufficient function.
>
> I'd propose something like attached patch. It's on top of Ilmari's v2 patch with small suggestions as a step 2.
>

I've reviewed the v3 patches, and here are some review comments.

v3-0001 and v3-0002:

-                               errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
-                               errmsg("invalid uuid length"));
+                               (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
+                                errmsg("invalid length for UUID"),
+                                errdetail("Expected %d bytes, got
%d.", UUID_LEN, len)));

How about the error message like "invalid input length for type uuid"?
I think "uuid" should be lower case as it indicates PostgreSQL uuid
data type, and it's better to use %s format instead of directly
writing "uuid" (see string_to_uuid() for example).

As for the errdetail message, should we add "bytea" also after "got %d"?

---
+-- casts
+SELECT '5b35380a-7143-4912-9b55-f322699c6770'::uuid::bytea;
+SELECT '\x019a2f859ced7225b99d9c55044a2563'::bytea::uuid;
+SELECT '\x1234567890abcdef'::bytea::uuid; -- error

We already have tests for casting bytes to integer data types in
strings.sql. I suggest moving the casting tests from bytea to uuid
into therel. For the uuid.sql file, we could add a test to verify that
a UUID value remains unchanged when it's cast to bytea and back to
UUID. For example,

SELECT v = v::bytea::uuid as matched FROM gen_random_uuid() v;

---
I think we should update the documentation in the uuid section about
casting data between bytea and uuid. For references, we have a similar
description for bytea and integer[1].

v3-0003:

base32hex_encode() doesn't seem to add '=' paddings, but is it
intentional? I don't see any description in RFC 4648 that we can omit
'=' paddings.

---
I think the patch should add tests not only for uuid data type but
also for general cases like other encodings.

---
In uuid.sql tests, how about adding some tests to check if base32hex
maintains the sortability of UUIDv7 data?

---
I would suggest registering the patches to the next commit fest if not yet.

Regards,

[1] https://www.postgresql.org/docs/devel/functions-binarystring.html

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 18:44                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-29 12:19                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
@ 2025-11-20 05:46                               ` Илья Чердаков <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Илья Чердаков @ 2025-11-20 05:46 UTC (permalink / raw)
  To: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; pgsql-hackers

31.10.2025 2:10, Masahiko Sawada wrote:
> ---
> I think we should update the documentation in the uuid section about
> casting data between bytea and uuid. For references, we have a similar
> description for bytea and integer[1].
Greetings!
I briefly tested the patched version of v3. The implemented
functionality works correctly.

---
You can also add a case with the error from v3-0002
"invalid base32hex end sequence" to the tests :

+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("invalid base32hex end sequence"),
+                 errhint("Input data has non-zero padding bits.")));

---
I agree with Masahiko Sawada; information about conversions
should be added to the documentation.

-- 
Best regards,
Ilya Cherdakov, PostgresPro

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
@ 2025-10-28 23:35                           ` Jelte Fennema-Nio <[email protected]>
  2025-10-29 10:51                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  1 sibling, 1 reply; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-28 23:35 UTC (permalink / raw)
  To: Dagfinn Ilmari Mannsåker <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

On Tue, 28 Oct 2025 at 17:41, Dagfinn Ilmari Mannsåker
<[email protected]> wrote:
> Here's a patch for that.

Looks good to me. Maybe add a test where not every byte is the same though.

> I'm not 100% confident about the error code
> for invalid length, but that was the closest one I could find in
> errcodes.txt.

The errorcode you chose seems acceptable to me, but I think a slightly
more fitting option would be ERRCODE_INVALID_BINARY_REPRESENTATION.
Error codes in postgres are pretty arbitrary though, so either seems
fine to me.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 23:35                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
@ 2025-10-29 10:51                             ` Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-29 11:31                               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  0 siblings, 1 reply; 37+ messages in thread

From: Dagfinn Ilmari Mannsåker @ 2025-10-29 10:51 UTC (permalink / raw)
  To: Jelte Fennema-Nio <[email protected]>; +Cc: Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

Jelte Fennema-Nio <[email protected]> writes:

> On Tue, 28 Oct 2025 at 17:41, Dagfinn Ilmari Mannsåker
> <[email protected]> wrote:
>> Here's a patch for that.
>
> Looks good to me. Maybe add a test where not every byte is the same though.

Good point. I've replaced them with two randomly generated ones.

>> I'm not 100% confident about the error code
>> for invalid length, but that was the closest one I could find in
>> errcodes.txt.
>
> The errorcode you chose seems acceptable to me, but I think a slightly
> more fitting option would be ERRCODE_INVALID_BINARY_REPRESENTATION.
> Error codes in postgres are pretty arbitrary though, so either seems
> fine to me.

That does seem like a better fit. It's used mainly in recv functions,
which this basically is (but user-callable).

Updated patch attaced.

- ilmari



^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 23:35                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-29 10:51                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
@ 2025-10-29 11:31                               ` Aleksander Alekseev <[email protected]>
  2025-10-29 11:57                                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-29 12:04                                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  0 siblings, 2 replies; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-29 11:31 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>

Hi,

Thanks for the patch.

> That does seem like a better fit. It's used mainly in recv functions,
> which this basically is (but user-callable).
>
> Updated patch attaced.

Perhaps bytea_uuid() should check the UUID format. Consider the
following example:

SELECT uuid_extract_version('\x019a2f859cedffffb99d9c55044a2563'::bytea::uuid);
 uuid_extract_version
----------------------
                   15

There is no UUID version 15 according to RFC 9562, and the
documentation for uuid_extract_version() says:

"""
Extracts the version from a UUID of the variant described by RFC 9562.
For other variants, this function returns null. For example, for a
UUID generated by gen_random_uuid, this function will return 4.
"""

If I read this correctly, either bytea_uuid() should reject this, or
uuid_extract_version() should be modified to return NULL, or the
documentation for uuid_extract_version() should be altered.

-- 
Best regards,
Aleksander Alekseev





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 23:35                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-29 10:51                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-29 11:31                               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-29 11:57                                 ` Aleksander Alekseev <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-29 11:57 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Dagfinn Ilmari Mannsåker <[email protected]>; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>

Hi,

> > Updated patch attaced.
>
> Perhaps bytea_uuid() should check the UUID format. Consider the
> following example:
>
> SELECT uuid_extract_version('\x019a2f859cedffffb99d9c55044a2563'::bytea::uuid);

On the flip side we allow this:

=# select '019a2f85-9ced-ffff-b99d-9c55044a2563' :: uuid;
                 uuid
--------------------------------------
 019a2f85-9ced-ffff-b99d-9c55044a2563

So I guess the patch is fine.

-- 
Best regards,
Aleksander Alekseev





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 11:53                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 16:41                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-28 23:35                           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-29 10:51                             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Dagfinn Ilmari Mannsåker <[email protected]>
  2025-10-29 11:31                               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-29 12:04                                 ` Dagfinn Ilmari Mannsåker <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Dagfinn Ilmari Mannsåker @ 2025-10-29 12:04 UTC (permalink / raw)
  To: Aleksander Alekseev <[email protected]>; +Cc: pgsql-hackers; Jelte Fennema-Nio <[email protected]>; Sergey Prokhorenko <[email protected]>; Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>

Aleksander Alekseev <[email protected]> writes:

> Hi,
>
> Thanks for the patch.
>
>> That does seem like a better fit. It's used mainly in recv functions,
>> which this basically is (but user-callable).
>>
>> Updated patch attaced.
>
> Perhaps bytea_uuid() should check the UUID format. Consider the
> following example:
>
> SELECT uuid_extract_version('\x019a2f859cedffffb99d9c55044a2563'::bytea::uuid);
>  uuid_extract_version
> ----------------------
>                    15

The UUID input function doesn't do any such validation, so I don't see
why the cast should behave any differently:

# select '019a2f859cedffffb99d9c55044a2563'::uuid;
┌──────────────────────────────────────┐
│                 uuid                 │
├──────────────────────────────────────┤
│ 019a2f85-9ced-ffff-b99d-9c55044a2563 │
└──────────────────────────────────────┘

# select uuid_extract_version('019a2f859cedffffb99d9c55044a2563'::uuid);
┌──────────────────────┐
│ uuid_extract_version │
├──────────────────────┤
│                   15 │
└──────────────────────┘
(1 row)

> There is no UUID version 15 according to RFC 9562, and the
> documentation for uuid_extract_version() says:

There's no version 15 specified _yet_.

> """
> Extracts the version from a UUID of the variant described by RFC 9562.
> For other variants, this function returns null. For example, for a
> UUID generated by gen_random_uuid, this function will return 4.
> """
>
> If I read this correctly, either bytea_uuid() should reject this, or
> uuid_extract_version() should be modified to return NULL, or the
> documentation for uuid_extract_version() should be altered.

Future RFCs could define new versions of this variant, which we should
not reject or pretend don't have a version , just because we haven't
heard of them yet.  In fact, RFC 9562 defines two new versions, v7 and
v8, which by this argument PostgreSQL versions before 18 should reject.

- ilmari





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
@ 2025-10-28 14:28                       ` Sergey Prokhorenko <[email protected]>
  2025-10-28 14:39                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2 siblings, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-28 14:28 UTC (permalink / raw)
  To: Jelte Fennema-Nio <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

First of all, I'm definitely a proponent of being able to encode UUIDsusing base32hex in Postgres.

On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko
<[email protected]> wrote:
> I wanted to highlight an important discussion among the authors and contributors of RFC 9562 regarding UUID text encoding:
>
> https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...

I think a very important thing to note here is that this is a github
discussion, not an officially accepted RFC. I think if it was an
officially accepted RFC on how to encode UUIDs then you would have a
lot less pushback here. Right now your emails mostly read like you
want to push your preferential format, while essentially disallowing
other encodings. While base32hex seems like a good choice for UUIDv7 I
see no reason to give it preferential treatment at this point in time.
crockford base32 seems just as valid. And e.g. base64url[1] seems
totally fine for UUID versions that have no inherent ordering like
UUIDv4. And if someone comes up with a base64urlhex format you could
have even shorter bit still sortable UUIDs at the expense of
legibility.

The main reason why a specific encoding should receive preferential
treatment in Postgres, would be if it was standardized, as that would
help with interoperability. At this point in time there's no such
standard (not even a draft), so forcing an explicit encoding will
actually reduce interoperability, because people already encode their
UUIDs in various different forms.

> but the discussion established that base32hex is the existing standard format already defined in RFC 4648, Section 7, specifically designed for sort-preserving encoding.

You even reach a similar conclusion here: not choosing crockford
base32, purely because it does not have an official RFC.

> This context is crucial because it underscores that the uuid type, as a first-class concept, deserves its own standardized text encoding.

It already has! The standard text encoding is defined in RFC 4122.
That's why postgres displays it as such when encoding to text.

> Regarding the proposal to couple UUID encoding with the bytea type through encode()/decode() functions: I understand the appeal of reusing existing infrastructure, but this creates a conceptual mismatch. UUID is a distinct semantic type in PostgreSQL, not merely binary data. The bytea type has existed for decades without base32hex encoding, and that's worked fine, because bytea represents arbitrary binary data, not universally unique identifiers with specific structural properties and needs.

I think by far the first step is to make the encoding of UUIDs in
different formats possible in Postgres. The way to do so with the
least API impact (and thus as you noticed, least pushback), would be
to add base32hex to the list of encoding formats in the encode/decode
functions. Then combining that with UUID <-> bytea casting (which also
seems totally reasonable functionality to me), would give you the
functionality (but not the defaults you want).

In a follow up patch I would personally be fine making the API to
encode UUIDs a bit more friendly. In particular, adding an overload to
the encode function that takes a UUID instead of a bytea seems
reasonable to me, i.e. encode(id uuid, format text) -> text

I'm currently less convinced about a decode_uuid function though. I
think some perf argument (including some benchmarks) would need to be
made to convince me of its usefulness. Because purely from an API
friendliness lens, I feel like decode('...', 'base32hex)::uuid and
decode_uuid('...', 'base32hex') rank basically the same.

Once/if an accepted RFC actually defines a default shorter encoding
for UUIDs we could I would definitely be in favor of adding a
decode_uuid function with the default encoding configured as a default
argument. As well as adding the default argument to the uuid encode
overload function.
______________________________________________________________________________________________


Hi Jelte,
I agree with your points.
I believe we should put the discussion about compact UUID text encoding in PostgreSQL on hold for now. None of the proposed solutions has sufficient unconditional support from the participants. It makes sense to pause this discussion for more in-depth exploration to try and reach a consensus.
Jelte, I particularly liked your idea of a new, dedicated, standardized encoding for UUIDs, base64urlhex, and dedicated encoding/decoding functions for this encoding in PostgreSQL. I will try to develop such an encoding and submit it for discussion. I suggest calling it base64uuid.
My current attempt to establish base32hex as a de facto standard (even prior to an RFC) was unsuccessful. However, I remain convinced, like the authors of RFC 9562, that there should be only one standard compact encoding for UUIDs. Therefore, we must continue efforts to standardize such an encoding.
As for Crockford's Base32, it was rejected because of a lack of support in standard programming language libraries. Otherwise, it's just as good as base32hex.

Best regards,Sergey Prokhorenko






  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 14:28                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-28 14:39                         ` Jelte Fennema-Nio <[email protected]>
  0 siblings, 0 replies; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-28 14:39 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

On Tue, 28 Oct 2025 at 15:29, Sergey Prokhorenko
<[email protected]> wrote:
> I believe we should put the discussion about compact UUID text encoding in PostgreSQL on hold for now.

I agree with respect to the dedicated encode_uuid and decode_uuid functions.

But I think having postgres support base32hex in encode/decode for
bytea seems like a sensible thing in any case. As well as conversion
between UUID and bytea. If someone wants to create patches for that,
it seems like there'd be enough support for those to get them merged.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
@ 2025-10-28 21:56                       ` Sergey Prokhorenko <[email protected]>
  2025-10-28 23:44                         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2 siblings, 1 reply; 37+ messages in thread

From: Sergey Prokhorenko @ 2025-10-28 21:56 UTC (permalink / raw)
  To: Jelte Fennema-Nio <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

First of all, I'm definitely a proponent of being able to encode UUIDsusing base32hex in Postgres.

On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko
<[email protected]> wrote:
> I wanted to highlight an important discussion among the authors and contributors of RFC 9562 regarding UUID text encoding:
>
> https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussions/17#discussioncomment-10...

I think a very important thing to note here is that this is a github
discussion, not an officially accepted RFC. I think if it was an
officially accepted RFC on how to encode UUIDs then you would have a
lot less pushback here. Right now your emails mostly read like you
want to push your preferential format, while essentially disallowing
other encodings. While base32hex seems like a good choice for UUIDv7 I
see no reason to give it preferential treatment at this point in time.
crockford base32 seems just as valid. And e.g. base64url[1] seems
totally fine for UUID versions that have no inherent ordering like
UUIDv4. And if someone comes up with a base64urlhex format you could
have even shorter bit still sortable UUIDs at the expense of
legibility.

The main reason why a specific encoding should receive preferential
treatment in Postgres, would be if it was standardized, as that would
help with interoperability. At this point in time there's no such
standard (not even a draft), so forcing an explicit encoding will
actually reduce interoperability, because people already encode their
UUIDs in various different forms.

> but the discussion established that base32hex is the existing standard format already defined in RFC 4648, Section 7, specifically designed for sort-preserving encoding.

You even reach a similar conclusion here: not choosing crockford
base32, purely because it does not have an official RFC.

> This context is crucial because it underscores that the uuid type, as a first-class concept, deserves its own standardized text encoding.

It already has! The standard text encoding is defined in RFC 4122.
That's why postgres displays it as such when encoding to text.

> Regarding the proposal to couple UUID encoding with the bytea type through encode()/decode() functions: I understand the appeal of reusing existing infrastructure, but this creates a conceptual mismatch. UUID is a distinct semantic type in PostgreSQL, not merely binary data. The bytea type has existed for decades without base32hex encoding, and that's worked fine, because bytea represents arbitrary binary data, not universally unique identifiers with specific structural properties and needs.

I think by far the first step is to make the encoding of UUIDs in
different formats possible in Postgres. The way to do so with the
least API impact (and thus as you noticed, least pushback), would be
to add base32hex to the list of encoding formats in the encode/decode
functions. Then combining that with UUID <-> bytea casting (which also
seems totally reasonable functionality to me), would give you the
functionality (but not the defaults you want).

In a follow up patch I would personally be fine making the API to
encode UUIDs a bit more friendly. In particular, adding an overload to
the encode function that takes a UUID instead of a bytea seems
reasonable to me, i.e. encode(id uuid, format text) -> text

I'm currently less convinced about a decode_uuid function though. I
think some perf argument (including some benchmarks) would need to be
made to convince me of its usefulness. Because purely from an API
friendliness lens, I feel like decode('...', 'base32hex)::uuid and
decode_uuid('...', 'base32hex') rank basically the same.

Once/if an accepted RFC actually defines a default shorter encoding
for UUIDs we could I would definitely be in favor of adding a
decode_uuid function with the default encoding configured as a default
argument. As well as adding the default argument to the uuid encode
overload function.

____________________________________________________


Hi Jelte,

Here's a project for a compact, sortable text encoding Base64UUID specifically for UUIDs, the idea for which you submitted ("base64urlhex"):
https://github.com/sergeyprokhorenko/Base64UUID
Where's the best place to discuss this: in this thread or starting a new one?

Best regards,Sergey Prokhorenko



  

^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
  2025-10-23 13:06 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-23 17:34   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-23 21:00     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 07:16       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 18:23         ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-24 22:39           ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-24 23:31             ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-25 18:07               ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Andrey Borodin <[email protected]>
  2025-10-27 20:09                 ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Masahiko Sawada <[email protected]>
  2025-10-27 22:36                   ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
  2025-10-28 09:30                     ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Jelte Fennema-Nio <[email protected]>
  2025-10-28 21:56                       ` Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
@ 2025-10-28 23:44                         ` Jelte Fennema-Nio <[email protected]>
  0 siblings, 0 replies; 37+ messages in thread

From: Jelte Fennema-Nio @ 2025-10-28 23:44 UTC (permalink / raw)
  To: Sergey Prokhorenko <[email protected]>; +Cc: Andrey Borodin <[email protected]>; Masahiko Sawada <[email protected]>; pgsql-hackers

On Tue, 28 Oct 2025 at 22:58, Sergey Prokhorenko
<[email protected]> wrote:
> Where's the best place to discuss this: in this thread or starting a new one?

I don't think core Postgres is the place where experimental encodings
should be discussed and tried (it would fit great in an extension
though). It sounds more like you should create an RFC or ask for
feedback from some of the other contributors to the UUIDv7 RFC.





^ permalink  raw  reply  [nested|flat] 37+ messages in thread

* Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions
  2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
@ 2025-10-23 15:08 ` Aleksander Alekseev <[email protected]>
  1 sibling, 0 replies; 37+ messages in thread

From: Aleksander Alekseev @ 2025-10-23 15:08 UTC (permalink / raw)
  To: pgsql-hackers; +Cc: Sergey Prokhorenko <[email protected]>

Hi,

> > I'm writing to propose adding two new built-in functions to PostgreSQL that provide compact UUID encoding using the base32hex format.
>
> Firstly, cc:'ing a few dozens of people is not the best way to get
> attention to your patch. Please don't do this.
>
> [...]

I checked pgsql-hackers@ archive [1] and if I understand correctly
Sergey is not on the mailing list. So people that were not cc:'ed
didn't receive his e-mail. I attached the original text for those
interested and also for history.

Sergey, please make sure you are subscribed to the mailing list [2].

[1]: https://www.postgresql.org/message-id/CAJ7c6TOramr1UTLcyB128LWMqita1Y7%3Darq3KHaU%3Dqikf5yKOQ%40mail...
[2]: https://www.postgresql.org/list/

-- 
Best regards,
Aleksander Alekseev

Hi pgsql-hackers,

I'm writing to propose adding two new built-in functions to PostgreSQL that provide compact UUID encoding using the base32hex format.

I'm one of the contributors to RFC 9562 (UUIDs) and to the uuidv7() implementations in PostgreSQL and several libraries. I'm writing to express my strong support for a new patch by Andrey Borodin, the developer of the built-in uuidv7() function for PostgreSQL 18. This patch adds two new functions for UUID compact text representation. These functions would be long-awaited additions to PostgreSQL's UUID functionality.

I would like to request the community to review this patch and to consider it for commit.

The patch is available here: https://github.com/x4m/postgres_g/commit/aa902bbc5dfc47d4b35f05016304a1e671abb505 

_______________________________

uuid_to_base32hex ( uuid ) -> text

Encodes a UUID into a 26-character base32hex string (uppercase, no hyphens, without padding), using the alphabet 0123456789ABCDEFGHIJKLMNOPQRSTUV as specified in RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#page-10).

To accommodate base32hex encoding (5 bits per character), the 128-bit UUID requires 130 bits total (26 characters ? 5 bits). The additional 2 zero bits are appended as padding.

This compact, lexicographically sortable format preserves temporal ordering for UUIDv7, making it ideal for primary keys stored as values in JSON key-value pairs, as well as for URLs, filenames, and other space-constrained contexts.

Example:
uuid_to_base32hex('019535d9-3df7-79fb-b466-fa907fa17f9e'::uuid) > 06AJBM9TUTSVND36VA87V8BVJO
_______________________________

base32hex_to_uuid ( text ) -> uuid

Decodes a base32hex string back into its original UUID. The input is case-insensitive. Invalid inputs return NULL. The decoding is lossless and produces a bitwise-identical UUID.

Example:
base32hex_to_uuid('06AJBM9TUTSVND36VA87V8BVJO') > 019535d9-3df7-79fb-b466-fa907fa17f9e
_______________________________

We considered base36 but rejected it due to poor performance. Crockford's Base32 was also rejected due to its lack of native support in standard libraries, making base32hex the most practical choice.

Converter: https://tomeko.net/online_tools/base32hex.php?lang=en


Best regards,
Sergey Prokhorenko

Attachments:

  [text/plain] hi-hackers.txt (2.2K, 2-hi-hackers.txt)
  download | inline:
Hi pgsql-hackers,

I'm writing to propose adding two new built-in functions to PostgreSQL that provide compact UUID encoding using the base32hex format.

I'm one of the contributors to RFC 9562 (UUIDs) and to the uuidv7() implementations in PostgreSQL and several libraries. I'm writing to express my strong support for a new patch by Andrey Borodin, the developer of the built-in uuidv7() function for PostgreSQL 18. This patch adds two new functions for UUID compact text representation. These functions would be long-awaited additions to PostgreSQL's UUID functionality.

I would like to request the community to review this patch and to consider it for commit.

The patch is available here: https://github.com/x4m/postgres_g/commit/aa902bbc5dfc47d4b35f05016304a1e671abb505 

_______________________________

uuid_to_base32hex ( uuid ) -> text

Encodes a UUID into a 26-character base32hex string (uppercase, no hyphens, without padding), using the alphabet 0123456789ABCDEFGHIJKLMNOPQRSTUV as specified in RFC 4648 (https://datatracker.ietf.org/doc/html/rfc4648#page-10).

To accommodate base32hex encoding (5 bits per character), the 128-bit UUID requires 130 bits total (26 characters ? 5 bits). The additional 2 zero bits are appended as padding.

This compact, lexicographically sortable format preserves temporal ordering for UUIDv7, making it ideal for primary keys stored as values in JSON key-value pairs, as well as for URLs, filenames, and other space-constrained contexts.

Example:
uuid_to_base32hex('019535d9-3df7-79fb-b466-fa907fa17f9e'::uuid) > 06AJBM9TUTSVND36VA87V8BVJO
_______________________________

base32hex_to_uuid ( text ) -> uuid

Decodes a base32hex string back into its original UUID. The input is case-insensitive. Invalid inputs return NULL. The decoding is lossless and produces a bitwise-identical UUID.

Example:
base32hex_to_uuid('06AJBM9TUTSVND36VA87V8BVJO') > 019535d9-3df7-79fb-b466-fa907fa17f9e
_______________________________

We considered base36 but rejected it due to poor performance. Crockford's Base32 was also rejected due to its lack of native support in standard libraries, making base32hex the most practical choice.

Converter: https://tomeko.net/online_tools/base32hex.php?lang=en


Best regards,
Sergey Prokhorenko

^ permalink  raw  reply  [nested|flat] 37+ messages in thread


end of thread, other threads:[~2025-11-20 05:46 UTC | newest]

Thread overview: 37+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-10-23 09:55 Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Aleksander Alekseev <[email protected]>
2025-10-23 13:06 ` Andrey Borodin <[email protected]>
2025-10-23 14:10   ` Jelte Fennema-Nio <[email protected]>
2025-10-23 17:34   ` Sergey Prokhorenko <[email protected]>
2025-10-23 21:00     ` Masahiko Sawada <[email protected]>
2025-10-23 22:45       ` Sergey Prokhorenko <[email protected]>
2025-10-24 04:57         ` Masahiko Sawada <[email protected]>
2025-10-24 07:16       ` Sergey Prokhorenko <[email protected]>
2025-10-24 18:23         ` Masahiko Sawada <[email protected]>
2025-10-24 22:39           ` Sergey Prokhorenko <[email protected]>
2025-10-24 23:31             ` Masahiko Sawada <[email protected]>
2025-10-25 18:07               ` Andrey Borodin <[email protected]>
2025-10-25 19:15                 ` Sergey Prokhorenko <[email protected]>
2025-10-27 10:23                 ` Aleksander Alekseev <[email protected]>
2025-10-27 14:03                   ` Sergey Prokhorenko <[email protected]>
2025-10-27 14:37                     ` Aleksander Alekseev <[email protected]>
2025-10-27 15:33                       ` Отв.: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions Sergey Prokhorenko <[email protected]>
2025-10-27 20:09                 ` Masahiko Sawada <[email protected]>
2025-10-27 22:36                   ` Sergey Prokhorenko <[email protected]>
2025-10-28 09:30                     ` Jelte Fennema-Nio <[email protected]>
2025-10-28 11:53                       ` Dagfinn Ilmari Mannsåker <[email protected]>
2025-10-28 12:24                         ` Jelte Fennema-Nio <[email protected]>
2025-10-28 16:41                         ` Dagfinn Ilmari Mannsåker <[email protected]>
2025-10-28 18:44                           ` Masahiko Sawada <[email protected]>
2025-10-29 12:19                             ` Andrey Borodin <[email protected]>
2025-10-30 19:10                               ` Masahiko Sawada <[email protected]>
2025-11-20 05:46                               ` Илья Чердаков <[email protected]>
2025-10-28 23:35                           ` Jelte Fennema-Nio <[email protected]>
2025-10-29 10:51                             ` Dagfinn Ilmari Mannsåker <[email protected]>
2025-10-29 11:31                               ` Aleksander Alekseev <[email protected]>
2025-10-29 11:57                                 ` Aleksander Alekseev <[email protected]>
2025-10-29 12:04                                 ` Dagfinn Ilmari Mannsåker <[email protected]>
2025-10-28 14:28                       ` Sergey Prokhorenko <[email protected]>
2025-10-28 14:39                         ` Jelte Fennema-Nio <[email protected]>
2025-10-28 21:56                       ` Sergey Prokhorenko <[email protected]>
2025-10-28 23:44                         ` Jelte Fennema-Nio <[email protected]>
2025-10-23 15:08 ` Aleksander Alekseev <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox