Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCRGV-00E5FE-K0 for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 23:32:11 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vCRGT-001Ro1-Bi for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 23:32:08 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vCRGS-001Rnt-V3 for pgsql-hackers@lists.postgresql.org; Fri, 24 Oct 2025 23:32:08 +0000 Received: from mail-lj1-x22f.google.com ([2a00:1450:4864:20::22f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vCRGP-0044KV-1A for pgsql-hackers@postgresql.org; Fri, 24 Oct 2025 23:32:07 +0000 Received: by mail-lj1-x22f.google.com with SMTP id 38308e7fff4ca-378e0f355b9so14387561fa.0 for ; Fri, 24 Oct 2025 16:32:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761348722; x=1761953522; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=mzv3ksrOT3KKvryZJ2mn4f9kHq/GoNR6GXNonJMUhA0=; b=KE0O8YYw4PGoNXq2V+8DrOtsNY5K4rkcikWfpsbPMB1hThLrHE0lGlRyj5YLFPZjyW OiE2iilHHeg5kw/cjeZSLVYWwA22Cu8PmxXPaPeJaScNbO8POAPvXF69OCPd0eW5ka6v rzZ43zi0rG4/cmQ5wu5erCoY/vHFqX24wogiM3EedhRojKoc9zKLvjpijiltYCZskoFv SZXgbV4WG9xWhHoQHfGbO7Qf1TH4J4QCd31RbFJntJoIcBfXYNFBRr4LWV5AG+bIxT80 apGvtUaA71Ffyapit4ItAcJMjxbMVad8NRbdqJ7Fhk8KLxQUT4+PK0pAxDIYfeDnOk6J 8kyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761348722; x=1761953522; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mzv3ksrOT3KKvryZJ2mn4f9kHq/GoNR6GXNonJMUhA0=; b=Mm3Nye9O8Rt1qckOEr928tQFCx+pBQTpamm6bYQep0efh/q8iJuDt2AjWTIZptJ8zr aS0MzbohCEA3D/FJnVZi8Vjw9P8+T/8BpoYvqB2075Mr0SfXK6qmkxJS6z+2NYXlY4I+ yxrzdD2j6xQsHV57jYMIpbTIfDZ/0jmRcITrpDNR4l+Yyit+OSHQf4+nnS/Is6JTmQ2o 3P2evJ2M5Pkj7wBi4SeohxlLOenp0mjbJFbQr2LDeWiSpOmyPVOHBg7WerW9Gp+ydiMP Li6cHrpSnmx88YZAigwZz5p3rdlSBNz3fjTXeW8bgL/L98UoW6NiYbSPi/8SMCm+t1NH PYTA== X-Forwarded-Encrypted: i=1; AJvYcCXyFMYhl0iftwxVGInPrXxnyspGuHLpxyvZrqQeItTuiP0RY7EBsW+1c2XRqZACrP2WQOfLlKOpFmSJo75R@postgresql.org X-Gm-Message-State: AOJu0YzD7GPrp39wd4PJQMImlmtQpqoAHktNMdwbxNHvWLOA53xPsJ5N EhJaz1RIjVMQeB7vpgd1182zZvkE/ChNslTigWZcIqjzQgkJpo45IWnRGBaewSbN7zjtQ+DbDY8 a+XPphQZoocnJGqa9iOzqTkHBKPMJ0Jw= X-Gm-Gg: ASbGncsrg4bsWB+7SqEq3cljP7hljgD/hxapoO5F4gXqv6ZaVbPeVVy79/jZjHIYd4v 0DZA6ovDSuZjLkqaOOqDqb4UzDdnbiA7FjdPCh0SAiSEc3oeA0GvWjoAqub/g2tfBNLuMPrlVur nqqPpBUhSGzSJDFC6D16Ks7RaVLHT8+uQpwtRlN2NXYesxbI/fMy7Ks2GKUznDLDXVB08M6o1H7 n2LIAdw6ROUT1VYrfMwoNqd1vIPTe1LwJuJ0TrhG/UXMmiPiGYKsKNRXjFVn9is1fytctwY X-Google-Smtp-Source: AGHT+IFRhqdaUlW8B8daE5fvUp1oPrbVPEN4etJmXyTUKd+Rnc0YZcEJ9SnmpQMcrrlxb6ik14uFquCFRNSMgj5sEjc= X-Received: by 2002:a05:651c:556:b0:376:5027:7f37 with SMTP id 38308e7fff4ca-378e4518459mr10851731fa.41.1761348722101; Fri, 24 Oct 2025 16:32:02 -0700 (PDT) MIME-Version: 1.0 References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <574624399.175025.1761290201491@mail.yahoo.com> <953203149.383019.1761345585325@mail.yahoo.com> In-Reply-To: <953203149.383019.1761345585325@mail.yahoo.com> From: Masahiko Sawada Date: Fri, 24 Oct 2025 16:31:25 -0700 X-Gm-Features: AWmQ_bk8q8gAJu7_STgrG1W1w7wweCUbu7D_-FQGW_PeeGGZSk6ZvVvUZ1OPIH0 Message-ID: Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions To: Sergey Prokhorenko Cc: Andrey Borodin , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Oct 24, 2025 at 3:42=E2=80=AFPM Sergey Prokhorenko wrote: > > > > On Friday 24 October 2025 at 09:24:15 pm GMT+3, Masahiko Sawada wrote: > > > On Fri, Oct 24, 2025 at 12:17=E2=80=AFAM Sergey Prokhorenko > wrote: > > > > > > Masahiko, > > > > Developers will still be able to use the long canonical 'hex' UUID form= at for compatibility. But the short format is not a developer choice, but a= convention. We mustn't allow a situation where 25% of systems use base32he= x, 25% use Crocksford's Base32, 25% use base36, and 25% even use erroneousl= y sorted base64. That's a very real nightmare. You, too, have every reason = not to want to increase the number of built-in functions in PostgreSQL. > > > > But here is a solution that I hope will satisfy everyone: > > > > encode('019535d9-3df7-79fb-b466-fa907fa17f9e', 'uuid_to_base32hex') -> = 06AJBM9TUTSVND36VA87V8BVJO > > > Does it mean the first argument is uuid type data and when > > 'uuid_to_base32hex' is specified as the format the function requires a > > uuid data at the first argument? > > Yes, that's right. > PostgreSQL will automatically cast the string '019535d9-3df7-79fb-b466-fa= 907fa17f9e' to the uuid type, since the format is correct. > > > > I could not understand the difference > > between specifying 'based32hex' and 'uuid_to_base32hex' when encoding > > UUID data with base32hex encoding. > > > 1. Specifying 'based32hex' in encode function means the first parameter i= s of bytea type as usual. > > > But specifying 'uuid_to_base32hex' means the first parameter is of uuid t= ype. > > > 2. The encode function does not yet support format based32hex. Therefore,= it is not known whether padding =3D=3D=3D=3D=3D should be used. > But padding =3D=3D=3D=3D=3D is not used when specifying 'uuid_to_base32he= x'. > > > > decode('06AJBM9TUTSVND36VA87V8BVJO', 'base32hex_to_uuid') -> 019535d9-3= df7-79fb-b466-fa907fa17f9e > > > Suppose that the decode() takes text data at the first argument and > > returns UUID data, the function signature would be decode(text, text) > > -> uuid. But we cannot create two functions that have the same name > > and the same argument types. > > Yes, you're right. This is a problem that can't be solved without composi= te return values. We clearly took the wrong approach by coupling UUID conve= rsion with encode/decode functions, which only apply to bytea. UUID and byt= ea are fundamentally different data types. Meanwhile, PostgreSQL has over 3= 0 other type conversion functions that deal with other data types. For exam= ple, array_to_string, string_to_array, jsonb_to_record, to_char, to_timesta= mp, and to_hex. In this situation, the best solution would be to revert to = the original uuid_to_base32hex() and base32hex_to_uuid() functions rather t= han deal with type incompatibility issues. > I think that type conversions and data encodings serve different purposes. Type conversions express semantic transformations between data types (e.g., text -> timestamp, jsonb -> record), while encodings are simply representations of binary data as text. For the latter, PostgreSQL already provides a well-defined abstraction through encode()/decode(). Mixing encoding logic with type-specific conversions would blur that boundary. Also, if we start adding dedicated functions for each supported encoding (uuid_to_base32hex, uuid_to_hex etc.), the number of functions could easily multiply. That=E2=80=99s exactly what encode() and decode() were designed to avoid. While I agree that base32hex should be the recommended, I'm really not sure it's a good design that PostgreSQL core should enforce it as the only built-in method. It seems better to me to provide flexible primitives, encode()/decode() plus UUID <-> bytea casts, and document base32hex as the canonical convention (if necessary). Or providing 'uuid_encode(uuid, format text) -> text' and 'uuid_decode(text, format text) -> uuid' might make sense too, but I'm not sure. I'd like to hear opinions from other hackers too. Regards, --=20 Masahiko Sawada Amazon Web Services: https://aws.amazon.com