Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDg1v-00DNOm-9X for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 09:30:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vDg1u-00BuSG-7A for pgsql-hackers@arkaria.postgresql.org; Tue, 28 Oct 2025 09:30:13 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vDg1t-00BuRm-OM for pgsql-hackers@lists.postgresql.org; Tue, 28 Oct 2025 09:30:12 +0000 Received: from mail-lj1-x233.google.com ([2a00:1450:4864:20::233]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vDg1q-004fns-0N for pgsql-hackers@postgresql.org; Tue, 28 Oct 2025 09:30:12 +0000 Received: by mail-lj1-x233.google.com with SMTP id 38308e7fff4ca-36295d53a10so48069011fa.0 for ; Tue, 28 Oct 2025 02:30:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jeltef.nl; s=google; t=1761643809; x=1762248609; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=2XlZusPKy8hscqtTk9IlZUxvE2vif+dfWLMJ+t7lez0=; b=YUGOxkzLiA4fdrjd8J+UeGbNnFCzZ7XEKu8uVZE3xLpGtYarWYqZLdaqmlv2gWZ6ao UIP6UGfEW3NbXm3LrB1G9qCAZWrZZaazzDS5pQ1ZdO3XjAlwYHBgj4p6qkgUQsiCvwR3 uKnyNg9zfSxwlqbu5aIeoVgoj1zsqH8ZYXA04gIG/yTmdc1hrhE37DVCghG/9t9Eq9nj FBrHKx1RsmKShwTSWqt/nKq1LLlV3QFI5pQp1q9lZa9A71LOYYRfb7K/kl3C1mcKMsRP LkV843ICmq1066skGGzHY2tQ+GyY79SllFaZsX4SW6ZnZUi5HmHMW0VBP82J1/FxjUoV dbrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761643809; x=1762248609; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2XlZusPKy8hscqtTk9IlZUxvE2vif+dfWLMJ+t7lez0=; b=VK949aJz5M+Am7aFPVmPee7c2bznileudd6oQ+K1MMj+qbo6NjToJavIjcAp7f3cjY xPc+2SEtYwaDVu2ZSHo7BvkV6ybz0nA7wLi1Gyn19RRdz5LW35bci0UBB4Ax28vum7s4 7b09QY/jiLJUP/Hs585qTrq/cHIPix7YSFtHbRo6JYYde25NoTqUp9nsF/v1xEMvuYKL ECNDk1DLuhLK8dtuoCluSacJBgkhlnM85yF+QMOpBuOz3cFcIuTQy6Y0NYRJ8TQTFWoQ K6xq7XAbRFc6FRT34s0WGMQCpoSMK8rFLe5aJ28TgWp0ipHncHs4mmeT65WmdbEyeojF UifA== X-Forwarded-Encrypted: i=1; AJvYcCUmWL9pAFLwpmekDS+EN0hD0Fy5/vMLNIS+LC9sVqtDqcNnLkUB/Kjzsx54EPjmSgeSEyW7q5MdSRxaN8Xi@postgresql.org X-Gm-Message-State: AOJu0YzO/By1btyuXDfar0EqE+BKEjr/WMc8iJPDCr/8Thdv8zQTxiR+ /tQXAAuhHi0MAn5VTuEA08hUGrt+8XRXQ+ILjoBWP+ii2XSebb/4pk5O07JqRcq/nAOwIYGt1PS o7gOgYahEdB0ChXd1WoyOQuofRg2wiP0BIkUxG0R++Q== X-Gm-Gg: ASbGncsgA9HzyG0QIUOq5snpOtbKZyEIX3BLtHHRB9SMZuqtfNaIxb9d2hXAnSxiT7d /eI61bf1xCi3z5WHxoM4CaB0pmCl+n16TMCxxuTqKBHt3auTh0jI5H+51KJsGppWmev6OprKXTw XtIKJNaTuKlOS8W0MxE5rv2gGuyCLz/GK5BcmTrB/+yDOfu21zxHawuATzbrAj/BVb59SFOSe4z 5Y99k2LmMxzqvnDFW3zPReDW0pFuaufe1KwDWowx+FbHBMgsgxjRID0ZcUnFefeFoXM9+ULAbwh zSNMGhY7ZnCXCi8= X-Google-Smtp-Source: AGHT+IH0xZZ1NIS8bnfTH4GLvU/fI54GepfCc7fQrafXifrMELaiv58RUtxSgjfacbzhlihWdUVttJAmFvGa88LHIZA= X-Received: by 2002:a2e:ac13:0:b0:378:edc5:1319 with SMTP id 38308e7fff4ca-37907777178mr7675071fa.41.1761643808488; Tue, 28 Oct 2025 02:30:08 -0700 (PDT) MIME-Version: 1.0 References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <574624399.175025.1761290201491@mail.yahoo.com> <953203149.383019.1761345585325@mail.yahoo.com> <6F76FA61-E2DC-44EF-9504-889D9BDB4EBD@yandex-team.ru> <1154454839.957923.1761604611424@mail.yahoo.com> In-Reply-To: <1154454839.957923.1761604611424@mail.yahoo.com> From: Jelte Fennema-Nio Date: Tue, 28 Oct 2025 10:30:04 +0100 X-Gm-Features: AWmQ_bmq0sqlHC9uBrEqUaFRhJpQv1h0zgLBZKM--bP0AlMDiAHOLw9S9jwMCU8 Message-ID: Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions To: Sergey Prokhorenko Cc: Andrey Borodin , Masahiko Sawada , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk First of all, I'm definitely a proponent of being able to encode UUIDs using base32hex in Postgres. On Mon, 27 Oct 2025 at 23:37, Sergey Prokhorenko wrote: > I wanted to highlight an important discussion among the authors and contr= ibutors of RFC 9562 regarding UUID text encoding: > > https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussi= ons/17#discussioncomment-10614817 I think a very important thing to note here is that this is a github discussion, not an officially accepted RFC. I think if it was an officially accepted RFC on how to encode UUIDs then you would have a lot less pushback here. Right now your emails mostly read like you want to push your preferential format, while essentially disallowing other encodings. While base32hex seems like a good choice for UUIDv7 I see no reason to give it preferential treatment at this point in time. crockford base32 seems just as valid. And e.g. base64url[1] seems totally fine for UUID versions that have no inherent ordering like UUIDv4. And if someone comes up with a base64urlhex format you could have even shorter bit still sortable UUIDs at the expense of legibility. The main reason why a specific encoding should receive preferential treatment in Postgres, would be if it was standardized, as that would help with interoperability. At this point in time there's no such standard (not even a draft), so forcing an explicit encoding will actually reduce interoperability, because people already encode their UUIDs in various different forms. > but the discussion established that base32hex is the existing standard fo= rmat already defined in RFC 4648, Section 7, specifically designed for sort= -preserving encoding. You even reach a similar conclusion here: not choosing crockford base32, purely because it does not have an official RFC. > This context is crucial because it underscores that the uuid type, as a f= irst-class concept, deserves its own standardized text encoding. It already has! The standard text encoding is defined in RFC 4122. That's why postgres displays it as such when encoding to text. > Regarding the proposal to couple UUID encoding with the bytea type throug= h encode()/decode() functions: I understand the appeal of reusing existing = infrastructure, but this creates a conceptual mismatch. UUID is a distinct = semantic type in PostgreSQL, not merely binary data. The bytea type has exi= sted for decades without base32hex encoding, and that's worked fine, becaus= e bytea represents arbitrary binary data, not universally unique identifier= s with specific structural properties and needs. I think by far the first step is to make the encoding of UUIDs in different formats possible in Postgres. The way to do so with the least API impact (and thus as you noticed, least pushback), would be to add base32hex to the list of encoding formats in the encode/decode functions. Then combining that with UUID <-> bytea casting (which also seems totally reasonable functionality to me), would give you the functionality (but not the defaults you want). In a follow up patch I would personally be fine making the API to encode UUIDs a bit more friendly. In particular, adding an overload to the encode function that takes a UUID instead of a bytea seems reasonable to me, i.e. encode(id uuid, format text) -> text I'm currently less convinced about a decode_uuid function though. I think some perf argument (including some benchmarks) would need to be made to convince me of its usefulness. Because purely from an API friendliness lens, I feel like decode('...', 'base32hex)::uuid and decode_uuid('...', 'base32hex') rank basically the same. Once/if an accepted RFC actually defines a default shorter encoding for UUIDs we could I would definitely be in favor of adding a decode_uuid function with the default encoding configured as a default argument. As well as adding the default argument to the uuid encode overload function.