Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vC9st-009k63-Gr for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 04:58:38 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1vC9ss-00DgJy-5x for pgsql-hackers@arkaria.postgresql.org; Fri, 24 Oct 2025 04:58:37 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1vC9sr-00DgJq-SP for pgsql-hackers@lists.postgresql.org; Fri, 24 Oct 2025 04:58:36 +0000 Received: from mail-lj1-x236.google.com ([2a00:1450:4864:20::236]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1vC9so-003TYu-33 for pgsql-hackers@postgresql.org; Fri, 24 Oct 2025 04:58:35 +0000 Received: by mail-lj1-x236.google.com with SMTP id 38308e7fff4ca-36a448c8aa2so14735981fa.0 for ; Thu, 23 Oct 2025 21:58:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1761281912; x=1761886712; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=rMVPBfSlk3a1Iy6dG1PbkNbl+qzgL9YuTwIX/Clh/Y4=; b=YTasz0GKw7q4uS+l2PCATLGdKoeoVlX284/lWTza+TfFOaog5PN2GQM3i2q/OjxsZL 9f+0jQHzuXzugFDiJTzywh1NmpMWFcohMpkj2cInrF1hcZLcKABQszuC2uAfi96oQAkJ lPyhq10F0n3+LeX8bDrDhYxNYQlDHIDWtKbO70jsZow69BzYhlPTciJ7HKqP67tohkQQ BLrRmQMHvC+fH5o/cjpaPqsItT7/9p854NRFjzyR/jLOuJ4Amcs+l8GQUxolsZesC6D7 u/hffqrATEPosSieblPiruWpaVzvXHoCsHw9oB/zFUyMApk+twzUWUHD/a+p6RHkGrKF nqVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761281912; x=1761886712; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rMVPBfSlk3a1Iy6dG1PbkNbl+qzgL9YuTwIX/Clh/Y4=; b=wNHOgwVBZVNWHasMAuRb0GuJtmEV8O0VsV2RSEMSCWn0RAGxKZOf7Bb0PQnIi3vD+Q oKnONkV1gGnZ10uBMLD3WH2L2oNA6lVuT7gSMVq4oHGLmYibeO9Xw6g8soGItt9VNsZ6 U6lC8VvwwLh/2/oq1YbfMy2Lstn9AgG2OaLAGp9k/KhyciTcNaAK4BO8qYPqpokhpw20 X19ZGAbMdCEXLvfbO5zyB8WzouF8OXhzCPVuvLXVvAvFOY4cfLlNUdIiGPbaz2r+abFY 9BJm+lspwN65jyHwMVlecwt5Zy6+SPeT9AF0Vcqcfh2iELzWkYEBknKFezHNX3gUAq/p VAJQ== X-Forwarded-Encrypted: i=1; AJvYcCXdcjfW11+rQE1DDF/7+M/Bl7qVSrPoYOLmr62PQYXNc/ZLNpYldHpeHp3cn02dP0IRY9t5FUczbq1mfV8I@postgresql.org X-Gm-Message-State: AOJu0YzKO7WwTzr7MDbvLUh0PEOH3lQpMp1vVLoUmVqVmcb2nMwh8l5J yxRj4xk3H3yKbIr6+w8qhHiTHhrLVZaYW4cF59ZXXJEyhSIi4PjU3VlXULbRBz+Qn3Knu3lNw9d LeJ7AYgE7Ervu/8lMU0UzO8zwnd8Tzhw= X-Gm-Gg: ASbGncvemOkXkZspXig2sryzCIV5rNxA2LF3PtWLfuWS8hZYgrG/F9mcPHEGTi9y6zD UIoi+BSsiJlyz7v6BDISpIIRjCQwHvd9koIUjwq/2e6bN5hdIwFZ3EM0rogllFzkV7Y/4/avvd+ fJnjBKRl5Lb3m+56LHZQA84Cnc04bF2pF6Nj9EiqUZKk5T9ICfsmX1MznNIPQIhvxk8cSlbo7jA MXCvofxchh4IXy/j4F8S+9C1+5raBdpva/RpOxc0nZ+aboS5uK0K+6xMdN8l35xo03xKBvqEc+P R9YFbeQ= X-Google-Smtp-Source: AGHT+IHRTZ+CIqrZkqm8XOJywMci2xnU95MCu8lp3IggTo5lHM6b+q6v146NiQcISCKI/xNxNZU9zXranYgOPMyUUBk= X-Received: by 2002:a05:651c:1b12:b0:372:99ba:c28e with SMTP id 38308e7fff4ca-378e43b7b5emr2665411fa.36.1761281911936; Thu, 23 Oct 2025 21:58:31 -0700 (PDT) MIME-Version: 1.0 References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <1791665551.452444.1761209220211@mail.yahoo.com> <18022523-0F8F-4C07-AFF5-57DC9086D78E@yandex-team.ru> <1895971769.8343.1761240853939@mail.yahoo.com> <1846725727.103615.1761259548268@mail.yahoo.com> In-Reply-To: <1846725727.103615.1761259548268@mail.yahoo.com> From: Masahiko Sawada Date: Thu, 23 Oct 2025 21:57:55 -0700 X-Gm-Features: AWmQ_bkSAzqvdecJJ2ZocYa11lQYiU1kD3vxw6mdfi9SczQNct4wDF2pF7G947s Message-ID: Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions To: Sergey Prokhorenko Cc: Andrey Borodin , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Oct 23, 2025 at 3:46=E2=80=AFPM Sergey Prokhorenko wrote: > > > Given that what uuid_to_base32hex() actually does is encoding the > input UUID, I find that it could be confusing if we have a similar > function other than encode() function. Also, we could end up > introducing as many encoding and decoding functions dedicated for UUID > as we want to support encoding methods, bloating the functions. > > > So as the first step, +1 for supporting base32hex for encode() and > decode() functions and supporting the UUID <-> bytea conversion. I > believe it would cover most use cases and the cost of UUID <-> bytea > conversion is negligible. > > > Regards, > > > -- > > Masahiko Sawada > > Amazon Web Services: https://aws.amazon.com > > > Masahiko, > > I see you're in favor of base32hex encoding. That's great! > > Your arguments make sense, and I generally support enhancing the standard= encode() and decode() functions to handle base32hex. It seems like the rig= ht approach from a developer experience standpoint. > > However, I'm unclear about some implementation aspects. Why add conversio= ns between UUID and bytea data types? Wouldn't that require creating dedica= ted UUID <-> bytea conversion functions? Instead, could we implement encode= () as polymorphic to handle UUID type inputs directly? For decode(), we'd n= eed some way (a parameter?) to specify the UUID output type instead of byt= ea. Another option would be automatic type casting when inserting bytea dat= a into UUID columns. Neither an extra parameter nor additional type casting= seems ideal to me, though I don't have better alternatives. While we can implement something like decode(uuid, text), I don't think we can implement decode() in the way you proposed unless I'm missing something. I think the conversion support between UUID and bytea is useful in general, not limited to encode()/decode() support. And users would be able to create wrapper functions if they don't want to add casting for every encode() and decode() calls. For example, create function uuid_to_base32(uuid) returns text language sql immutable st= rict begin atomic select encode($1::bytea, 'base32hex'); end; Since such functions are inlineable, the different between executing encode(uuid_data::bytea, 'base32hex') and encode(uuid_data, 'base32hex') would only be the conversion; one palloc and one memcpy. > But actually, for a short UUID text encoding to succeed, it's more import= ant that it becomes the single, de facto standard. We should avoid supporti= ng multiple encodings, just as the authors and contributors of RFC 9562 did= : https://github.com/uuid6/new-uuid-encoding-techniques-ietf-draft/discussi= ons/17#discussioncomment-10614817 Therefore, whenever possible, encode()= and decode() should support just one UUID text encoding, namely base32hex. I guess it's ultimately the developer's choice, no? For example, if they are using multiple databases (or data processing platforms) in their system and 'hex' is the only encoding that all components can encode and decode, they might choose 'hex' encoding. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com