Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5bBD-003PaA-02 for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 03:14:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w5bBB-000irZ-1C for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Mar 2026 03:14:41 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w5bBB-000irR-0A for pgsql-hackers@lists.postgresql.org; Thu, 26 Mar 2026 03:14:41 +0000 Received: from mail-lf1-x129.google.com ([2a00:1450:4864:20::129]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w5bB3-00000001DdW-2ym6 for pgsql-hackers@postgresql.org; Thu, 26 Mar 2026 03:14:36 +0000 Received: by mail-lf1-x129.google.com with SMTP id 2adb3069b0e04-5a13a06fc85so692820e87.1 for ; Wed, 25 Mar 2026 20:14:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1774494873; cv=none; d=google.com; s=arc-20240605; b=Lkaypb1u/z/GdM7hLi5OTp6wiGp5fPWV25bjag1DQ2uUrMpzuF32GUzjhilsmys9qi tAcp2UH5rty31014QLs3VMGYdJ7NzkF0IgzQJEn+7yM7W5EMHzdabyUZtckAXVAIc+iF i/jqfYh4xsxs3xiKqsG5TVR7kZBL+NnQa2WKHOcNinOGnzJEhFv97Yqixspu5CD63WG+ JoqoC6Fjz9hXlqR0KWVxhO6sbznetrhgqFpbvenzgb1tsrf9vFyxnAAUBwYA5NsvEW4J l2NF6LgHJ9jVYj/gw+d6wMpvNv3DZITVYYfWkOO/w719Es02JCkhUerIqB5yAdQdMTPA Rfwg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=VA9jQ8R0S5oVF9//EOVO8wIsDJlHMNWcEsjw3GyPFvI=; fh=2rFYo91QIy9BHL4zifmbaw/WKD5hLndNpHeFhIjoSBc=; b=cMmuHDHVUpBdDlVQmmLI+2vhMXoAUrIoUzG7Smq986vjRJ1hkJQ0ynSMLGZ+5Fe2a/ XM1FTPbNrsKdaA7k08HFwP7OIEYB2o1ULab0ohr6CicV0cSQJvWPNRUUcqq/rlqefLou fwfjr964aATINSuFiNTM0xEjDuYu2dUipR0dn6m53CloEkXXlg8u3bIxoKS42V+b4drW qCfKVyAkpWegf1XTLdqHmq9kGJkMOaFXsXDn+G9EolgZPXLMWND9Fr/BYun2zDsL3hn+ uo8KsQLKtec4Xc7IL6dabt1IsEKHZOWkztpAszb0Ubrzn6F6tX4EA4LIBdtJ6gX1QyBf hp0Q==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1774494873; x=1775099673; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=VA9jQ8R0S5oVF9//EOVO8wIsDJlHMNWcEsjw3GyPFvI=; b=sEAAdh/0e+/Jb96pAOoA+SlssjboAoiEG3vBL1XxAr6JmRMGHa92PVJ2BRposzUN2F qmJIu1oMF244IIpwNE/q4iGEja3OHIzwm5FP/Hj4Q7ABu1aF07dJ8qHy3/t36rZVvp4p yqLrzpETwGg6WpYC63rx2tTjiI8CTN5GPYdvxYavmBG4sK6I97lYr32+SXgSFIW6GpCz zqgMPiytdGIp/ojrI3OJHkitc9vDlKvS7o6KgyK/We2FBfkv1NqRfO2VNJG7w+Z8j81g JxqR8xbcd/YYoh+jr/JSJxLJAPcrKDdoJS+WHuByxdBJQf2r7yCtZZCYEkbJAnOuGo7w ZJOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774494873; x=1775099673; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=VA9jQ8R0S5oVF9//EOVO8wIsDJlHMNWcEsjw3GyPFvI=; b=VCdFTU+hVVE0HECV2vKAe+VDDOaTkXd9hb5/ykLKyurtUCFxaNWbNbgHZa9f6ZHb7X 2ZimL+H7/Ptie06X3LZglSRInZrHp83nWbGZXiRhMWT1Bs6uf7hCFP0fPsCT198AJXQU 7GQevLJgn2ot8OBbovFVZG+dcMvcBbaFicNqrwR0qUt6hihhriLqmiws/Hru4aU1ifn7 kJm3RM2Ni0bujOmvzXKu0yBf6riQs0E1rw7CQCPVafqQALtqwjxnyPEcsYVhFHyAhqMp /cALskfaObH0VcCwmYYg6tM2RZPtza/IC1OAs7ebDWoQuvqtDj9w1j6Gdmm+nF80QRqB N90g== X-Forwarded-Encrypted: i=1; AJvYcCWTUnxA7VRJC0blvUBkofmZHqIV8voYeGYkywngtlCRFReqbPPhyS4pWmO4b+81KupGsaPZae5Bpz02r4Am@postgresql.org X-Gm-Message-State: AOJu0YyGo/H747Q4pEbiBEVW5tdkCkzFE5b8Zzdw0fG1oikyJ8N4vXN7 ETlB1nDqmH9GrZcxTFl8+fK5/Tz3DnQoqU34ltIrLM18IIgoaIUyrAJZT1TKKa1QrBfZwrkHCEm 6XLA/QEsR7/MhyKyeou7f1LH8FTUAmWg= X-Gm-Gg: ATEYQzxxiEalAERkocREybfV/0d7wbM9rvNMJ4KTirldfMMq+X1PcT3tfYoPYQm78MC bk9AXLSFgGBIXsI2XlPFEBsxTjRyYbolviCW/CmUSkNTCDmCuDpNsWKIooVu4ulfhXqksc1YkGd dvP2qKqfo7BDZcYF4ix227Kpo4SXJHf+SVuJ6id4+3Tv6Hnt+OVIrOzex6lD6tNwbP8cHPuUxKB tN9B2rrxIDPSDFyT1SIFw/NRG+wnT7zwzhMDhE6FOzt1wbWCsF+I/YtyGCehaUCm9DqmWaVLD6r LKveV0HD X-Received: by 2002:ac2:4e01:0:b0:5a1:3bd8:a068 with SMTP id 2adb3069b0e04-5a29b99553emr1970199e87.30.1774494872551; Wed, 25 Mar 2026 20:14:32 -0700 (PDT) MIME-Version: 1.0 References: <1791665551.452444.1761209220211.ref@mail.yahoo.com> <878qbn51kb.fsf@wibble.ilmari.org> <1F13AF7E-B90A-40FB-B47C-DC38FBB08353@gmail.com> <01aac5bb-63b1-420f-b396-fdc2027fbb50@vondra.me> <688741.1774485357@sss.pgh.pa.us> In-Reply-To: From: Masahiko Sawada Date: Wed, 25 Mar 2026 20:13:55 -0700 X-Gm-Features: AQROBzCwLAAscMgn6m5o75nG6CWTmedyFrzFwxjRSnN2P8FRsyWnpwGP4AmyGTE Message-ID: Subject: Re: Add uuid_to_base32hex() and base32hex_to_uuid() built-in functions To: Tom Lane Cc: Tomas Vondra , Aleksander Alekseev , pgsql-hackers , Chao Li , =?UTF-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= , Andrey Borodin Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Mar 25, 2026 at 6:21=E2=80=AFPM Masahiko Sawada wrote: > > On Wed, Mar 25, 2026 at 6:09=E2=80=AFPM Masahiko Sawada wrote: > > > > On Wed, Mar 25, 2026 at 5:35=E2=80=AFPM Tom Lane wr= ote: > > > > > > Tomas Vondra writes: > > > > On 3/26/26 00:40, Tom Lane wrote: > > > >> I believe what's happening there is that in cs_CZ locale, > > > >> "V" doesn't follow simple ASCII sort ordering. > > > > > > > With cs_CZ all letters sort *before* numbers, while in en_US it's t= he > > > > other way around. V is not special in any way. > > > > > > Ah, sorry, I should have researched a bit instead of relying on > > > fading memory. The quirk I was thinking of is that in cs_CZ, > > > "ch" sorts after "h": > > > > > > u8=3D# select 'h' < 'ch'::text collate "en_US"; > > > ?column? > > > ---------- > > > f > > > (1 row) > > > > > > u8=3D# select 'h' < 'ch'::text collate "cs_CZ"; > > > ?column? > > > ---------- > > > t > > > (1 row) > > > > > > Regular hex encoding isn't bitten by that because it doesn't > > > use 'h' in the text form ... but this base32hex thingie does. > > > > > > However, your point is also correct: > > > > > > u8=3D# select '0' < 'C'::text ; > > > ?column? > > > ---------- > > > t > > > (1 row) > > > > > > u8=3D# select '0' < 'C'::text collate "cs_CZ"; > > > ?column? > > > ---------- > > > f > > > (1 row) > > > > > > and that breaks "text ordering matches numeric ordering" > > > for both traditional hex and base32hex. So maybe this > > > is not as big a deal as I first thought. We need a fix > > > for the new test though. Probably adding COLLATE "C" > > > would be enough. > > > > Thank you for the report and the analysis. > > > > I've reproduced the issue with "cs_CZ" collation and adding COLLATE > > "C" to the query resolves it. It seems also a good idea to add a note > > in the documentation too as users might face the same issue. For > > example, > > > > To maintain the lexicographical sort order of the encoded data, ensure > > that the text is sorted using the C collation (e.g., using COLLATE > > "C"). Natural language collations may sort characters differently and > > break the ordering. > > > > Attached the patch doing the above idea. Pushed the fix without the documentation changes to make the buildfarm animals happy first. Regards, --=20 Masahiko Sawada Amazon Web Services: https://aws.amazon.com