Date: Wed, 18 Mar 2026 10:20:40 -0400
From: Andres Freund
To: Peter Eisentraut
Cc: pgsql-hackers
Subject: Re: Unicode update and some tooling improvements
References: <2a668979-ed92-49a3-abf9-a3ec2d460ec2@eisentraut.org>
In-Reply-To: <2a668979-ed92-49a3-abf9-a3ec2d460ec2@eisentraut.org>

Hi,

On 2026-02-26 21:36:08 +0100, Peter Eisentraut wrote:
> This is the annual update of the Unicode data. I also worked a bit on the
> tooling. The update-unicode target under meson did not update the data in
> contrib/unaccent/, so I added that. I also fixed a Python deprecation
> warning in the generation script and made some light changes in the
> surrounding documentation.

> From ef15b16dcef7a3868fc37744d201bb233f8271bd Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut
> Date: Thu, 26 Feb 2026 11:36:27 +0100
> Subject: [PATCH 3/6] Implement unaccent Unicode data update in meson
>
> The meson/ninja update-unicode target did not cover the required
> updates in contrib/unaccent/. This is fixed now.

Makes sense.

> +# Download CLDR files on demand.
> +
> +cldr_baseurl = 'https://raw.githubusercontent.com/unicode-org/cldr/release-@0@/common/transforms/@1@'

Hm. I take it the relevant contents aren't available on unicode.org, which we
use in src/common/unicode? We reference githubusercontent.com in the Makefile
too, but somehow that feels a bit weird.

> +if not wget.found() or not cp.found()
> +  subdir_done()
> +endif
> +
> +foreach f : ['Latin-ASCII.xml']
> +  # XXX .replace requires meson 0.58
> +  url = cldr_baseurl.format(CLDR_VERSION.replace('.', '-'), f)

I think this could be replaced with something like
  '-'.join(CLDR_VERSION.split('.'))
for < 0.58 compat. But I'm also ok with going to 0.58.

> From 20d5a665f72b3ddde8bfdf06b94d218da0dc2d09 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut
> Date: Thu, 26 Feb 2026 11:38:16 +0100
> Subject: [PATCH 4/6] Update RELEASE_CHANGES
>
> The existing instructions did not cover meson. Point to
> src/common/unicode/README instead, where there is more information.

LGTM.
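To illustrate the .replace vs. split/join point on patch 3: a small Python
sketch (not meson, and not part of the patch) showing that the two spellings
build the same download URL. The function names are made up for this example;
only the URL template and the file name come from the quoted hunk, with
meson's @0@/@1@ placeholders rewritten as Python format fields.

```python
# Illustration only: Python stand-in for the meson string handling above.
# The helper names are hypothetical; the URL layout is from the quoted patch.
CLDR_BASEURL = (
    "https://raw.githubusercontent.com/unicode-org/cldr/"
    "release-{0}/common/transforms/{1}"
)


def cldr_url_replace(version: str, filename: str) -> str:
    """Build the URL the way the patch does, via a string replace."""
    return CLDR_BASEURL.format(version.replace(".", "-"), filename)


def cldr_url_split_join(version: str, filename: str) -> str:
    """Build the same URL using only split and join."""
    return CLDR_BASEURL.format("-".join(version.split(".")), filename)


if __name__ == "__main__":
    for ver in ("48", "48.1"):
        a = cldr_url_replace(ver, "Latin-ASCII.xml")
        b = cldr_url_split_join(ver, "Latin-ASCII.xml")
        print(ver, a == b, a)
```

Both variants also behave the same for a version string without a dot, since
splitting then yields a single element and the join is a no-op.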
> From 868e269b518daf0d3d288e2e379d5fd3ad215f49 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut
> Date: Thu, 26 Feb 2026 10:25:48 +0100
> Subject: [PATCH 5/6] Update Unicode data to CLDR 48.1
>
> No actual changes result.
>
> XXX should change that to CLDR 49 in April

48.2 has been released from what I can tell. LGTM otherwise.

> From dd4b5ced419b319c24fa0928180e54d7317e1690 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut
> Date: Thu, 26 Feb 2026 11:38:51 +0100
> Subject: [PATCH 6/6] Update Unicode data to Unicode 17.0.0

Looks like 18 is out, any reason to not go straight to that?

> diff --git a/src/Makefile.global.in b/src/Makefile.global.in
> index 7d65e428607..b99116a9ef8 100644
> --- a/src/Makefile.global.in
> +++ b/src/Makefile.global.in
> @@ -376,7 +376,7 @@ DOWNLOAD = wget -O $@ --no-use-server-timestamps
>  # Pick a release from here: . Note
>  # that the most recent release listed there is often a pre-release;
>  # don't pick that one, except for testing.
> -UNICODE_VERSION = 16.0.0
> +UNICODE_VERSION = 17.0.0

I wonder if we should, in a separate change, put UNICODE_VERSION and
CLDR_VERSION in dedicated files (probably just named
UNICODE_VERSION/CLDR_VERSION) that could then be shared by meson & make.

Greetings,

Andres Freund
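A rough sketch of the dedicated-version-file idea above, in Python only to
show the intended file format; it is not a proposal for how meson or make
would actually read it. The assumptions here are mine: one version string per
file, optional '#' comment lines, and a helper named read_version that does
not exist anywhere in the tree.

```python
# Illustration only: a hypothetical reader for a one-line version file such
# as UNICODE_VERSION or CLDR_VERSION. File format and helper are assumptions.
from pathlib import Path


def read_version(path: Path) -> str:
    """Return the single version string stored in a small text file.

    Blank lines and lines starting with '#' are ignored, so the file can
    carry a short comment about where to pick releases from.
    """
    lines = (ln.strip() for ln in path.read_text(encoding="ascii").splitlines())
    values = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(values) != 1:
        raise ValueError(
            f"{path}: expected exactly one version line, found {len(values)}"
        )
    return values[0]


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        f = Path(tmp) / "UNICODE_VERSION"
        f.write_text("# do not pick pre-releases\n17.0.0\n", encoding="ascii")
        print(read_version(f))
```

Keeping the file to a single bare value would let both build systems consume
it with trivial logic, which seems to be the point of sharing it.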