From: Andres Freund <[email protected]>
To: Peter Eisentraut <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Unicode update and some tooling improvements
Date: Wed, 18 Mar 2026 10:20:40 -0400
Message-ID: <lmj5ju4omjr3iswibu477ybipljzzbe4pmnp3oa2rs5gxzanmb@eph27jugatgg> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
Hi,
On 2026-02-26 21:36:08 +0100, Peter Eisentraut wrote:
> This is the annual update of the Unicode data. I also worked a bit on the
> tooling. The update-unicode target under meson did not update the data in
> contrib/unaccent/, so I added that. I also fixed a Python deprecation
> warning in the generation script and made some light changes in the
> surrounding documentation.
> From ef15b16dcef7a3868fc37744d201bb233f8271bd Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut <[email protected]>
> Date: Thu, 26 Feb 2026 11:36:27 +0100
> Subject: [PATCH 3/6] Implement unaccent Unicode data update in meson
>
> The meson/ninja update-unicode target did not cover the required
> updates in contrib/unaccent/. This is fixed now.
Makes sense.
> +# Download CLDR files on demand.
> +
> +cldr_baseurl = 'https://raw.githubusercontent.com/unicode-org/cldr/release-@0@/common/transforms/@1@';
Hm. I take it the relevant contents aren't available on unicode.org, which we
use in src/common/unicode?
We reference githubusercontent.com in Makefile too, but somehow that feels a
bit weird.
> +if not wget.found() or not cp.found()
> + subdir_done()
> +endif
> +
> +foreach f : ['Latin-ASCII.xml']
> + # XXX .replace requires meson 0.58
> + url = cldr_baseurl.format(CLDR_VERSION.replace('.', '-'), f)
I think this could be replaced with something like
CLDR_VERSION.split('.').join('-')
for < 0.58 compat. But I'm also ok with going to 0.58.
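A minimal sketch of the equivalence, written in Python only so that it can be run standalone. The function names are hypothetical; the URL template mirrors cldr_baseurl from the patch. In Meson, join() is a string method rather than a list method, so the actual spelling would likely be closer to '-'.join(CLDR_VERSION.split('.')).

```python
# URL template mirroring cldr_baseurl from the patch; {0} is the dashed
# CLDR version, {1} is the file name under common/transforms/.
CLDR_BASEURL = (
    "https://raw.githubusercontent.com/unicode-org/cldr/"
    "release-{0}/common/transforms/{1}"
)


def dashed_with_replace(version: str) -> str:
    """Turn '48.1' into '48-1' with replace (the form needing meson 0.58)."""
    return version.replace(".", "-")


def dashed_with_split_join(version: str) -> str:
    """Same result using only split and join."""
    return "-".join(version.split("."))


def cldr_url(version: str, filename: str) -> str:
    """Build the download URL for one CLDR transform file."""
    return CLDR_BASEURL.format(dashed_with_split_join(version), filename)
```

Both helpers give the same result for any dotted version string, since splitting on '.' and rejoining with '-' substitutes every separator exactly once.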
> From 20d5a665f72b3ddde8bfdf06b94d218da0dc2d09 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut <[email protected]>
> Date: Thu, 26 Feb 2026 11:38:16 +0100
> Subject: [PATCH 4/6] Update RELEASE_CHANGES
>
> The existing instructions did not cover meson. Point to
> src/common/unicode/README instead, where there is more information.
LGTM.
> From 868e269b518daf0d3d288e2e379d5fd3ad215f49 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut <[email protected]>
> Date: Thu, 26 Feb 2026 10:25:48 +0100
> Subject: [PATCH 5/6] Update Unicode data to CLDR 48.1
>
> No actual changes result.
>
> XXX should change that to CLDR 49 in April
48.2 has been released from what I can tell.
LGTM otherwise.
> From dd4b5ced419b319c24fa0928180e54d7317e1690 Mon Sep 17 00:00:00 2001
> From: Peter Eisentraut <[email protected]>
> Date: Thu, 26 Feb 2026 11:38:51 +0100
> Subject: [PATCH 6/6] Update Unicode data to Unicode 17.0.0
Looks like 18 is out, any reason to not go straight to that?
> diff --git a/src/Makefile.global.in b/src/Makefile.global.in
> index 7d65e428607..b99116a9ef8 100644
> --- a/src/Makefile.global.in
> +++ b/src/Makefile.global.in
> @@ -376,7 +376,7 @@ DOWNLOAD = wget -O $@ --no-use-server-timestamps
> # Pick a release from here: <https://www.unicode.org/Public/>. Note
> # that the most recent release listed there is often a pre-release;
> # don't pick that one, except for testing.
> -UNICODE_VERSION = 16.0.0
> +UNICODE_VERSION = 17.0.0
Wonder if we, in a separate change, should put UNICODE_VERSION and
CLDR_VERSION in dedicated files (probably just named
UNICODE_VERSION/CLDR_VERSION) that then could be shared by meson & make.
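As a rough illustration of the shared-file idea, here is a sketch of a helper that a generation script could use to read such a file. Everything here is an assumption: the one-value-per-file layout, the tolerance for '#' comment lines, and the name read_version are hypothetical and not part of the patch set.

```python
from pathlib import Path


def read_version(path):
    """Return the first non-blank, non-comment line of a version file.

    Assumes a hypothetical layout in which the file holds a single
    version string such as '17.0.0', optionally preceded by '#' comments.
    """
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    raise ValueError(f"no version found in {path}")
```

Keeping the file to a single bare value would also make it easy to consume from make and meson without any extra parsing.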
Greetings,
Andres Freund