public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jeff Davis <[email protected]>
To: Andreas Karlsson <[email protected]>
To: pgsql-hackers <[email protected]>
Subject: Re: Speed up ICU case conversion by using ucasemap_utf8To*()
Date: Fri, 20 Dec 2024 11:24:04 -0800
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>

On Fri, 2024-12-20 at 06:20 +0100, Andreas Karlsson wrote:
> SELECT count(upper) FROM (SELECT upper(('Kålhuvud ' || i) COLLATE 
> "sv-SE-x-icu") FROM generate_series(1, 1000000) i);
> 
> master:  ~540 ms
> Patched: ~460 ms
> glibc:   ~410 ms

It looks like you are opening and closing the UCaseMap object each
time. Why not save it in pg_locale_t? That should speed it up even more
and hopefully beat libc.


Also, to support older ICU versions consistently, we need to fix up the
locale name to support "und"; cf. pg_ucol_open(). Perhaps factor out
that logic?

Regards,
	Jeff Davis







view thread (10+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected]
  Subject: Re: Speed up ICU case conversion by using ucasemap_utf8To*()
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox