From: Peter Eisentraut <[email protected]>
To: Daniel Verite <[email protected]>
To: Todd Lang <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Supporting non-deterministic collations with tailoring rules.
Date: Thu, 12 Mar 2026 11:00:32 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>

On 24.09.25 12:17, Daniel Verite wrote:
> To me, the most plausible fix on the Postgres side would be to pass
> UCOL_DEFAULT instead of UCOL_DEFAULT_STRENGTH as in the attached,
> which lets the user specify the strength in the rule, as the OP did in [1].

With this change applied, I don't see that the bug reported in ICU-22456
is fixed.  See my attached test case.

What change in behavior do you expect from your patch?  Should there
be test cases?

From 2588c3eb6ff4acdf2aaeba0ebe8b026b811d7755 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <[email protected]>
Date: Thu, 12 Mar 2026 10:58:12 +0100
Subject: [PATCH] Test for: collation customization with rules loses attributes
 of original collation

---
 src/backend/utils/adt/pg_locale_icu.c     | 2 +-
 src/test/regress/sql/collate.icu.utf8.sql | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/pg_locale_icu.c b/src/backend/utils/adt/pg_locale_icu.c
index 352b4c3885f..5ad05fcd016 100644
--- a/src/backend/utils/adt/pg_locale_icu.c
+++ b/src/backend/utils/adt/pg_locale_icu.c
@@ -587,7 +587,7 @@ make_icu_collator(const char *iculocstr, const char *icurules)
 
 		status = U_ZERO_ERROR;
 		collator_all_rules = ucol_openRules(all_rules, u_strlen(all_rules),
-											UCOL_DEFAULT, UCOL_DEFAULT_STRENGTH,
+											UCOL_DEFAULT, UCOL_DEFAULT,
 											NULL, &status);
 		if (U_FAILURE(status))
 		{
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index b6c54503d21..929e9603089 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -513,6 +513,12 @@ CREATE TABLE test7 (a text);
 
 CREATE COLLATION testcoll_rulesx (provider = icu, locale = '', rules = '!!wrong!!');
 
+-- WIP https://unicode-org.atlassian.net/browse/ICU-22456
+CREATE COLLATION testcoll_rules2 (provider = icu, locale = 'und-u-ka-shifted-ks-level1', deterministic = false, rules = '');
+CREATE COLLATION testcoll_rules2nr (provider = icu, locale = 'und-u-ka-shifted-ks-level1', deterministic = false);  -- the same without rules
+SELECT 'ab' = 'a b' COLLATE testcoll_rules2nr;  -- true
+SELECT 'ab' = 'a b' COLLATE testcoll_rules2;  -- FIXME should be true
+
 
 -- nondeterministic collations
 
-- 
2.53.0



Attachments:

  [text/plain] nocfbot-0001-Test-for-collation-customization-with-rules-loses-at.patch (1.8K, 2-nocfbot-0001-Test-for-collation-customization-with-rules-loses-at.patch)

