public inbox for [email protected]
* Collation with upper and numeric comparing in unexpected way
@ 2026-01-20 18:36 Matt Magoffin <[email protected]>
2026-01-21 07:52 ` Re: Collation with upper and numeric comparing in unexpected way Peter Eisentraut <[email protected]>
From: Matt Magoffin @ 2026-01-20 18:36 UTC
To: [email protected]
I am using Postgres 17 and trying to configure a collation that sorts upper case before lower case and includes numeric sorting:
CREATE COLLATION testsort (provider = icu, locale = 'und-u-kf-upper-kn');
These comparisons are working as I expected:
SELECT 'id-45' < 'id-123' COLLATE testsort; -- true (45 before 123)
SELECT 'id' < 'ID' COLLATE testsort; -- false (upper case before lower case)
However combining them resulted in an unexpected result:
SELECT 'id-45' < 'ID-123' COLLATE testsort; -- true
I thought that last one would be false because “ID” would come before “id”. Is there a way to configure the collation to achieve that? I’m trying to match the sorting behaviour in external application code.
Thanks for any help,
Matt
* Re: Collation with upper and numeric comparing in unexpected way
From: Peter Eisentraut @ 2026-01-21 07:52 UTC
To: Matt Magoffin <[email protected]>; [email protected]
On 20.01.26 19:36, Matt Magoffin wrote:
> I am using Postgres 17 and trying to configure a collation that sorts upper case before lower case and includes numeric sorting:
>
> CREATE COLLATION testsort (provider = icu, locale = 'und-u-kf-upper-kn');
>
> These comparisons are working as I expected:
>
> SELECT 'id-45' < 'id-123' COLLATE testsort; -- true (45 before 123)
>
> SELECT 'id' < 'ID' COLLATE testsort; -- false (upper case before lower case)
>
> However combining them resulted in an unexpected result:
>
> SELECT 'id-45' < 'ID-123' COLLATE testsort; -- true
>
> I thought that last one would be false because “ID” would come before “id”. Is there a way to configure the collation to achieve that? I’m trying to match the sorting behaviour in external application code.
I suspect that this is because the effect of the numeric sorting is a
primary difference and the case difference is only a tertiary difference.
In other words, imagine the numeric sorting pass replacing all numbers
by hypothetical letters corresponding to the numeric order, like
'id-45' -> 'id-X'
'id-123' -> 'id-Z'
'ID-123' -> 'ID-Z'
Then you would have
'id-45' < 'ID-123' =>
'id-X' < 'ID-Z'
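which would be correct: the primary difference between X and Z decides
the comparison before the tertiary case difference is ever consulted.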
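To make that guess concrete, here is a toy Python model of a two-level comparison. It is only a sketch of the idea described above, not ICU's actual algorithm: the function names `sort_key` and `less` are made up for illustration, the "primary" level is approximated by lower-casing letters and turning digit runs into integers, and the "tertiary" level is approximated by a per-character flag that puts upper case first.

```python
import re


def sort_key(s):
    """Toy two-level key (an assumption, not ICU's real implementation).

    Level 1 ("primary"): letters compared case-insensitively, digit runs
    compared by numeric value.
    Level 2 ("tertiary"): case only, with upper case sorting first.
    """
    primary = []
    for token in re.findall(r"\d+|\D", s):
        if token.isdecimal():
            # Digit run: tag 0, compared as an integer.
            primary.append((0, int(token)))
        else:
            # Any other character: tag 1, compared ignoring case.
            primary.append((1, token.lower()))
    # 0 for upper case, 1 for everything else, so upper sorts first.
    tertiary = tuple(0 if ch.isupper() else 1 for ch in s)
    # Tuples compare element by element, so the tertiary part only
    # matters when the primary parts are equal.
    return (tuple(primary), tertiary)


def less(a, b):
    """Return True if a sorts strictly before b in the toy model."""
    return sort_key(a) < sort_key(b)


print(less("id-45", "id-123"))  # True: 45 < 123 at the primary level
print(less("id", "ID"))         # False: primaries tie, upper case wins
print(less("id-45", "ID-123"))  # True: decided by 45 < 123, case never consulted
```

In this model the three results match the ones reported in the original message, which is at least consistent with the explanation that case only acts as a tie-breaker once the numeric comparison has come out equal.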
This is just my guess from the outside. The numeric sorting is not a
part of the Unicode Collation Algorithm standard, it is an extension by
ICU, so one would have to dig into the code or documentation there, but
I didn't find anything.
I don't know if there is a way to customize this further to get the
effect you want. Maybe you could reach out to an ICU support forum to
get more expert insights there.
* Re: Collation with upper and numeric comparing in unexpected way
From: Matt Magoffin @ 2026-01-21 20:53 UTC
To: Peter Eisentraut <[email protected]>; +Cc: [email protected]
On 21 Jan 2026, at 8:52 PM, Peter Eisentraut <[email protected]> wrote:
> This is just my guess from the outside. The numeric sorting is not a part of the Unicode Collation Algorithm standard, it is an extension by ICU, so one would have to dig into the code or documentation there, but I didn't find anything.
>
> I don't know if there is a way to customize this further to get the effect you want. Maybe you could reach out to an ICU support forum to get more expert insights there.
Ah, thank you for your thoughts. I will reach out to the ICU project as you suggest.