From: JānisE <[email protected]>
To: [email protected] <[email protected]>
Subject: Sorting by respecting diacritics/accents
Date: Fri, 25 Jul 2025 13:05:17 +0300 (EEST)
Message-ID: <1814742357.238678.1753437917809@w8>

Hello!

I can't seem to get PostgreSQL to sort rows by a string column in a way that respects diacritics. I read [1] that it's possible to define a custom collation with the collation strength "ks" set to "level2", which would make it accent-sensitive. However, when I actually sort using that collation, the order seems to be accent-insensitive.

For example:

    CREATE TABLE test (string text);
    INSERT INTO test VALUES ('bar'), ('bat'), ('bär');

    CREATE COLLATION "und1" (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
    CREATE COLLATION "und2" (provider = icu, deterministic = false, locale = 'und-u-ks-level2');
    CREATE COLLATION "und3" (provider = icu, deterministic = false, locale = 'und-u-ks-level3');

    SELECT * FROM test ORDER BY string collate "und1";
    SELECT * FROM test ORDER BY string collate "und2";
    SELECT * FROM test ORDER BY string collate "und3";

All three collations give me the same order: bar < bär < bat, although I would expect an accent-sensitive order to be bar < bat < bär.

The following query does drop "bär" from the result, which means the strength levels do have some effect on DISTINCT:

    SELECT DISTINCT string COLLATE "und1" FROM test;

But they don't seem to have any effect on ORDER BY.

Am I misunderstanding what collations can do? Is there a way to actually get an accent-sensitive order?

Also, is there a way to see which options the default built-in collations use? For example, I don't see the "ks" level anywhere in the "pg_collation" table data.

Best regards,
Janis

[1] https://www.postgresql.org/docs/current/collation.html#ICU-COLLATION-COMPARISON-LEVELS
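
As a point of comparison, here is a toy Python model of multi-level comparison. It is only an illustrative assumption about how strength levels might interact, not ICU's actual algorithm and not PostgreSQL code; the function name sort_key and the use of Unicode NFD decomposition to separate base letters from accents are choices made for this sketch. It compares base letters first and looks at accents only to break ties, which gives the same bar < bär < bat order reported above while still treating "bar" and "bär" as equal at level 1. Case (level 3) is not modeled.

```python
import unicodedata


def sort_key(text, strength):
    """Toy multi-level sort key (hypothetical helper, not ICU's algorithm).

    Level 1 keeps only the base letters; level 2 appends the accents
    attached to each base letter as a tie-breaker.
    """
    primary = []
    secondary = []
    # NFD splits "ä" into "a" followed by a combining diaeresis.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if secondary:
                # Attach the accent to the preceding base letter.
                secondary[-1] += (ch,)
        else:
            primary.append(ch.casefold())
            secondary.append(())
    key = (tuple(primary),)
    if strength >= 2:
        key += (tuple(secondary),)
    return key


words = ["bar", "bat", "bär"]

# Accents only break ties between strings with identical base letters,
# so "bär" lands next to "bar" instead of after "bat".
print(sorted(words, key=lambda w: sort_key(w, 2)))

# At level 1 "bar" and "bär" share a key, so they collapse when deduplicated.
print(len({sort_key(w, 1) for w in words}))
print(len({sort_key(w, 2) for w in words}))
```

In this model the strength level changes which strings count as equal (which is what DISTINCT reacts to), but it never lets an accent difference outrank a difference in base letters such as "r" versus "t".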
