From: Daniel Verite <[email protected]>
To: Dominique Devienne <[email protected]>
Cc: Laurenz Albe <[email protected]>
Cc: [email protected]
Subject: Re: LOCALE C.UTF-8 on EDB Windows v17 server
Date: Thu, 05 Jun 2025 22:57:24 +0200
Message-ID: <[email protected]>
In-Reply-To: <CAFCRh-8KoLDppV0K21rLZc=dR9_hgvzgoYf=g5PszbARKZ9RjQ@mail.gmail.com>

	Dominique Devienne wrote:

> So you're saying datcollate and datctype from pg_database are
> irrelevant to PostgreSQL itself, and only extensions might be affected?

Almost. One exception that still exists in v18, as far as I can see [1],
is the default full text search parser, which still uses libc functions
such as iswdigit(), iswpunct(), and iswspace() that depend on LC_CTYPE.

So you could see differences between OSes in tsvector contents
in a database with the builtin provider, unless it uses LC_CTYPE=C.
But then the parsing is suboptimal, since the parser does not recognize
fancy Unicode punctuation signs or spaces as such.
Personally, I would still take care to set LC_CTYPE to a reasonable
UTF-8 locale with v17 or v18.
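
To make the LC_CTYPE=C drawback concrete, here is a minimal Python sketch.
It does not call libc or PostgreSQL. It only models the assumption that
with LC_CTYPE=C the classification functions recognize ASCII characters
alone, and contrasts that with a Unicode-aware classification. All the
function names are hypothetical, and what a real libc returns depends on
the OS and locale.

```python
import string
import unicodedata

ASCII_SPACES = " \t\n\v\f\r"


def is_space_c_locale(ch):
    # Assumed model of iswspace() under LC_CTYPE=C: ASCII whitespace only.
    return len(ch) == 1 and ch in ASCII_SPACES


def is_punct_c_locale(ch):
    # Assumed model of iswpunct() under LC_CTYPE=C: ASCII punctuation only.
    return len(ch) == 1 and ch in string.punctuation


def is_space_unicode(ch):
    # Unicode-aware stand-in: ASCII whitespace plus space separators (Zs).
    return is_space_c_locale(ch) or unicodedata.category(ch) == "Zs"


def is_punct_unicode(ch):
    # Unicode-aware stand-in: ASCII punctuation plus any P* category.
    return is_punct_c_locale(ch) or unicodedata.category(ch).startswith("P")


def split_words(text, is_space, is_punct):
    # Toy splitter (not the PostgreSQL parser): break on spaces/punctuation.
    words, current = [], []
    for ch in text:
        if is_space(ch) or is_punct(ch):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# "full", EM SPACE (U+2003), "text", EM DASH (U+2014), "search"
sample = "full\u2003text\u2014search"

c_words = split_words(sample, is_space_c_locale, is_punct_c_locale)
u_words = split_words(sample, is_space_unicode, is_punct_unicode)

print(len(c_words), c_words)  # one undivided token in the ASCII-only model
print(len(u_words), u_words)  # three words in the Unicode-aware model
```

In the ASCII-only model the whole string stays a single token, which is
the kind of suboptimal parsing I mean above.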

[1]
https://doxygen.postgresql.org/wparser__def_8c.html#a420ea398a8a11db92412a2af7bf45e40

Best regards,
-- 
Daniel Vérité 
https://postgresql.verite.pro/





