public inbox for [email protected]
* Re: LOCALE C.UTF-8 on EDB Windows v17 server
@ 2025-06-05 15:40 Dominique Devienne <[email protected]>
2025-06-05 20:57 ` Re: LOCALE C.UTF-8 on EDB Windows v17 server Daniel Verite <[email protected]>
0 siblings, 1 reply; 2+ messages in thread
From: Dominique Devienne @ 2025-06-05 15:40 UTC (permalink / raw)
To: Daniel Verite <[email protected]>; +Cc: Laurenz Albe <[email protected]>; pgsql-general
On Thu, Jun 5, 2025 at 5:01 PM Daniel Verite <[email protected]> wrote:
> Dominique Devienne wrote:
> > > locale 'C.UTF-8' or lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8'
> > > cannot work on Windows because Windows does not have a locale
> > > named C.UTF-8, whereas a Linux system does (well at least recent
> > > Linuxes. Some old Linuxes don't).
> >
> > But isn't the point of the new-in-v17 builtin provider to be system
> > independent?
>
> Yes, definitely.
>
> But suppose your database has an extension that calls locale-dependent
> code, such as strxfrm() [1] for instance.
>
> The linked MSVC doc says:
>
> "The transformation is made using the locale's LC_COLLATE category
> setting. For more information on LC_COLLATE, see setlocale. strxfrm
> uses the current locale for its locale-dependent behavior"
>
> But what will be the value in LC_COLLATE when this extension code
> is running in a database using the builtin provider?
> It's the value found in pg_database.datcollate that was specified
> when creating the database with the lc_collate or locale option.
>
> The builtin provider routines are used for code inside Postgres
> core, but code outside its perimeter can still call libc functions
> that depend on lc_collate and lc_ctype.
So you're saying datcollate and datctype from pg_database are
irrelevant to PostgreSQL itself, and only extensions might be affected?
Which implies that I shouldn't worry about those differences?
Am I reading you right? --DD
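To make the quoted point about libc-dependent code concrete, here is a minimal Python sketch. It is not PostgreSQL or extension code. It only shows that a call to libc's strxfrm() (reached here through Python's `locale.strxfrm`) follows whatever LC_COLLATE is active in the process, in the same way that extension code would follow the value taken from pg_database.datcollate. The helper name `sort_with_libc` and the sample words are hypothetical, and locale names other than "C" are OS-specific and may be missing on a given system.

```python
import locale


def sort_with_libc(words, lc_collate):
    """Sort words with libc's strxfrm() under the given LC_COLLATE.

    Hypothetical helper for illustration only. lc_collate is a locale
    name such as "C"; other names may not exist on every OS.
    """
    # Calling setlocale() with only a category queries the current setting.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, lc_collate)
        # strxfrm() transforms each string according to LC_COLLATE.
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore the process-wide setting we found on entry.
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    words = ["b", "a", "B", "A"]
    print("C", sort_with_libc(words, "C"))
    # These names are assumptions; availability differs between systems.
    for name in ("C.UTF-8", "en_US.UTF-8"):
        try:
            print(name, sort_with_libc(words, name))
        except locale.Error:
            print(name, "is not available on this system")
```

Under the "C" locale the result is plain code point order. Under any other locale the order is whatever the operating system's locale data defines, which is why the same call can behave differently on Windows and Linux.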
* Re: LOCALE C.UTF-8 on EDB Windows v17 server
2025-06-05 15:40 Re: LOCALE C.UTF-8 on EDB Windows v17 server Dominique Devienne <[email protected]>
@ 2025-06-05 20:57 ` Daniel Verite <[email protected]>
0 siblings, 0 replies; 2+ messages in thread
From: Daniel Verite @ 2025-06-05 20:57 UTC (permalink / raw)
To: Dominique Devienne <[email protected]>; +Cc: Laurenz Albe <[email protected]>; pgsql-general
Dominique Devienne wrote:
> So you're saying datcollate and datctype from pg_database are
> irrelevant to PostgreSQL itself, and only extensions might be affected?
Almost. One exception that still exists in v18, as far as I can see [1],
is the default full text search parser, which still uses libc functions
such as iswdigit(), iswpunct(), and iswspace() that depend on LC_CTYPE.
So you could see differences between OSes in tsvector contents
in a database using the builtin provider.
You can avoid that by setting LC_CTYPE=C, but then the parsing is suboptimal,
since the parser does not recognize non-ASCII Unicode punctuation or space
characters as such.
Personally, I would still take care to set LC_CTYPE to a reasonable UTF-8 locale
with v17 or v18.
[1]
https://doxygen.postgresql.org/wparser__def_8c.html#a420ea398a8a11db92412a2af7bf45e40
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
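As a rough illustration of the LC_CTYPE dependency described above, the following Python sketch calls the same libc classification functions (iswdigit(), iswpunct(), iswspace()) through ctypes. It is not the PostgreSQL parser. It assumes a Unix-like system where `ctypes.CDLL(None)` exposes libc symbols and where wint_t fits in an unsigned int; the helper name `classify` and the sample characters are hypothetical.

```python
import ctypes
import locale

# Assumption: Unix-like system; CDLL(None) gives access to libc symbols.
libc = ctypes.CDLL(None)
for _name in ("iswdigit", "iswpunct", "iswspace"):
    _fn = getattr(libc, _name)
    _fn.argtypes = [ctypes.c_uint]  # assumed representation of wint_t
    _fn.restype = ctypes.c_int


def classify(ch, lc_ctype="C"):
    """Classify one character with libc under the given LC_CTYPE.

    Hypothetical helper for illustration only.
    """
    previous = locale.setlocale(locale.LC_CTYPE)  # query current setting
    try:
        locale.setlocale(locale.LC_CTYPE, lc_ctype)
        code = ord(ch)
        return {
            "digit": bool(libc.iswdigit(code)),
            "punct": bool(libc.iswpunct(code)),
            "space": bool(libc.iswspace(code)),
        }
    finally:
        locale.setlocale(locale.LC_CTYPE, previous)  # restore on exit


if __name__ == "__main__":
    # ASCII samples plus a no-break space and an em dash.
    samples = [" ", "5", "!", "\u00a0", "\u2014"]
    # "C.UTF-8" is an assumption; it may not exist on this system.
    for name in ("C", "C.UTF-8"):
        for ch in samples:
            try:
                print(name, repr(ch), classify(ch, name))
            except locale.Error:
                print(name, "is not available on this system")
                break
```

The ASCII samples classify the same way under any locale. The results for the non-ASCII samples depend on the locale data the operating system ships, which is the source of the possible cross-OS differences in tsvector contents.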