From: David G. Johnston <[email protected]>
To: [email protected]
To: Pg Docs <[email protected]>
Subject: Re: Does the POSITION() function takes into account the COLLATION... or not ?!?
Date: Wed, 16 Feb 2022 09:10:08 -0700
Message-ID: <CAKFQuwavkvERHKc=-Ws=1YEXy68j4GpJx0XbUwbKbY-=hKs7zQ@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
On Tue, Feb 15, 2022 at 11:17 AM PG Doc comments form <[email protected]> wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/14/functions-string.html
> Description:
>
>
> SELECT POSITION(('å' COLLATE "en_US.utf8") IN 'yeah'); -- should return 3
> instead of 0 !?!
> SELECT POSITION(('o' COLLATE "en_US.utf8") IN 'ångström'); -- should return
> 7 instead of 0 !?!
>
> ==> up to here, this seems pretty enough to conclude that POSITION()
> doesn't care at all about COLLATION and always perform a byte search.
>
IIUC "performing a byte search" is what happens when you use a
deterministic collation, where the only test is for equality. So your
examples are not useful in distinguishing whether a collation-agnostic byte
search or a deterministic collation search is happening.
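
To make that point concrete, here is a minimal sketch. It assumes a
UTF8-encoded database and that both the "C" and "en_US.utf8" collations
exist on the platform (neither assumption comes from the original report
beyond its use of "en_US.utf8"). Both collations are deterministic, so
position() should give the same answers under either one:

```sql
-- 'å' does not occur in 'yeah' as a byte sequence, so neither
-- deterministic collation can produce a match here.
SELECT position(('å' COLLATE "C") IN 'yeah');
SELECT position(('å' COLLATE "en_US.utf8") IN 'yeah');

-- An exact match is found under both collations; the result is a
-- character position, not a byte offset.
SELECT position(('ö' COLLATE "C") IN 'ångström');
SELECT position(('ö' COLLATE "en_US.utf8") IN 'ångström');
```

Since the two collations agree on every query, results like these cannot
show whether the collation is consulted at all.
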
> I would like to have something in the doc about that... i.e. either some
> examples showing how the COLLATION is impacting the results of the
> POSITION() function
I don't disagree, but the lack of documentation regarding string functions
and collations is not limited to the position function. The word
"collation" doesn't appear anywhere on the string functions page, which I
find potentially worth changing.
How collations behave is documented, in particular:
"A collation is either deterministic or nondeterministic. A deterministic
collation uses deterministic comparisons, which means that it considers
strings to be equal only if they consist of the same byte sequence."
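
For contrast, a sketch of the nondeterministic case follows. It assumes a
server built with ICU support, and the collation name ignore_accents is
only an illustrative choice. The last query is expected to raise an error
on PostgreSQL 14, which as far as I know rejects nondeterministic
collations for substring searches; that is an assumption worth checking
against the version in use.

```sql
-- Hypothetical collation that ignores accents and case differences.
CREATE COLLATION ignore_accents (
    provider = icu,
    locale = 'und-u-ks-level1-kc-true',
    deterministic = false
);

-- The byte sequences differ, but this collation treats them as equal.
SELECT 'å' = 'a' COLLATE ignore_accents;

-- The test that would actually distinguish the two behaviors.
-- Assumption: on PostgreSQL 14 this errors out instead of returning
-- a position; other versions may behave differently.
SELECT position(('å' COLLATE ignore_accents) IN 'yeah');
```
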
I'll admit I'm definitely an unfavorably biased observer here and don't
deal with these nuances on a daily basis. I have the general impression
that our documentation is correct and sufficient but could be made more
user-friendly. Updating a single function doesn't do that, though, and in
some ways makes things worse when other related elements, and the
presentation of the material as a whole, don't go along with that single
change.

Based on this, and the above observation about your test cases, I
don't see much motivation for change here. The effort seems to outweigh
the reward. But for someone who feels differently and submits a patch,
there is, IMO, room enough for improvement that a well-written one is
likely to be welcomed.
David J.