From: Robert Haas <[email protected]>
To: Peter Eisentraut <[email protected]>
Cc: Daniel Verite <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Support LIKE with nondeterministic collations
Date: Thu, 2 May 2024 20:11:48 -0400
Message-ID: <CA+TgmoYntBJZnnZrRBXMgbXEXrp0Bm8yje4VAjE5LX6aXwj9_w@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

On Thu, May 2, 2024 at 9:38 AM Peter Eisentraut <[email protected]> wrote:
> On 30.04.24 14:39, Daniel Verite wrote:
> >    postgres=# SELECT '.foo.' like '_oo' COLLATE ign_punct;
> >     ?column?
> >    ----------
> >     f
> >    (1 row)
> >
> > The first two results look fine, but the next one is inconsistent.
>
> This is correct, because '_' means "any single character".  This is
> independent of the collation.

Seems really counterintuitive. I had to think for a long time to be
able to guess what was happening here. Finally I came up with this
guess:

If the collation-aware matching of the pattern foo tries to match up f
with the initial period, the period is skipped and the f matches f. But
with the pattern _oo, the wildcard is matched to the initial period,
which uses up the wildcard, and then we're left trying to match o with
f, which doesn't work.

Is that right?

It'd probably be good to use something like this as an example in the
documentation. My intuition is that if foo matches a string, then _oo,
f_o, and fo_ should also match that string. Apparently that's not the
case, but I doubt I'll be the last one who thinks it should be.
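
To make the guess above concrete, here is a minimal toy model in
Python. It is not PostgreSQL's implementation: the names toy_like,
coll_key, and IGNORABLE are hypothetical, the "collation" simply drops
ASCII punctuation, escapes are not handled, and text left over after
the pattern is exhausted is treated as a non-match. Under those
assumptions, a literal run in the pattern may match any stretch of text
that compares equal to it, while '_' always consumes exactly one
character, punctuation included.

```python
import string

# Assumption: a toy stand-in for a collation that ignores punctuation.
IGNORABLE = set(string.punctuation)


def coll_key(s):
    """Drop ignorable characters; equal keys mean 'equal under the collation'."""
    return "".join(ch for ch in s if ch not in IGNORABLE)


def coll_equal(a, b):
    return coll_key(a) == coll_key(b)


def tokenize(pattern):
    """Split a LIKE pattern into literal runs and the wildcards '_' and '%'."""
    tokens, literal = [], ""
    for ch in pattern:
        if ch in "_%":
            if literal:
                tokens.append(literal)
                literal = ""
            tokens.append(ch)
        else:
            literal += ch
    if literal:
        tokens.append(literal)
    return tokens


def toy_like(text, pattern):
    tokens = tokenize(pattern)

    def match(ti, pi):
        if pi == len(tokens):
            # Assumption: the whole text must be consumed.
            return ti == len(text)
        tok = tokens[pi]
        if tok == "_":
            # Exactly one character, regardless of the collation.
            return ti < len(text) and match(ti + 1, pi + 1)
        if tok == "%":
            return any(match(k, pi + 1) for k in range(ti, len(text) + 1))
        # Literal run: try every stretch of text that compares equal to it.
        return any(
            coll_equal(text[ti:end], tok) and match(end, pi + 1)
            for end in range(ti, len(text) + 1)
        )

    return match(0, 0)


print(toy_like(".foo.", "foo"))  # True
print(toy_like(".foo.", "_oo"))  # False
```

In this model the pattern foo succeeds because the literal run can
absorb both periods, while _oo fails because the wildcard is spent on
the leading period and the literal oo then has to compare equal to some
stretch starting at f.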

-- 
Robert Haas
EDB: http://www.enterprisedb.com





