From: Robert Haas <[email protected]>
To: Peter Eisentraut <[email protected]>
Cc: Daniel Verite <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Support LIKE with nondeterministic collations
Date: Fri, 3 May 2024 09:20:52 -0400
Message-ID: <CA+TgmoaKMO-PcqsdJj9246riXz1DNB+Bcpfww2P4y8SH+W27iA@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<CA+TgmoYntBJZnnZrRBXMgbXEXrp0Bm8yje4VAjE5LX6aXwj9_w@mail.gmail.com>
	<[email protected]>

On Fri, May 3, 2024 at 4:52 AM Peter Eisentraut <[email protected]> wrote:
> What the implementation does is, it walks through the pattern.  It sees
> '_', so it steps over one character in the input string, which is '.'
> here.  Then we have 'foo.' left to match in the input string.  Then it
> takes from the pattern the next substring up to but not including either
> a wildcard character or the end of the string, which is 'oo', and then
> it checks if a prefix of the remaining input string can be found that is
> "equal to" 'oo'.  So here it would try in turn
>
>      ''     = 'oo' collate ign_punct ?
>      'f'    = 'oo' collate ign_punct ?
>      'fo'   = 'oo' collate ign_punct ?
>      'foo'  = 'oo' collate ign_punct ?
>      'foo.' = 'oo' collate ign_punct ?
>
> and they all fail, so the match fails.

Interesting. Does that imply that these matches are slower than normal ones?
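
To make the cost question concrete, here is a rough Python sketch of the
walk described above. It is only an illustration, not the patch's code:
ign_punct_key, coll_equal and like_match are made-up names, and the
collation is faked by stripping ASCII punctuation rather than calling ICU.
Treating leftover input at the end of the pattern as a match when it
compares equal to the empty string is also an assumption of the sketch.

```python
import string


def ign_punct_key(s):
    # Assumption: stand-in for the ign_punct collation; drops ASCII punctuation.
    return "".join(ch for ch in s if ch not in string.punctuation)


def coll_equal(a, b):
    # Two strings are "equal" when their punctuation-free forms are identical.
    return ign_punct_key(a) == ign_punct_key(b)


def like_match(text, pattern, trace=None):
    """Match text against a LIKE pattern using '_' and '%' (no escapes)."""
    if pattern == "":
        # Sketch choice: leftover input must be equal to the empty string.
        return coll_equal(text, "")
    if pattern[0] == "_":
        # '_' steps over exactly one character of the input.
        return text != "" and like_match(text[1:], pattern[1:], trace)
    if pattern[0] == "%":
        # '%' may absorb any number of characters.
        return any(like_match(text[i:], pattern[1:], trace)
                   for i in range(len(text) + 1))
    # Take the literal run up to the next wildcard or the end of the pattern.
    j = 0
    while j < len(pattern) and pattern[j] not in "_%":
        j += 1
    literal, rest = pattern[:j], pattern[j:]
    # Try every prefix of the remaining input against the literal run.
    for i in range(len(text) + 1):
        if trace is not None:
            trace.append((text[:i], literal))
        if coll_equal(text[:i], literal) and like_match(text[i:], rest, trace):
            return True
    return False


trace = []
result = like_match(".foo.", "_oo", trace)
```

In this sketch a literal segment costs one equality test per candidate
prefix, so trace for '.foo.' LIKE '_oo' holds the same five comparisons
listed above. Whether the real implementation pays a comparable cost is
exactly what I'm asking.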

> The second definition would satisfy the expectation here, because then
> '.f' matches '_' because '.f' is equal to some string of length one,
> such as 'f'.  (And then 'oo.' matches 'oo' for the rest of the pattern.)
> However, off the top of my head, this definition has three flaws: (1)
> It would make the single-character wildcard effectively an
> any-number-of-characters wildcard, but only in some circumstances, which
> could be confusing, (2) it would be difficult to compute, because you'd
> have to check equality against all possible single-character strings,
> and (3) it is not what the SQL standard says.

Right, those are good arguments.
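
For what it's worth, a rough Python sketch of that second definition shows
where the cost comes from. Everything here is hypothetical: the collation
is faked by stripping ASCII punctuation, ALPHABET is a small finite
stand-in for "all possible single-character strings", and the function
names are invented for illustration. '%' is left out to keep it short.

```python
import string

# Assumption: a tiny alphabet standing in for every single-character string.
ALPHABET = string.ascii_lowercase + string.punctuation


def ign_punct_key(s):
    # Assumption: stand-in for the ign_punct collation; drops ASCII punctuation.
    return "".join(ch for ch in s if ch not in string.punctuation)


def matches_underscore(chunk, alphabet=ALPHABET):
    # '_' matches a chunk if the chunk equals SOME one-character string.
    return any(ign_punct_key(chunk) == ign_punct_key(c) for c in alphabet)


def like_match_v2(text, pattern):
    """Second definition: '_' matches anything equal to a 1-char string."""
    if pattern == "":
        # Sketch choice: leftover input must be equal to the empty string.
        return ign_punct_key(text) == ""
    if pattern[0] == "%":
        raise NotImplementedError("'%' is not handled in this sketch")
    if pattern[0] == "_":
        # Every split point is a candidate, each checked against the alphabet.
        return any(matches_underscore(text[:i]) and
                   like_match_v2(text[i:], pattern[1:])
                   for i in range(len(text) + 1))
    # Literal run up to the next wildcard or the end of the pattern.
    j = 0
    while j < len(pattern) and pattern[j] not in "_%":
        j += 1
    literal, rest = pattern[:j], pattern[j:]
    return any(ign_punct_key(text[:i]) == ign_punct_key(literal) and
               like_match_v2(text[i:], rest)
               for i in range(len(text) + 1))
```

With this fake collation, matches_underscore('.f') is true and
like_match_v2('.foo.', '_oo') succeeds, but only by testing each candidate
chunk against the whole alphabet, which is flaw (2) in miniature.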

> In any case, yes, some explanation and examples should be added.

Cool.

-- 
Robert Haas
EDB: http://www.enterprisedb.com





