From: Peter Eisentraut <[email protected]>
To: jian he <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Jacob Champion <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: Daniel Verite <[email protected]>
Cc: Paul A Jungwirth <[email protected]>
Subject: Re: Support LIKE with nondeterministic collations
Date: Fri, 15 Nov 2024 16:42:31 +0100
Message-ID: <[email protected]>
In-Reply-To: <CACJufxFeOuBbkHfp=0-0rwamydjYY4ky1A+CPr6s3WUABC9_Rg@mail.gmail.com>
References: <[email protected]>
	<[email protected]>
	<CA+renyWd-_sAj3YqBRaQVOOMr5uQoeBcA3tjCSyQFzvnbGrMYA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CAOYmi+nqr4xCe9-g4BAupnu2rZmvLy1T3qq3ejOUWOCsoJ4ZdA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CACJufxFeOuBbkHfp=0-0rwamydjYY4ky1A+CPr6s3WUABC9_Rg@mail.gmail.com>

On 15.11.24 05:26, jian he wrote:
> /*
> * Now build a substring of the text and try to match it against
> * the subpattern.  t is the start of the text, t1 is one past the
> * last byte.  We start with a zero-length string.
> */
> t1 = t;
> t1len = tlen;
> for (;;)
> {
> int cmp;
> CHECK_FOR_INTERRUPTS();
> cmp = pg_strncoll(subpat, subpatlen, t, (t1 - t), locale);
> 
> select '.foo.' LIKE '_oo' COLLATE ign_punct;
> The first 4 argument values of pg_strncoll on each iteration:
> oo      2       foo. 0
> oo      2       foo. 1
> oo      2       foo. 2
> oo      2       foo. 3
> oo      2       foo. 4
> 
> It seems there is a possible shortcut/optimization here:
> if subpat doesn't contain a wildcard (percent sign, underscore),
> could we then make fewer pg_strncoll calls?

How would you do that?  You need to try all combinations (every
candidate substring length) to find one that matches.
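
To illustrate why every length has to be tried, here is a minimal
sketch.  It is not the patch code: toy_strncoll_ign_punct() is a
hypothetical stand-in for pg_strncoll() that simply skips ASCII
punctuation, and shortest_matching_prefix() is a made-up helper that
mimics the loop quoted above.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_strncoll() under a punctuation-ignoring
 * collation: returns 0 if the two byte ranges are equal once ASCII
 * punctuation is skipped, otherwise a negative or positive value.
 */
static int
toy_strncoll_ign_punct(const char *a, size_t alen,
                       const char *b, size_t blen)
{
    size_t i = 0;
    size_t j = 0;

    for (;;)
    {
        /* Skip characters the toy collation ignores. */
        while (i < alen && ispunct((unsigned char) a[i]))
            i++;
        while (j < blen && ispunct((unsigned char) b[j]))
            j++;
        if (i == alen || j == blen)
            break;
        if (a[i] != b[j])
            return (unsigned char) a[i] < (unsigned char) b[j] ? -1 : 1;
        i++;
        j++;
    }
    if (i == alen && j == blen)
        return 0;
    return (i == alen) ? -1 : 1;
}

/*
 * Try every prefix length of t, starting with the zero-length string,
 * and return the first length that compares equal to subpat, or -1.
 */
static long
shortest_matching_prefix(const char *subpat, size_t subpatlen,
                         const char *t, size_t tlen)
{
    for (size_t len = 0; len <= tlen; len++)
    {
        if (toy_strncoll_ign_punct(subpat, subpatlen, t, len) == 0)
            return (long) len;
    }
    return -1;
}
```

With subpat "oo" and text "o.o.", the 2-byte prefix "o." does not
match, but the 3-byte prefix "o.o" does.  The matching length cannot
be derived from subpatlen, so comparing only a subpatlen-sized prefix
would miss the match.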

> A minimal case that triggers the error within GenericMatchText,
> since there are no related tests:
> create table t1(a text collate case_insensitive, b text collate "C");
> insert into t1 values ('a','a');
> select a like b from t1;

This results in

ERROR:  42P22: could not determine which collation to use for LIKE
HINT:  Use the COLLATE clause to set the collation explicitly.

which is the expected behavior.

> In the 9.7.1. LIKE section, we still don't know what a "wildcard" is;
> it is only mentioned in 9.7.2.
> Maybe we can add a sentence at the end of:
>      <para>
>       If <replaceable>pattern</replaceable> does not contain percent
>       signs or underscores, then the pattern only represents the string
>       itself; in that case <function>LIKE</function> acts like the
>       equals operator.  An underscore (<literal>_</literal>) in
>       <replaceable>pattern</replaceable> stands for (matches) any single
>       character; a percent sign (<literal>%</literal>) matches any sequence
>       of zero or more characters.
>      </para>
> 
> saying that underscore and percent sign are wildcards in LIKE.
> Other than that, I can understand the doc.

Ok, I agree that could be clarified.