Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s2gX2-0057Vj-RF for pgsql-hackers@arkaria.postgresql.org; Fri, 03 May 2024 00:12:08 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s2gWz-004Y7G-Sb for pgsql-hackers@arkaria.postgresql.org; Fri, 03 May 2024 00:12:06 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s2gWz-004Y77-JB for pgsql-hackers@lists.postgresql.org; Fri, 03 May 2024 00:12:06 +0000 Received: from mail-ej1-x62e.google.com ([2a00:1450:4864:20::62e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1s2gWv-001G10-Qz for pgsql-hackers@postgresql.org; Fri, 03 May 2024 00:12:05 +0000 Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-a598e03836bso25336866b.3 for ; Thu, 02 May 2024 17:12:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714695120; x=1715299920; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=BNzR1Va0lHIelJfaL1KmU9jvA1GsCWG9LTcLSQU9OWo=; b=B8xogfe3OYFR3pOkc/hLNb5hXs4u9ZTGbjNr5n093EymkS840XcCLMZbJzA+Xm9NzB 98ZNv8+NjAqJLygRjeYy6+Qp81yAy+XiEhfZGX34YcgmxrkgwYgGxck5H/MnkNHhrCjH Sj7YBSlRYu2gJkpWQaOCPRK1GG0EWBzpDQMZt/nwOjX1/5PsrTHLIFC8J/uCipMYtFwk BOIKFJfpFfjznHgeElDNM3IVIenkFd02D2/+Oimu1oEoW1nFHqLDwQDw5v5DGROgR6Cr rl7at63xOvp18NkCRJEIdqLrpDoaW1aMgi38mEbYsPtAP24dvj8d+9RJ5Lze6lTmc0hm qK5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714695120; x=1715299920; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BNzR1Va0lHIelJfaL1KmU9jvA1GsCWG9LTcLSQU9OWo=; b=R9Aux7ww1oTvfcIo1MhPcfntyEthE22pPz9Uq5TGO+4EP7WvIaggIavTbGiL8k0zwB VTHf6y9Q+Hy/g4k1nw591EacPTZuv39EBX8Gyd7xdiTHaKodiYQ/N/AvmNzJBbKT1o0v OuCYrldLzxIw4bLY0aWxsYBOYB2DCAdzjn0e7flo8RG0s1EzTK5i3ym7XZG83l9+C5Te 3WRdNpIU4tNAJOCOf0RtYrNjk5uYtorugeUS1EWef3J2gezx67tZWxYhP8S+A1XjJvwH 6qfbbSlPL+O9cx1sHeoAx8he/nBqYnVTIeNm5z0C8D96hfJSynwbymvAT3iVjN6tnhgF svwA== X-Forwarded-Encrypted: i=1; AJvYcCVnZB0yndznqKGqlyUYiphOP+W+jmLqmcyFN6dTZ1+Z4Bo7RCpGzM5bXKW2/lTwAraQCxpJpRzZRRb4J5l6MNk3B4uMXCVS2is7fomK X-Gm-Message-State: AOJu0YxQJcGbhZBn4mptE8A9MD6Ih8sQ6X6FPDP7+vz4SvFM0D6O04QB o1rLQrUQx3urnwHeZONJEKex8tT0/RwlEMhBXZXo6pyES8NP+wqVRv34OsMPv11CD1rvuanYmSN XTRamvb8SPdHMbUMidcr8DXgK8AQ= X-Google-Smtp-Source: AGHT+IG8rCdv2gLkjjX516R2OnqRHJRRgZvM+fybHRQdjZIlc325NyzPYTjD1uRKHbLp0iGJkdrHvHIPhQDfWOqmDuI= X-Received: by 2002:a17:906:1b43:b0:a52:2c0e:2e91 with SMTP id p3-20020a1709061b4300b00a522c0e2e91mr538622ejg.17.1714695119807; Thu, 02 May 2024 17:11:59 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Robert Haas Date: Thu, 2 May 2024 20:11:48 -0400 Message-ID: Subject: Re: Support LIKE with nondeterministic collations To: Peter Eisentraut Cc: Daniel Verite , pgsql-hackers Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, May 2, 2024 at 9:38=E2=80=AFAM Peter Eisentraut wrote: > On 30.04.24 14:39, Daniel Verite wrote: > > postgres=3D# SELECT '.foo.' like '_oo' COLLATE ign_punct; > > ?column? > > ---------- > > f > > (1 row) > > > > The first two results look fine, but the next one is inconsistent. > > This is correct, because '_' means "any single character". This is > independent of the collation. Seems really counterintuitive. I had to think for a long time to be able to guess what was happening here. Finally I came up with this guess: If the collation-aware matching tries to match up f with the initial period, the period is skipped and the f matches f. But when the wildcard is matched to the initial period, that uses up the wildcard and then we're left trying to match o with f, which doesn't work. Is that right? It'd probably be good to use something like this as an example in the documentation. My intuition is that if foo matches a string, then _oo f_o and fo_ should also match that string. Apparently that's not the case, but I doubt I'll be the last one who thinks it should be. --=20 Robert Haas EDB: http://www.enterprisedb.com