Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tByT5-007nDv-Bo for pgsql-hackers@arkaria.postgresql.org; Fri, 15 Nov 2024 15:42:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tByT2-00ARYE-5F for pgsql-hackers@arkaria.postgresql.org; Fri, 15 Nov 2024 15:42:40 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tByT1-00ARY5-S2 for pgsql-hackers@lists.postgresql.org; Fri, 15 Nov 2024 15:42:40 +0000 Received: from fhigh-b5-smtp.messagingengine.com ([202.12.124.156]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tBySy-0025yF-HS for pgsql-hackers@postgresql.org; Fri, 15 Nov 2024 15:42:39 +0000 Received: from phl-compute-11.internal (phl-compute-11.phl.internal [10.202.2.51]) by mailfhigh.stl.internal (Postfix) with ESMTP id A8FDC2540132; Fri, 15 Nov 2024 10:42:34 -0500 (EST) Received: from phl-mailfrontend-01 ([10.202.2.162]) by phl-compute-11.internal (MEProxy); Fri, 15 Nov 2024 10:42:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eisentraut.org; h=cc:cc:content-transfer-encoding:content-type:content-type :date:date:from:from:in-reply-to:in-reply-to:message-id :mime-version:references:reply-to:subject:subject:to:to; s=fm2; t=1731685354; x=1731771754; bh=N9ooUpYnat4xtIASGbk8DpnNIYorJs0q BZuXrSaBmKM=; b=JhnkyHadoV70bH26Z1w6oMlzmztOi3Fwl/NPAqXarLd5s1pE 8k3A/klyeS0cT3eIo9RxsKpqDxvuxGHZzPQBd/+JggtglX0lHatK+1d8xXy1IiCH C7A+G3pkK8TauIoQgpJLisbnd9kF6FNzYMwIPH3eOS+EcVsxTiRM2TIGGE5yD0Wf ppKw3QoKPO/dFiXKxAS5V7y+oCvEUniiveBKwGyV/q9S3nrQSRfUWMsx+ayJ3nxv ilyprcJnZAp4bH7WsKDV9XQPf+rXnSjZl76EMUZxSh/7W25zRrCGgPWGosrkbkbf YnNGmg0r3XHL+J7Mc7nminqz9BKMRPGIBEWOgA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t=1731685354; x= 1731771754; bh=N9ooUpYnat4xtIASGbk8DpnNIYorJs0qBZuXrSaBmKM=; b=X bQktj5+m3a2XTBy+jbko4MGnzz4bA+yt7oTsO+UbiLE4AAqZ7kFUmyvb6+9Dx2OY 7cTPtpVU/cP8fIESPK2sX17QPqtnMd+gLgQ2kJAXr9nOLW0IP6wBytk6lz8ME5Ur MsBybKR2qu98uN9OIhEqr8aSIpObvWVWFOjxWySNKcc4/sUEVPq9kXcER4W1/83b zq0EKd2proo7t5xA+j/C45ekjVlJeFXoiZQfNeMxQSv6W1jIoF7JE6e5dGtScQVZ wNeHIrOyoGqgf2rfFOuLL3xIZhQ0CCOdwBYVzZ6FaaeBc8UEoHzCMhWn33jyMU5O zfpMO34YBS8YH6miHxYqg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeefuddrvdeggdejjecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdpuffr tefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnth hsucdlqddutddtmdenucfjughrpefkffggfgfuvfevfhfhjggtgfesthejredttddvjeen ucfhrhhomheprfgvthgvrhcugfhishgvnhhtrhgruhhtuceophgvthgvrhesvghishgvnh htrhgruhhtrdhorhhgqeenucggtffrrghtthgvrhhnpefgjedthfekfedtuefgieelheet leejgefhueeltdfhueetvdffudekfeejhfegheenucevlhhushhtvghrufhiiigvpedtne curfgrrhgrmhepmhgrihhlfhhrohhmpehpvghtvghrsegvihhsvghnthhrrghuthdrohhr ghdpnhgspghrtghpthhtohepiedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepjh hirghnrdhunhhivhgvrhhsrghlihhthiesghhmrghilhdrtghomhdprhgtphhtthhopehh lhhinhhnrghkrgesihhkihdrfhhipdhrtghpthhtohepjhgrtghosgdrtghhrghmphhioh hnsegvnhhtvghrphhrihhsvggusgdrtghomhdprhgtphhtthhopehpghhsqhhlqdhhrggt khgvrhhssehpohhsthhgrhgvshhqlhdrohhrghdprhgtphhtthhopegurghnihgvlhesmh grnhhithhouhdqmhgrihhlrdhorhhgpdhrtghpthhtohepphhjsehilhhluhhmihhnrght vggutghomhhpuhhtihhnghdrtghomh X-ME-Proxy: Feedback-ID: ie0a040ee:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 15 Nov 2024 10:42:33 -0500 (EST) Message-ID: <07fdbb85-c530-48d0-adc0-7d43d7951e1b@eisentraut.org> Date: Fri, 15 Nov 2024 16:42:31 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Support LIKE with nondeterministic collations To: jian he Cc: Heikki Linnakangas , Jacob Champion , pgsql-hackers , Daniel Verite , Paul A Jungwirth References: <3104729e-1fbf-4368-ac21-1f670062de28@eisentraut.org> <0ca761b5-7b62-42a1-bffd-8bedefad48dd@eisentraut.org> <62156cde-b2c5-4375-b47a-9730aa8f7b5e@eisentraut.org> Content-Language: en-US From: Peter Eisentraut In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 15.11.24 05:26, jian he wrote: > /* > * Now build a substring of the text and try to match it against > * the subpattern. t is the start of the text, t1 is one past the > * last byte. We start with a zero-length string. > */ > t1 = t > t1len = tlen; > for (;;) > { > int cmp; > CHECK_FOR_INTERRUPTS(); > cmp = pg_strncoll(subpat, subpatlen, t, (t1 - t), locale); > > select '.foo.' LIKE '_oo' COLLATE ign_punct; > pg_strncoll's iteration of the first 4 argument values. > oo 2 foo. 0 > oo 2 foo. 1 > oo 2 foo. 2 > oo 2 foo. 3 > oo 2 foo. 4 > > seems there is a shortcut/optimization. > if subpat don't have wildcard(percent sign, underscore) > then we can have less pg_strncoll calls? How would you do that? You need to try all combinations to find one that matches. > minimum case to trigger error within GenericMatchText > since no related tests. > create table t1(a text collate case_insensitive, b text collate "C"); > insert into t1 values ('a','a'); > select a like b from t1; This results in ERROR: 42P22: could not determine which collation to use for LIKE HINT: Use the COLLATE clause to set the collation explicitly. which is the expected behavior. > at 9.7.1. LIKE section, we still don't know what "wildcard" is. > we mentioned it at 9.7.2. > maybe we can add a sentence at the end of: > > If pattern does not contain percent > signs or underscores, then the pattern only represents the string > itself; in that case LIKE acts like the > equals operator. An underscore (_) in > pattern stands for (matches) any single > character; a percent sign (%) matches any sequence > of zero or more characters. > > > saying underscore and percent sign are wildcards in LIKE. > other than that, I can understand the doc. Ok, I agree that could be clarified.