Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s2y2h-007tB6-Tp for pgsql-hackers@arkaria.postgresql.org; Fri, 03 May 2024 18:53:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s2y2f-00BHRG-66 for pgsql-hackers@arkaria.postgresql.org; Fri, 03 May 2024 18:53:58 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s2y2e-00BHR7-Rp for pgsql-hackers@lists.postgresql.org; Fri, 03 May 2024 18:53:57 +0000 Received: from fhigh6-smtp.messagingengine.com ([103.168.172.157]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s2y2d-001Nz6-G5 for pgsql-hackers@postgresql.org; Fri, 03 May 2024 18:53:57 +0000 Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 6E451114013C; Fri, 3 May 2024 14:53:54 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Fri, 03 May 2024 14:53:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=eisentraut.org; h=cc:cc:content-transfer-encoding:content-type:content-type :date:date:from:from:in-reply-to:in-reply-to:message-id :mime-version:references:reply-to:subject:subject:to:to; s=fm2; t=1714762434; x=1714848834; bh=Gc9SaxjImoRExVCB3JH71Ow5oXuBJWHm nAp1JEbb9MA=; b=4gYvhSusqLnGCy7s1ChMdCdXHJAe2qjInDUueYe0bpqiObiy JzWtGpdzv0etTUngxWFQpYdGVnfbTAkjT0G4neNPdl5oQxaa9rUYGcQGKhS0xLPw tFBXzKRrVvKSOclA/xoQxO1Wk1wmhIaBCGUUu0gfiPPc5Lli5ZKxv2Xw9/c1Snmz UxsGVXLi+jhji63dRc5IwSi0CqJurta0E+rwS2yBorQIaePExugpGpBp449JgHe/ EEWQ+8hJh7PI55ATakKGgERf8a7nMRFVj0etCZPjL89rmdNuhcN6yXbGb8L+fSr+ bCb2JjuMYED8VbbGCyvBZ18ini6VzlJkTQxK+A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t=1714762434; x= 1714848834; bh=Gc9SaxjImoRExVCB3JH71Ow5oXuBJWHmnAp1JEbb9MA=; b=M hTpLXfQPXN7q0/wJpoiJh6kxfL1Ps8gB0orss8tupPg+qNszN0U8cocsOdQ0z+Yk OeyaOaIYhtbV153xvVtIaihcnsJ/JTYfMbrg3OSwBVFjxO8qMFyDXI6NcVT3gRly pV+gZ1pHVYk4Wil5/HZtckOB97dUOEEfocPwWmqo+EGg4i2Z9Wk28S3npdYtnLIj /vy6EovM0SHhQs0ZzPm6DBuwp0ZIPSZwNW8a2v5FCeCJHhcPOirUomjO2IvPTCCL tPSLer9iecyv3xIjXAuf1xzOHa5ibOtaYikRLbrnMywiQnvKYKnUz1/uGlFDfZn/ AsuxqGumt6PCes9rxMwzA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrvddvtddgudeftdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpefkffggfgfuvfevfhfhjggtgfesthekredttddvjeenucfhrhhomheprfgv thgvrhcugfhishgvnhhtrhgruhhtuceophgvthgvrhesvghishgvnhhtrhgruhhtrdhorh hgqeenucggtffrrghtthgvrhhnpeejhfevhedttefgfffhhfeffefggffhffelgfeiueeu keehvdehvdefheffvdefueenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmh grihhlfhhrohhmpehpvghtvghrsegvihhsvghnthhrrghuthdrohhrgh X-ME-Proxy: Feedback-ID: ie0a040ee:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 3 May 2024 14:53:53 -0400 (EDT) Message-ID: Date: Fri, 3 May 2024 20:53:52 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Support LIKE with nondeterministic collations To: Daniel Verite Cc: Robert Haas , Pgsql-Hackers References: <9c81e66f-0e06-4756-b9a2-d7b7d46379c1@manitou-mail.org> Content-Language: en-US From: Peter Eisentraut In-Reply-To: <9c81e66f-0e06-4756-b9a2-d7b7d46379c1@manitou-mail.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 03.05.24 16:58, Daniel Verite wrote: > * Generating bounds for a sort key (prefix matching) > > Having sort keys for strings allows for easy creation of bounds - > sort keys that are guaranteed to be smaller or larger than any sort > key from a give range. For example, if bounds are produced for a > sortkey of string “smith”, strings between upper and lower bounds > with one level would include “Smith”, “SMITH”, “sMiTh”. Two kinds > of upper bounds can be generated - the first one will match only > strings of equal length, while the second one will match all the > strings with the same initial prefix. > > CLDR 1.9/ICU 4.6 and later map U+FFFF to a collation element with > the maximum primary weight, so that for example the string > “smith\uFFFF” can be used as the upper bound rather than modifying > the sort key for “smith”. > > In other words it says that > > col LIKE 'smith%' collate "nd" > > is equivalent to: > > col >= 'smith' collate "nd" AND col < U&'smith\ffff' collate "nd" > > which could be obtained from an index scan, assuming a btree > index on "col" collate "nd". > > U+FFFF is a valid code point but a "non-character" [1] so it's > not supposed to be present in normal strings. Thanks, this could be very useful!