From: Todd Lang
To: Daniel Verite
CC: "pgsql-hackers@lists.postgresql.org"
Subject: RE: Supporting non-deterministic collations with tailoring rules.
Date: Thu, 25 Sep 2025 13:32:17 +0000

Ah, somehow I missed your email on this. This is, in fact, exactly what should happen. The ICU folks are updating their documentation to reflect this with https://github.com/unicode-org/icu/pull/3684/files .

Is this small change a reasonable thing to include given the update in guidance from the ICU team?

-----Original Message-----
From: Daniel Verite
Sent: Wednesday, September 24, 2025 6:17 AM
To: Todd Lang
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Supporting non-deterministic collations with tailoring rules.

	Todd Lang wrote:

> When creating a collation, in
> https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/pg_locale_icu.c#L461
> it is opening the collator with the tailoring rules supplied. However,
> it has hardcoded the strength level UCOL_DEFAULT_STRENGTH. This has the
> effect of ignoring the "deterministic=false" you may have specified in
> your CREATE COLLATION call.

This is related to BUG #18771 previously reported at [1], where the reporter notes that passing UCOL_DEFAULT works for him whereas UCOL_DEFAULT_STRENGTH does not.

It looks like a documentation bug in ICU [2]. It says:

  strength: The default collation strength; one of UCOL_PRIMARY,
  UCOL_SECONDARY, UCOL_TERTIARY, UCOL_IDENTICAL, UCOL_DEFAULT_STRENGTH -
  can be also set in the rules.

But UCOL_DEFAULT_STRENGTH is an alias for UCOL_TERTIARY. UCOL_DEFAULT is what should normally be passed to not override the collation strength.

Here, "it works" means that the strength expressed in the rule (with rules = '[strength 1]' in the case of the OP) takes effect. This syntax is described at [3] (see the "Rule Syntax" column).

There is a second problem: when the strength is specified in the locale and not specified in the rules (as you did), it would also be expected to take effect. That does not appear to be the case; it is as if the rules were resetting the collation settings.

As mentioned in the thread at [1], Peter Eisentraut has submitted this as a bug [4], but there hasn't been any follow-up to it in 2.5 years.

> If, instead of UCOL_DEFAULT_STRENGTH, the code understood the
> deterministic parameter and passed either UCOL_PRIMARY for
> "deterministic=true", and UCOL_SECONDARY for "deterministic=false",
> this would preserve the attempt to obtain case-insensitivity in the
> locale while simultaneously allowing tailoring as expected.

We can't hardcode that deterministic=false implies that the strength is 2. deterministic=false only says that the collation can have equal strings that are not binary-equal.

To me, the most plausible fix on the Postgres side would be to pass UCOL_DEFAULT instead of UCOL_DEFAULT_STRENGTH as in the attached, which lets the user specify the strength in the rule, as the OP did in [1].
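
For illustration only, the difference between the two constants can be modelled in a few lines of C. This is a minimal sketch, not the attached patch and not ICU code: the SK_ enum is a local copy of the values I believe unicode/ucol.h defines (so the sketch builds without ICU), and sk_effective_strength() is a hypothetical helper that encodes the behaviour described above, namely that the strength argument passed when opening the collator overrides the rule's strength unless it is UCOL_DEFAULT. It does not model the second problem (locale strength being lost when rules are present).

```c
/* Local mirror of the ICU constants discussed in this thread.
 * Assumption: these match unicode/ucol.h; real code should include
 * that header instead of redefining them. */
typedef enum {
    SK_UCOL_DEFAULT = -1,                        /* "do not override" */
    SK_UCOL_PRIMARY = 0,                         /* [strength 1] */
    SK_UCOL_SECONDARY = 1,                       /* [strength 2] */
    SK_UCOL_TERTIARY = 2,                        /* [strength 3] */
    SK_UCOL_DEFAULT_STRENGTH = SK_UCOL_TERTIARY, /* alias, NOT "no override" */
    SK_UCOL_QUATERNARY = 3,
    SK_UCOL_IDENTICAL = 15
} SkStrength;

/* Hypothetical model: which strength ends up in effect, given the
 * strength requested by the tailoring rule and the strength argument
 * supplied by the caller when the collator is opened. */
SkStrength
sk_effective_strength(SkStrength rule_strength, SkStrength arg_strength)
{
    /* Only UCOL_DEFAULT leaves the rule's own setting alone. */
    if (arg_strength == SK_UCOL_DEFAULT)
        return rule_strength;

    /* Anything else, including UCOL_DEFAULT_STRENGTH, replaces it. */
    return arg_strength;
}
```

Under this model, a rule of '[strength 1]' opened with UCOL_DEFAULT_STRENGTH ends up tertiary, while the same rule opened with UCOL_DEFAULT stays primary, which matches what the reporter in [1] observed.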

[1]: https://www.postgresql.org/message-id/flat/18771-98bb23e455b0f367%40postgresql.org

[2]: https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ucol_8h.html#a0cb1ddd81f322ed24e389f208eb35c8a

[3]: https://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options

[4]: https://unicode-org.atlassian.net/browse/ICU-22456

Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/