Message-ID: <47e1b4f72fe732c5ae85c6cf2c085b4e99a10120.camel@j-davis.com>
Subject: Re: Change initdb default to the builtin collation provider
From: Jeff Davis
To: pgsql-hackers@postgresql.org
Date: Fri, 31 Oct 2025 14:30:19 -0700
Content-Type: multipart/mixed; boundary="=-ggvP/RBiEsz2petvrijI"
MIME-Version: 1.0

--=-ggvP/RBiEsz2petvrijI
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Fri, 2025-10-10 at 17:48 -0700, Jeff Davis wrote:
> -------
> Summary
> -------
>
> The libc collation provider is a bad default[1]. The builtin
> collation provider is a good default, so let's use that.

The attached patches implement a more modest proposal which does not
conflict with Peter's objection about the display order:

0001: If the encoding is unspecified, and cannot be determined from
the locale (i.e. the locale is C), then use UTF-8 rather than
SQL_ASCII.

0002: If the provider is unspecified, and the locale is C or C.UTF-8,
then use the builtin provider.

Motivation:

* UTF-8 seems safer than SQL_ASCII when the locale is compatible with
  either.

* Whether the "C" locale uses the builtin provider or the libc
  provider is mostly about the catalog representation, because the
  implementation is the same. I don't have a strong motivation for
  this change, it just clarifies that libc is not actually being used
  when the locale is "C".

* I think most users of the "C.UTF-8" locale would be better off with
  the builtin provider, which benefits from important optimizations.

Note: This would mean that "initdb --no-locale" would select UTF-8 and
the builtin provider with locale "C", whereas previously it would have
selected SQL_ASCII and the libc provider (though it didn't ever really
use libc internally). I'm not sure if others want this behavior or if
it would be surprising.

Regards,
	Jeff Davis

--=-ggvP/RBiEsz2petvrijI
Content-Disposition: attachment;
	filename="v1-0001-initdb-prefer-UTF-8-encoding-over-SQL_ASCII.patch"
Content-Type: text/x-patch;
	name="v1-0001-initdb-prefer-UTF-8-encoding-over-SQL_ASCII.patch";
	charset="UTF-8"
Content-Transfer-Encoding: 7bit

From 9c8cf58c541462a6aef43fed0ddea1e9f1633960 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 31 Oct 2025 13:36:46 -0700
Subject: [PATCH v1 1/2] initdb: prefer UTF-8 encoding over SQL_ASCII.

This was already true for the ICU locale provider, make it true for
the others.
---
 src/bin/initdb/initdb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 92fe2f531f7..aa7fc5a6636 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2718,10 +2718,10 @@ setup_locale_encoding(void)
 		ctype_enc = pg_get_encoding_from_locale(lc_ctype, true);
 
 		/*
-		 * If ctype_enc=SQL_ASCII, it's compatible with any encoding. ICU does
-		 * not support SQL_ASCII, so select UTF-8 instead.
+		 * If ctype_enc=SQL_ASCII, it's compatible with any encoding. Prefer
+		 * UTF-8.
 		 */
-		if (locale_provider == COLLPROVIDER_ICU && ctype_enc == PG_SQL_ASCII)
+		if (ctype_enc == PG_SQL_ASCII)
 			ctype_enc = PG_UTF8;
 
 		if (ctype_enc == -1)
-- 
2.43.0

--=-ggvP/RBiEsz2petvrijI
Content-Disposition: attachment;
	filename*0=v1-0002-initdb-if-locale-is-C-or-C.UTF-8-use-builtin-prov.pat;
	filename*1=ch
Content-Type: text/x-patch;
	name="v1-0002-initdb-if-locale-is-C-or-C.UTF-8-use-builtin-prov.patch";
	charset="UTF-8"
Content-Transfer-Encoding: 7bit

From 8b1659fab50396eaeacab042aeaef8df241af467 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Fri, 31 Oct 2025 14:05:10 -0700
Subject: [PATCH v1 2/2] initdb: if locale is C or C.UTF-8, use builtin
 provider.

If the provider is unspecified, use the builtin provider C or
C.UTF-8. If the provider is specified, then do not override it.

The C locale has always been, effectively, the builtin provider, in
the sense that it uses built-in logic rather than strcoll(), etc. The
change here is mostly about the catalog representation.

The C.UTF-8 locale has used libc, but by doing so, collation doesn't
benefit from important performance optimizations. Now that we have a
builtin "C.UTF-8" collation which does benefit from those
optimizations, use that.
---
 src/bin/initdb/initdb.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index aa7fc5a6636..84931f145f4 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -145,6 +145,7 @@ static char *lc_numeric = NULL;
 static char *lc_time = NULL;
 static char *lc_messages = NULL;
 static char locale_provider = COLLPROVIDER_LIBC;
+static bool locale_provider_specified = false;
 static bool builtin_locale_specified = false;
 static char *datlocale = NULL;
 static bool icu_locale_specified = false;
@@ -2465,6 +2466,28 @@ setlocales(void)
 	lc_messages = canonname;
 #endif
 
+	/*
+	 * If the locale is C or C.UTF-8, and no provider was specified, use the
+	 * builtin provider rather than libc.
+	 */
+	if (!locale_provider_specified && locale_provider == COLLPROVIDER_LIBC)
+	{
+		if (strcmp(lc_ctype, lc_collate) == 0)
+		{
+			if (strcmp(lc_ctype, "C") == 0)
+			{
+				locale_provider = COLLPROVIDER_BUILTIN;
+				datlocale = "C";
+			}
+			else if (strcmp(lc_ctype, "C.UTF-8") == 0 ||
+					 strcmp(lc_ctype, "C.UTF8") == 0)
+			{
+				locale_provider = COLLPROVIDER_BUILTIN;
+				datlocale = "C.UTF-8";
+			}
+		}
+	}
+
 	if (locale_provider != COLLPROVIDER_LIBC && datlocale == NULL)
 		pg_fatal("locale must be specified if provider is %s",
 				 collprovider_name(locale_provider));
@@ -3362,6 +3385,8 @@ main(int argc, char *argv[])
 										 "-c debug_discard_caches=1");
 				break;
 			case 15:
+				locale_provider_specified = true;
+
 				if (strcmp(optarg, "builtin") == 0)
 					locale_provider = COLLPROVIDER_BUILTIN;
 				else if (strcmp(optarg, "icu") == 0)
-- 
2.43.0

--=-ggvP/RBiEsz2petvrijI--
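
The default-selection rules described in the message (0001 for the encoding, 0002 for the provider) can be modeled in isolation. The following is a minimal standalone sketch, not the initdb code: the `Sketch*` types, the `SK_*` constants, and `sketch_choose_defaults()` are hypothetical names introduced here for illustration, and the sketch assumes that "encoding cannot be determined from the locale" is represented by the locale reporting SQL_ASCII.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not initdb's symbols. */
typedef enum { SK_ENC_SQL_ASCII, SK_ENC_UTF8, SK_ENC_OTHER } SketchEncoding;
typedef enum { SK_PROV_LIBC, SK_PROV_BUILTIN, SK_PROV_ICU } SketchProvider;

typedef struct
{
    const char     *lc_ctype;           /* resolved LC_CTYPE name */
    const char     *lc_collate;         /* resolved LC_COLLATE name */
    SketchEncoding  locale_encoding;    /* SK_ENC_SQL_ASCII: locale implies none */
    bool            encoding_specified; /* user asked for an encoding */
    SketchEncoding  encoding;           /* used only if encoding_specified */
    bool            provider_specified; /* user asked for a provider */
    SketchProvider  provider;           /* used only if provider_specified */
} SketchOptions;

typedef struct
{
    SketchEncoding  encoding;
    SketchProvider  provider;
    const char     *datlocale;          /* NULL unless the builtin rule applied */
} SketchChoice;

static SketchChoice
sketch_choose_defaults(const SketchOptions *opt)
{
    SketchChoice choice;

    /* Rule 0001: with no explicit encoding, prefer UTF-8 over SQL_ASCII. */
    if (opt->encoding_specified)
        choice.encoding = opt->encoding;
    else if (opt->locale_encoding == SK_ENC_SQL_ASCII)
        choice.encoding = SK_ENC_UTF8;
    else
        choice.encoding = opt->locale_encoding;

    /* An explicitly requested provider is never overridden. */
    choice.provider = opt->provider_specified ? opt->provider : SK_PROV_LIBC;
    choice.datlocale = NULL;

    /* Rule 0002: with no explicit provider, C and C.UTF-8 use builtin. */
    if (!opt->provider_specified &&
        strcmp(opt->lc_ctype, opt->lc_collate) == 0)
    {
        if (strcmp(opt->lc_ctype, "C") == 0)
        {
            choice.provider = SK_PROV_BUILTIN;
            choice.datlocale = "C";
        }
        else if (strcmp(opt->lc_ctype, "C.UTF-8") == 0 ||
                 strcmp(opt->lc_ctype, "C.UTF8") == 0)
        {
            choice.provider = SK_PROV_BUILTIN;
            choice.datlocale = "C.UTF-8";
        }
    }

    return choice;
}
```

For the "initdb --no-locale" case raised in the note (both locale categories "C", nothing specified), this model yields UTF-8, the builtin provider, and datlocale "C". If a provider is passed explicitly, the model leaves it alone, which mirrors the `locale_provider_specified` flag added in 0002.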