From: Sandeep Thakkar
Date: Thu, 8 Aug 2024 10:53:27 +0530
Subject: Re: Windows installation problem at post-install step
To: Thomas Munro
Cc: Ertan Küçükoglu, Adrian Klaver, pgsql-general@lists.postgresql.org, Manika Singhal


On Thu, Aug 8, 2024 at 6:10 AM Thomas Munro <thomas.munro@gmail.com> wrote:

> Thanks.  The log didn't offer any more clues, and my colleague David R
> has Windows and knows how to work its debugger so we sat down together
> and chased this down (thanks David!).
>
> 1.  It is indeed calling abort(), but it's not a PANIC or Assert() in
> PostgreSQL, it's an assertion inside Windows' own setlocale():
>
> minkernel\crts\ucrt\src\appcrt\convert\mbstowcs.cpp(245) : Assertion
> failed: (pwcs == nullptr && sizeInWords == 0) || (pwcs != nullptr &&
> sizeInWords > 0)
>
> 2.  It is indeed confused about the encoding of the string
> "Turkish_Türkiye.1254" itself, and works fine if you use "tr-TR".
>
> 3.  It doesn't happen on 15, because 16 added a key ingredient:
>
> commit bf03cfd162176d543da79f9398131abc251ddbb9
> Author: Peter Eisentraut <peter@eisentraut.org>
> Date:   Tue Jan 3 14:21:40 2023 +0100
>
>     Windows support in pg_import_system_collations
>
> That causes it to spin through a bunch of system locales and switch to
> them, and the first one is "aa".  After it calls:
>
>     setlocale(2, "aa");
>
> ... then the next call to restore the previous locale is something like:
>
>     setlocale(2, "Turkish_T\252rkiye.1254");
>
> (That \252 == 0xfc probably depends on your system's default
> encoding.)  It doesn't like that name anymore, and aborts.  A minimal
> program with just those two lines shows that.
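
For anyone who wants to try the two-call sequence described above, it might look roughly like this in C. This is only a sketch: the function name try_locale_switch is made up for illustration, and LC_CTYPE is assumed to be the category that appears as 2 in the calls above (which matches the Microsoft CRT headers as far as I know). On platforms other than Windows both calls are expected to just return NULL for an unknown name.

```c
#include <locale.h>
#include <stdio.h>

/*
 * Hypothetical helper: runs the two setlocale() calls in order.
 * Returns 1 if the second call succeeded, 0 if it returned NULL.
 * Per the report above, on an affected Windows system the second call
 * aborts inside the CRT instead of returning.
 */
int try_locale_switch(void)
{
    /* Assumption: LC_CTYPE is the category shown as 2 above. */
    const char *first = setlocale(LC_CTYPE, "aa");
    printf("first call:  %s\n", first ? first : "(null)");

    /* 0xfc is the byte written as \252 (decimal 252) above.  The literal
     * is split so the hex escape cannot absorb following characters. */
    const char *second = setlocale(LC_CTYPE, "Turkish_T\xfc" "rkiye.1254");
    printf("second call: %s\n", second ? second : "(null)");

    return second != NULL;
}
```

Calling try_locale_switch() from main() and checking whether the process survives should be enough to tell whether a given system is affected.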

> It appears that after switching to "aa", it interprets the string
> passed to the next call to setlocale() as some other encoding
> (probably UTF-8, I dunno).  I don't know why it doesn't fail and
> return NULL, but there is a more general point that it's a bit bonkers
> to use non-ASCII byte sequences in the library calls that are used to
> control how non-ASCII byte sequences are interpreted.  Maybe it can be
> done if you're careful, but in particular a naive save-and-restore
> sequence just won't work.
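
To make the "naive save-and-restore" pattern concrete, here is a minimal C sketch of it. This is not the PostgreSQL code; the function name with_temporary_ctype is hypothetical, and it only uses standard setlocale(). The point is that the saved name is a plain byte string, and it is handed back to setlocale() while a different locale is in effect.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper showing a naive save-and-restore of LC_CTYPE.
 * Returns 1 if the restore call succeeded, 0 if it failed, and -1 if
 * the current locale name could not be saved.
 */
int with_temporary_ctype(const char *temporary)
{
    const char *current = setlocale(LC_CTYPE, NULL);
    if (current == NULL)
        return -1;

    /* Copy the name: later setlocale() calls may overwrite the buffer. */
    size_t len = strlen(current) + 1;
    char *saved = malloc(len);
    if (saved == NULL)
        return -1;
    memcpy(saved, current, len);

    /* Result ignored on purpose: an unknown name leaves the locale
     * unchanged, and the restore below is attempted either way. */
    (void) setlocale(LC_CTYPE, temporary);

    /* The step at issue: the saved bytes are interpreted under whatever
     * rules are in force now, not those in force when they were saved. */
    int restored = setlocale(LC_CTYPE, saved) != NULL;

    free(saved);
    return restored;
}
```

With an ASCII-only saved name the restore is unambiguous, which is consistent with the observation above that "tr-TR" works fine.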

> I guess a save-and-restore done with wsetlocale() could fix that.  But
> I decline to work on that, we need less Windows kludgery in the tree,
> not more.  I think a better answer is "don't do that".
>
> Really, we *have* to chase all these non-BCP-47 locales out of the
> installer (I hope you can work on that?),

Yeah, it seems getlocales.cpp needs to be changed to achieve that.
I'll look into it.

> out of PostgreSQL (testers
> wanted[1]), and out of the world's existing clusters (maybe with
> Dave's pg_upgrade idea, someone would need to write a patch, or maybe
> someone could write a stand-alone locale migration program that just
> connects to a cluster and (using some authoritative source, that's the
> key bit to research) and replaces bad old names with nice new ones).
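
The name-replacement core of such a migration program could be as small as a lookup table. The C sketch below is only an illustration: the struct and function names are hypothetical, the old name is assumed to be spelled in UTF-8, and the single table entry is the one pair mentioned in this thread. A real table would have to come from the authoritative source mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from an old-style Windows locale name to a
 * BCP 47 tag. */
struct locale_rename {
    const char *old_name;
    const char *new_name;
};

/* Only entry taken from this thread; assumed UTF-8 spelling of the
 * old name (0xc3 0xbc is U+00FC). */
static const struct locale_rename renames[] = {
    { "Turkish_T\xc3\xbc" "rkiye.1254", "tr-TR" },
};

/* Returns the replacement name, or NULL if the name is not in the table. */
const char *replacement_locale_name(const char *old_name)
{
    for (size_t i = 0; i < sizeof renames / sizeof renames[0]; i++) {
        if (strcmp(renames[i].old_name, old_name) == 0)
            return renames[i].new_name;
    }
    return NULL;  /* unknown name: leave it for manual review */
}
```

Returning NULL for anything not in the table keeps the sketch conservative: names without a known replacement are left alone rather than guessed at.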

> [1] https://www.postgresql.org/message-id/flat/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@mail.gmail.com


--
Sandeep Thakkar

