Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEGJj-0006hH-OT for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 21:01:28 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEGJh-00023g-Tw for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEGJh-00023X-Im for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: from mail-ej1-x62d.google.com ([2a00:1450:4864:20::62d]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pEGJd-00052w-Ty for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: by mail-ej1-x62d.google.com with SMTP id u19so10917747ejm.8 for ; Sat, 07 Jan 2023 13:01:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=reddoc.net; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ywXQxRF75MxJumNR9jT8C82JnYNxahCO6iHDb3CKlxU=; b=djA2d5d4SyuNWdk7JVHo4ym1MeyKWeieVkxoD5hM4B1O5JVV7nTd0Hujs1B6TlP+oM B5UXcNdoNhVbM3qLIWxG8+TS4XuzVA7ukaUDxrwzxsZVyvWNa2IzX0sZDJErfLxO5gWi IQo0NghY6d14oMrRxKjZlgLMVXRXlhy3Z0tao= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ywXQxRF75MxJumNR9jT8C82JnYNxahCO6iHDb3CKlxU=; b=kwmWKoQ6Z7actGHClk2ZF4b7WCGvStwZGsgIKal0AokQBHUGvvNbDbJ3HBO8cwhGkl ZPFF89N6+WnqdAPMvsHKZ6VA/Y2WRCFSEbCXlT6v8Yg0+NUZ0sg+twCqEjWzT8c+estJ wyFhDjYs0WTdVzjCTR4VZhK+r3ByJGaLYpQVl0WTU1lGhg2hEQUALnkpKliXSE08HowC lNeDio8N4UVgqUQBTZMtcSey+yuOBX6uVO6j8JR99ZwaTG14WISM1UaNWI3E3wyahfAj //Y7rS3PP7xtqNJWJJnvM2lcAHG8+PLFAZov2aOYOSueRCLsS94DlXXEBuXZGfCcjwM9 yBWA== X-Gm-Message-State: AFqh2ko2EuUASJxJqo2CYvVROVCsOGDAWLgq81yP2Lk/+YRrrv8f4Yi3 lF65okBQQ4UWYmUPfUHLpaNaHAUNSa4e/+aPSP0Bfw== X-Google-Smtp-Source: AMrXdXtpmBR4lj9oibzhKr2ttDPqfMxx5u0gWOHIX1cl5MG2shwh1Xb2AAnbTytnLtzwqHu00sd12qRe1oMliZzUBz0= X-Received: by 2002:a17:907:1393:b0:7c1:4e96:da7a with SMTP id vs19-20020a170907139300b007c14e96da7amr5186582ejb.596.1673125280075; Sat, 07 Jan 2023 13:01:20 -0800 (PST) MIME-Version: 1.0 References: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> In-Reply-To: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> From: Samed YILDIRIM Date: Sat, 7 Jan 2023 23:01:09 +0200 Message-ID: Subject: Re: Order By weirdness? To: Carl Sopchak Cc: pgsql-sql@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000fbe95d05f1b2d559" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fbe95d05f1b2d559 Content-Type: text/plain; charset="UTF-8" Hi Carl, This can be related to glibc2.38 update. I recommend you to check following documents. https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html https://wiki.postgresql.org/wiki/Locale_data_changes Best regards. Samed YILDIRIM On Sat, 7 Jan 2023 at 22:35, Carl Sopchak wrote: > I'm seeing something (very) unexpected with ORDER BY. If I run this query: > > select txt > from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), > ('xz3456') ) a(txt) > order by txt; > > I get expected results with x<#> being sorted before xz. However, if I > replace the z's with ~, giving > > select txt > from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), > ('x~3456') ) a(txt) > order by txt; > > I get this??? > > txt > -------- > x~1234 > x12345 > x~2345 > x23456 > x~3456 > x34567 > > Which appears to mean that ~ is treated differently than z (basically ~ is > ignored). Same if I use other special characters, such as @. > > Up until stumbling into this, I have never seen such behavior from a > database. (Windows OS, yes, but I won't go there...) Character-based text > always sorted in an alphabetic order (which puts special characters in > different places in the ordering depending on encoding, but it's > consistent). > > Two questions (which may be the same way of asking the same question): > > - How is this correct? I can see where this could be useful in limited > scenarios, but IMHO it makes no sense as a default sort order. > > - What do I need to do to get a strictly character-based sort in ORDER BY? > > I am using postgres version 14.3 on Fedora 37. > > Thanks for the help. > > Carl > > > --000000000000fbe95d05f1b2d559 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Carl,

This can be related= to glibc2.38 update. I recommend you to check following documents.

https://wiki.postgresql.org/wiki/Locale_data_changes<= /div>

Best regar= ds.
Samed YILDIRIM


On Sat, 7 Jan 2023 at 22:35, Carl Sopchak <carl@sopchak.me> wrote:
=20 =20 =20

I'm seeing something (very) unexpected with ORDER BY.=C2=A0 If I= run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'= ;), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.=C2=A0 However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'= ;), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

=C2=A0 txt=C2=A0 =C2=A0
--------
=C2=A0x~1234
=C2=A0x12345
=C2=A0x~2345
=C2=A0x23456
=C2=A0x~3456
=C2=A0x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).=C2=A0 Same if I use other special character= s, such as @.

Up until stumbling into this, I have never seen such behavior from a database.=C2=A0 (Windows OS, yes, but I won't go there...)= =C2=A0 Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?=C2=A0 I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


--000000000000fbe95d05f1b2d559--