From carl@sopchak.me Sat Jun 6 05:26:21 2026 Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFv1-0005b5-0D for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:35:55 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEFuw-00008C-Sv for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:35:50 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFuw-00007D-Ki for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:35:50 +0000 Received: from mout.perfora.net ([74.208.4.194]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFup-0008Rq-PW for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:35:49 +0000 Received: from [10.221.133.171] ([142.105.135.230]) by mrelay.perfora.net (mreueus002 [74.208.5.2]) with ESMTPSA (Nemesis) id 0LzuPF-1ojXYL1swN-0150AC for ; Sat, 07 Jan 2023 21:35:42 +0100 Message-ID: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> Date: Sat, 7 Jan 2023 15:35:41 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Content-Language: en-US To: pgsql-sql@lists.postgresql.org From: Carl Sopchak Subject: Order By weirdness? Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Provags-ID: V03:K1:IX/NyBPTbBywK3Gx0huCUk50B/gmC9FJaX81iQSz4Vl00UrATjb 6mNojmvfZjp52/QWQLtszNkAHMfRV46wBwg1NKbpsQQqCSCQ1FUMqMBil910SvBlb0Kt3Yi OKwWoxxtIV+3j2IqBpX3IHNlqBDGwIuFzDaItWzpu1Icw7F7h+lMAQpKCb8qCiDi07OKDO6 QN//bXeVLiXJMsI6niULw== X-Spam-Flag: NO UI-OutboundReport: notjunk:1;M01:P0:pYP97nmZWUQ=;wMKtsGUHC6nlOHpSGoq1mhqjvWc jGuKvGQrpxlLHDHUd79IrQKQWNR0JF+QjnKHpuD+4ciEtFRNKKcLSizGbtQSi1nKK8VbrpSpH RvvfIYEXOB1sGKPnN/rmRFX5xAe0TISLcyf/hYdSAffQMWsbW13DsoMVSWUkfd/4KGW7Xpfsv ZPDF97aKrXGmFnWTaeLP/igK1sxXSi2XhGkTJ4sCXsGDuehC4No8mELQ+Jp7CcpJjqYdHbtVt lohsIs81ig/ByCf/uv4VLKdPJMW/mrSdEAfcFXpv5z+d9+NZvay5J8MXKyRSPZj61RKO3Y9xC zb938dN3SeMBpta910XWErpXIaI1yDmPBorpoRK9qpBmCK51CoiFIqvVAvt+3KDvzpPkO2bRo 32uqhymdND2o7rs0TvCbwSFO6DO3eDG/A2oqLmy3NraauvLHhvHE5YefZlon6lkjf4F6+rf2g Ci4VaTnch1sAdO4m5sZTixAo3ZPKdRyCP3sTVzxDvTqopu4foUXjAeE1ZUm+9e4Hxb5RikvIW 0bnMSDB436860Z2RK6/dgBn6fl9pgtAT99ZUeig8FzILbDA+Ufiy7jM3D7x+PHpCS4iMkzgD4 w6Z3jVsmXELd/PAV32imi7rapZGXBhd1SSV/KaA928mhWAEHM62ztqsyl/0s9aUhpGZULuMiP 4ouBrJ0osgAODDfZB1f1Y7yxVYGS+AS1hvp7BAM/IA== List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk

I'm seeing something (very) unexpected with ORDER BY.  If I run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.  However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

  txt   
--------
 x~1234
 x12345
 x~2345
 x23456
 x~3456
 x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).  Same if I use other special characters, such as @.

Up until stumbling into this, I have never seen such behavior from a database.  (Windows OS, yes, but I won't go there...)  Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?  I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


From erik@heimdalldata.com Sat Jun 6 05:26:21 2026 Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFxY-0005iI-TY for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:38:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEFxX-0003HK-PN for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:38:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFxX-0003HB-DX for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:38:31 +0000 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pEFxU-0008TP-W1 for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:38:30 +0000 Received: by mail-pf1-x431.google.com with SMTP id k19so3472388pfg.11 for ; Sat, 07 Jan 2023 12:38:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=heimdalldata.com; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=qasRdQXjeFK9Q6z/39rPYO/UKNH0fniSGpvfsolb0HY=; b=s1Str3e6r6RFred5f3ZcyuBsk3+OVwkw6RY50ihcPdtm/4q3Q6uC55te0IJThIgb3b EBLMGpOysbPR2dRsdDN1uaRb12dZBE4nAF9TEHjbXBZ+wtY1LTfCE3wzoArpvGDSHggg 7HhOx9H3aQTtv0dMZNTzDQ6mMWTOKikFXa7pg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=qasRdQXjeFK9Q6z/39rPYO/UKNH0fniSGpvfsolb0HY=; b=3Ty3zaJuTzCwKCDX8Gu1sIAIrFADSTw/GFbsZ9LT6X1YpWhGG0FFeuOu7xf74/vA9K 9iUaVTHVNnW79pKUbSJfA9gT4iLKbRWjZU7B4/ps5OKkucU3EG3h0+/piZHL3tWtieY0 DTbNcl35onXnfjGLWHcNI5wJ1MG5R05uwoDWMW8Gk11FWV8I6j53qGdF84OJek4HAmhz iblFKMDQwJS/NuQicuSz2CDMQZ3HWMhjxUQuyE93ejcR8wrPZO3ByNLCk86Go5ub63cX Ob+4DZjuUcurR7gPQHK7jxDDwo4LkHgKmyaNpKOC/pvKKYcr11R71amzohSTY/YZBWgk JXDQ== X-Gm-Message-State: AFqh2krmO/eNxhaSRGWvK0vPikrpL6sHtLkQ6HVNwCI9Mr5MyAllC+ik TNZEZogNZsYKBEsbrRCvr+KRqPnHs1bAPRNVerEOUtWwwU3HwpnL X-Google-Smtp-Source: AMrXdXu8nBV0npp00qPSXou3RgS+FL624lGnItnsGgs8IU4szDx1SbnlDrV1Inf2IudQPF6qE6JCIe3OWpaCeYiekI8= X-Received: by 2002:a63:e44e:0:b0:4a4:be42:9ad6 with SMTP id i14-20020a63e44e000000b004a4be429ad6mr1342058pgk.288.1673123907770; Sat, 07 Jan 2023 12:38:27 -0800 (PST) MIME-Version: 1.0 References: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> In-Reply-To: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> From: Erik Brandsberg Date: Sat, 7 Jan 2023 15:38:16 -0500 Message-ID: Subject: Re: Order By weirdness? To: Carl Sopchak Cc: pgsql-sql@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000303a3e05f1b284fb" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000303a3e05f1b284fb Content-Type: text/plain; charset="UTF-8" This will relate to collation order, which is something that you can specify. Please see: https://stackoverflow.com/questions/17225652/how-can-i-sort-the-postgres-column-with-certain-special-characters On Sat, Jan 7, 2023 at 3:35 PM Carl Sopchak wrote: > I'm seeing something (very) unexpected with ORDER BY. If I run this query: > > select txt > from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), > ('xz3456') ) a(txt) > order by txt; > > I get expected results with x<#> being sorted before xz. However, if I > replace the z's with ~, giving > > select txt > from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), > ('x~3456') ) a(txt) > order by txt; > > I get this??? > > txt > -------- > x~1234 > x12345 > x~2345 > x23456 > x~3456 > x34567 > > Which appears to mean that ~ is treated differently than z (basically ~ is > ignored). Same if I use other special characters, such as @. > > Up until stumbling into this, I have never seen such behavior from a > database. (Windows OS, yes, but I won't go there...) Character-based text > always sorted in an alphabetic order (which puts special characters in > different places in the ordering depending on encoding, but it's > consistent). > > Two questions (which may be the same way of asking the same question): > > - How is this correct? I can see where this could be useful in limited > scenarios, but IMHO it makes no sense as a default sort order. > > - What do I need to do to get a strictly character-based sort in ORDER BY? > > I am using postgres version 14.3 on Fedora 37. > > Thanks for the help. > > Carl > > > --000000000000303a3e05f1b284fb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
This will relate to collation order, which is something th= at you can specify.=C2=A0 Please see:=C2=A0=C2=A0https://stackoverflow.com/questions/17225652/how-can= -i-sort-the-postgres-column-with-certain-special-characters

On Sat, Jan = 7, 2023 at 3:35 PM Carl Sopchak <carl= @sopchak.me> wrote:
=20 =20 =20

I'm seeing something (very) unexpected with ORDER BY.=C2=A0 If I= run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'= ;), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.=C2=A0 However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'= ;), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

=C2=A0 txt=C2=A0 =C2=A0
--------
=C2=A0x~1234
=C2=A0x12345
=C2=A0x~2345
=C2=A0x23456
=C2=A0x~3456
=C2=A0x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).=C2=A0 Same if I use other special character= s, such as @.

Up until stumbling into this, I have never seen such behavior from a database.=C2=A0 (Windows OS, yes, but I won't go there...)= =C2=A0 Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?=C2=A0 I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


--000000000000303a3e05f1b284fb-- From carl@sopchak.me Sat Jun 6 05:26:21 2026 Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEG44-0005yv-9h for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:45:16 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEG43-0006nX-4J for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:45:15 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEG42-0006n7-Rp for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:45:14 +0000 Received: from mout.perfora.net ([74.208.4.196]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEG40-00006a-4r for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:45:13 +0000 Received: from [10.221.133.171] ([142.105.135.230]) by mrelay.perfora.net (mreueus004 [74.208.5.2]) with ESMTPSA (Nemesis) id 1MUGJb-1pNIbU0cJn-00RGqD for ; Sat, 07 Jan 2023 21:45:11 +0100 Message-ID: <1b6f965c-c97b-1b3f-e955-14402544979c@sopchak.me> Date: Sat, 7 Jan 2023 15:45:10 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 Subject: Re: Order By weirdness? Content-Language: en-US To: pgsql-sql@lists.postgresql.org References: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> From: Carl Sopchak In-Reply-To: Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Provags-ID: V03:K1:XNjX7iruIGfhp420R6DMtl0JbhKLl5gN34UmiU73ZvlgiSEclgm VYYUrhAPIB4FTQztIjrPLvMI0FJBLr4ZD2CRfycgcYG8+cMf0PXlNeFPB0MKqrPqb44lgT9 Cl2Klpf7Vq4gZIhIyXhuzxBgMGN8HRS6/7QajIOaMCzVKm66cK+HlltKIerL9IP0PTxBjkb w5GULAymP/xvAPjlKmmaQ== X-Spam-Flag: NO UI-OutboundReport: notjunk:1;M01:P0:L+fZwLBh39U=;a+/pwjQWdAIAease+qFixTgPSIw uG4ZwAiqkiN0mdxRO5NWmne0orOiAOmfmD1V1tYzkiNT1OqdStJ2e4KqrIj3ViZlBmjN0AC49 E1WuN94lgTYb3zOFgCTuYY0mZqED1h6CTbtEV1Cvq0d95RsNThjszmOFbHl4HejCVcmkodDUE Vs/HuOoTIq30JDUNj/hJYovqGs4jlBQWrb7mIjupkKgeM8pBe+JwQ5A1TVGppGLdALTsM4tSu rgPQ19z9lz0yLlXoMPzi9XCHeMMMx0IJ5Q712ad0t8GaBJbExO1Hv2hrzJCS6njanJLCj+tEO hTbefekuURvktJJqImQbL/ziCIVTrioO8HjjO34wtfDN0z7kc3uAXRztVILVT2ZSAe86BMmks YC7DWOfkfEhySC17a4tyHF/kKqbVq936oJJNc5H32oBUBjr4sxVl+b+jfI3WRfeKTdto7iPx9 QzvZvglMvYp5zO66taLkRbDk3auFOB6TLhqreFpyjjthvzncBTCHx/x5iEfRGcp2rglavL420 bwraN18AgEQig98Rpuf/Sn7aw8vZXDT7HAEh02nnMEQ/pynh4WtadiZoNAuEpLvahAK5cZJnh EczZVZRbrWvS7oDPwkzAedso6RCpta6K8VFBIA/T3NWb35mteVbdF0fHwsIz8wdQMnnY5MuCW I3LvzH3snTtNsusHxwy5wnV3rIDILt68fiCBdvnX9g== List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk

Thanks, that did the trick.  The surprising thing in the link is "Most locales would ignore the leading # for sorting. "  I guess I've been around too long and hadn't noticed. :-)

On 1/7/23 15:38, Erik Brandsberg wrote:
This will relate to collation order, which is something that you can specify.  Please see:  https://stackoverflow.com/questions/17225652/how-can-i-sort-the-postgres-column-with-certain-special-characters

On Sat, Jan 7, 2023 at 3:35 PM Carl Sopchak <carl@sopchak.me> wrote:

I'm seeing something (very) unexpected with ORDER BY.  If I run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.  However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

  txt   
--------
 x~1234
 x12345
 x~2345
 x23456
 x~3456
 x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).  Same if I use other special characters, such as @.

Up until stumbling into this, I have never seen such behavior from a database.  (Windows OS, yes, but I won't go there...)  Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?  I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


From samed@reddoc.net Sat Jun 6 05:26:21 2026 Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEGJj-0006hH-OT for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 21:01:28 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEGJh-00023g-Tw for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEGJh-00023X-Im for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: from mail-ej1-x62d.google.com ([2a00:1450:4864:20::62d]) by magus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pEGJd-00052w-Ty for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 21:01:25 +0000 Received: by mail-ej1-x62d.google.com with SMTP id u19so10917747ejm.8 for ; Sat, 07 Jan 2023 13:01:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=reddoc.net; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ywXQxRF75MxJumNR9jT8C82JnYNxahCO6iHDb3CKlxU=; b=djA2d5d4SyuNWdk7JVHo4ym1MeyKWeieVkxoD5hM4B1O5JVV7nTd0Hujs1B6TlP+oM B5UXcNdoNhVbM3qLIWxG8+TS4XuzVA7ukaUDxrwzxsZVyvWNa2IzX0sZDJErfLxO5gWi IQo0NghY6d14oMrRxKjZlgLMVXRXlhy3Z0tao= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ywXQxRF75MxJumNR9jT8C82JnYNxahCO6iHDb3CKlxU=; b=kwmWKoQ6Z7actGHClk2ZF4b7WCGvStwZGsgIKal0AokQBHUGvvNbDbJ3HBO8cwhGkl ZPFF89N6+WnqdAPMvsHKZ6VA/Y2WRCFSEbCXlT6v8Yg0+NUZ0sg+twCqEjWzT8c+estJ wyFhDjYs0WTdVzjCTR4VZhK+r3ByJGaLYpQVl0WTU1lGhg2hEQUALnkpKliXSE08HowC lNeDio8N4UVgqUQBTZMtcSey+yuOBX6uVO6j8JR99ZwaTG14WISM1UaNWI3E3wyahfAj //Y7rS3PP7xtqNJWJJnvM2lcAHG8+PLFAZov2aOYOSueRCLsS94DlXXEBuXZGfCcjwM9 yBWA== X-Gm-Message-State: AFqh2ko2EuUASJxJqo2CYvVROVCsOGDAWLgq81yP2Lk/+YRrrv8f4Yi3 lF65okBQQ4UWYmUPfUHLpaNaHAUNSa4e/+aPSP0Bfw== X-Google-Smtp-Source: AMrXdXtpmBR4lj9oibzhKr2ttDPqfMxx5u0gWOHIX1cl5MG2shwh1Xb2AAnbTytnLtzwqHu00sd12qRe1oMliZzUBz0= X-Received: by 2002:a17:907:1393:b0:7c1:4e96:da7a with SMTP id vs19-20020a170907139300b007c14e96da7amr5186582ejb.596.1673125280075; Sat, 07 Jan 2023 13:01:20 -0800 (PST) MIME-Version: 1.0 References: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> In-Reply-To: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> From: Samed YILDIRIM Date: Sat, 7 Jan 2023 23:01:09 +0200 Message-ID: Subject: Re: Order By weirdness? To: Carl Sopchak Cc: pgsql-sql@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000fbe95d05f1b2d559" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000fbe95d05f1b2d559 Content-Type: text/plain; charset="UTF-8" Hi Carl, This can be related to glibc2.38 update. I recommend you to check following documents. https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html https://wiki.postgresql.org/wiki/Locale_data_changes Best regards. Samed YILDIRIM On Sat, 7 Jan 2023 at 22:35, Carl Sopchak wrote: > I'm seeing something (very) unexpected with ORDER BY. If I run this query: > > select txt > from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), > ('xz3456') ) a(txt) > order by txt; > > I get expected results with x<#> being sorted before xz. However, if I > replace the z's with ~, giving > > select txt > from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), > ('x~3456') ) a(txt) > order by txt; > > I get this??? > > txt > -------- > x~1234 > x12345 > x~2345 > x23456 > x~3456 > x34567 > > Which appears to mean that ~ is treated differently than z (basically ~ is > ignored). Same if I use other special characters, such as @. > > Up until stumbling into this, I have never seen such behavior from a > database. (Windows OS, yes, but I won't go there...) Character-based text > always sorted in an alphabetic order (which puts special characters in > different places in the ordering depending on encoding, but it's > consistent). > > Two questions (which may be the same way of asking the same question): > > - How is this correct? I can see where this could be useful in limited > scenarios, but IMHO it makes no sense as a default sort order. > > - What do I need to do to get a strictly character-based sort in ORDER BY? > > I am using postgres version 14.3 on Fedora 37. > > Thanks for the help. > > Carl > > > --000000000000fbe95d05f1b2d559 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Carl,

This can be related= to glibc2.38 update. I recommend you to check following documents.

https://wiki.postgresql.org/wiki/Locale_data_changes<= /div>

Best regar= ds.
Samed YILDIRIM


On Sat, 7 Jan 2023 at 22:35, Carl Sopchak <carl@sopchak.me> wrote:
=20 =20 =20

I'm seeing something (very) unexpected with ORDER BY.=C2=A0 If I= run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'= ;), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.=C2=A0 However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'= ;), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

=C2=A0 txt=C2=A0 =C2=A0
--------
=C2=A0x~1234
=C2=A0x12345
=C2=A0x~2345
=C2=A0x23456
=C2=A0x~3456
=C2=A0x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).=C2=A0 Same if I use other special character= s, such as @.

Up until stumbling into this, I have never seen such behavior from a database.=C2=A0 (Windows OS, yes, but I won't go there...)= =C2=A0 Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?=C2=A0 I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


--000000000000fbe95d05f1b2d559--