Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFxY-0005iI-TY for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:38:33 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.92) (envelope-from ) id 1pEFxX-0003HK-PN for pgsql-sql@arkaria.postgresql.org; Sat, 07 Jan 2023 20:38:31 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1pEFxX-0003HB-DX for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:38:31 +0000 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]) by makus.postgresql.org with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1pEFxU-0008TP-W1 for pgsql-sql@lists.postgresql.org; Sat, 07 Jan 2023 20:38:30 +0000 Received: by mail-pf1-x431.google.com with SMTP id k19so3472388pfg.11 for ; Sat, 07 Jan 2023 12:38:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=heimdalldata.com; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=qasRdQXjeFK9Q6z/39rPYO/UKNH0fniSGpvfsolb0HY=; b=s1Str3e6r6RFred5f3ZcyuBsk3+OVwkw6RY50ihcPdtm/4q3Q6uC55te0IJThIgb3b EBLMGpOysbPR2dRsdDN1uaRb12dZBE4nAF9TEHjbXBZ+wtY1LTfCE3wzoArpvGDSHggg 7HhOx9H3aQTtv0dMZNTzDQ6mMWTOKikFXa7pg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=qasRdQXjeFK9Q6z/39rPYO/UKNH0fniSGpvfsolb0HY=; b=3Ty3zaJuTzCwKCDX8Gu1sIAIrFADSTw/GFbsZ9LT6X1YpWhGG0FFeuOu7xf74/vA9K 9iUaVTHVNnW79pKUbSJfA9gT4iLKbRWjZU7B4/ps5OKkucU3EG3h0+/piZHL3tWtieY0 DTbNcl35onXnfjGLWHcNI5wJ1MG5R05uwoDWMW8Gk11FWV8I6j53qGdF84OJek4HAmhz iblFKMDQwJS/NuQicuSz2CDMQZ3HWMhjxUQuyE93ejcR8wrPZO3ByNLCk86Go5ub63cX Ob+4DZjuUcurR7gPQHK7jxDDwo4LkHgKmyaNpKOC/pvKKYcr11R71amzohSTY/YZBWgk JXDQ== X-Gm-Message-State: AFqh2krmO/eNxhaSRGWvK0vPikrpL6sHtLkQ6HVNwCI9Mr5MyAllC+ik TNZEZogNZsYKBEsbrRCvr+KRqPnHs1bAPRNVerEOUtWwwU3HwpnL X-Google-Smtp-Source: AMrXdXu8nBV0npp00qPSXou3RgS+FL624lGnItnsGgs8IU4szDx1SbnlDrV1Inf2IudQPF6qE6JCIe3OWpaCeYiekI8= X-Received: by 2002:a63:e44e:0:b0:4a4:be42:9ad6 with SMTP id i14-20020a63e44e000000b004a4be429ad6mr1342058pgk.288.1673123907770; Sat, 07 Jan 2023 12:38:27 -0800 (PST) MIME-Version: 1.0 References: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> In-Reply-To: <25e94e6e-5b18-9d45-0b7b-c23bd4c9a17b@sopchak.me> From: Erik Brandsberg Date: Sat, 7 Jan 2023 15:38:16 -0500 Message-ID: Subject: Re: Order By weirdness? To: Carl Sopchak Cc: pgsql-sql@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000303a3e05f1b284fb" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000303a3e05f1b284fb Content-Type: text/plain; charset="UTF-8" This will relate to collation order, which is something that you can specify. Please see: https://stackoverflow.com/questions/17225652/how-can-i-sort-the-postgres-column-with-certain-special-characters On Sat, Jan 7, 2023 at 3:35 PM Carl Sopchak wrote: > I'm seeing something (very) unexpected with ORDER BY. If I run this query: > > select txt > from ( values('x12345'), ('xz1234'), ('x23456'), ('xz2345'), ('x34567'), > ('xz3456') ) a(txt) > order by txt; > > I get expected results with x<#> being sorted before xz. However, if I > replace the z's with ~, giving > > select txt > from ( values('x12345'), ('x~1234'), ('x23456'), ('x~2345'), ('x34567'), > ('x~3456') ) a(txt) > order by txt; > > I get this??? > > txt > -------- > x~1234 > x12345 > x~2345 > x23456 > x~3456 > x34567 > > Which appears to mean that ~ is treated differently than z (basically ~ is > ignored). Same if I use other special characters, such as @. > > Up until stumbling into this, I have never seen such behavior from a > database. (Windows OS, yes, but I won't go there...) Character-based text > always sorted in an alphabetic order (which puts special characters in > different places in the ordering depending on encoding, but it's > consistent). > > Two questions (which may be the same way of asking the same question): > > - How is this correct? I can see where this could be useful in limited > scenarios, but IMHO it makes no sense as a default sort order. > > - What do I need to do to get a strictly character-based sort in ORDER BY? > > I am using postgres version 14.3 on Fedora 37. > > Thanks for the help. > > Carl > > > --000000000000303a3e05f1b284fb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
This will relate to collation order, which is something th= at you can specify.=C2=A0 Please see:=C2=A0=C2=A0https://stackoverflow.com/questions/17225652/how-can= -i-sort-the-postgres-column-with-certain-special-characters

On Sat, Jan = 7, 2023 at 3:35 PM Carl Sopchak <carl= @sopchak.me> wrote:
=20 =20 =20

I'm seeing something (very) unexpected with ORDER BY.=C2=A0 If I= run this query:

select txt
from ( values('x12345'), ('xz1234'), ('x23456'= ;), ('xz2345'), ('x34567'), ('xz3456') ) a(txt)
order by txt;

I get expected results with x<#> being sorted before xz.=C2=A0 However, if I replace the z's with ~, giving

select txt
from ( values('x12345'), ('x~1234'), ('x23456'= ;), ('x~2345'), ('x34567'), ('x~3456') ) a(txt)
order by txt;

I get this???

=C2=A0 txt=C2=A0 =C2=A0
--------
=C2=A0x~1234
=C2=A0x12345
=C2=A0x~2345
=C2=A0x23456
=C2=A0x~3456
=C2=A0x34567

Which appears to mean that ~ is treated differently than z (basically ~ is ignored).=C2=A0 Same if I use other special character= s, such as @.

Up until stumbling into this, I have never seen such behavior from a database.=C2=A0 (Windows OS, yes, but I won't go there...)= =C2=A0 Character-based text always sorted in an alphabetic order (which puts special characters in different places in the ordering depending on encoding, but it's consistent).

Two questions (which may be the same way of asking the same question):

- How is this correct?=C2=A0 I can see where this could be useful in limited scenarios, but IMHO it makes no sense as a default sort order.

- What do I need to do to get a strictly character-based sort in ORDER BY?

I am using postgres version 14.3 on Fedora 37.

Thanks for the help.

Carl


--000000000000303a3e05f1b284fb--