From: Merlin Moncure <mmoncure@gmail.com>
Date: Wed, 28 Jun 2017 08:13:40 -0500
Subject: Re: Efficiently merging and sorting collections of sorted rows
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Clint Miller <clint.miller1@gmail.com>,
	postgres performance list <pgsql-performance@postgresql.org>

On Fri, Jun 23, 2017 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Clint Miller <clint.miller1@gmail.com> writes:
>> That's a good plan because it's not doing a quick sort. Instead, it's just
>> reading the sort order off of the index, which is exactly what I want. (I
>> had to disable enable_sort because I didn't have enough rows of test data
>> in the table to get Postgres to use the index. But if I had enough rows,
>> the enable_sort stuff wouldn't be necessary. My real table has lots of rows
>> and doesn't need enable_sort turned off to do the sort with the index.)
>
> TBH, I think this whole argument is proceeding from false premises.
> Using an indexscan as a substitute for an explicit sort of lots of
> rows isn't all that attractive, because it implies a whole lot of
> random access to the table (unless the table is nearly in index
> order, which isn't a condition you can count on without expending
> a lot of maintenance effort to keep it that way).  seqscan-and-sort
> is often a superior alternative, especially if you're willing to give
> the sort a reasonable amount of work_mem.

Hm, if he reverses the index terms he gets his sort order for free and
a guaranteed IOS (index-only scan).  This would be sensible only if
several conditions applied: you'd have to live under the IOS criteria
generally; the number of rows returned relative to the number thrown
out by the filter would have to be reasonably high (this is key); and
the result set would have to be large, making the sort an expensive
consideration relative to the filtering.  You'd also have to be
uninterested in explicit filters on 's', or be willing to create
another index for those if you were.
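
For anyone who wants to try this against real data, here is a rough
sketch of a test setup. Only the column names, types, and the (i, s)
index come from the \d output below; the row count, the data
distribution, and the second index name foo_s_i_idx are assumptions
made up for illustration. Note that the plan shown below ran against an
empty table (rows=0), so its timings say nothing about the trade-off.

```sql
-- Hypothetical reconstruction of the test table; only the columns and
-- the (i, s) index definition are taken from this thread.
CREATE TABLE foo (s text, i integer);

-- "Reversed" index: sort key first, filter column second.
CREATE INDEX foo_i_s_idx ON foo (i, s);

-- Assumed test data: 100000 rows with s cycling through 'a'..'e', so
-- the filter below keeps roughly 40% of the index entries it reads.
INSERT INTO foo (s, i)
SELECT chr(97 + (g % 5)), g
FROM generate_series(1, 100000) AS g;

-- An index-only scan avoids heap fetches only for pages marked
-- all-visible, so vacuum (and analyze) before measuring.
VACUUM ANALYZE foo;

-- Look at "Heap Fetches" and "Rows Removed by Filter" in the output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo WHERE s = 'a' OR s = 'b' ORDER BY i;

-- Optional second index (hypothetical name) for queries that filter
-- on s directly and do not need i-order across several s values.
CREATE INDEX foo_s_i_idx ON foo (s, i);
```

Changing the modulus in the INSERT changes the fraction of rows the
filter keeps, which is the ratio that matters most here. The planner
may still prefer a sort on this data unless enable_sort is turned off.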

merlin

postgres=# \d foo
      Table "public.foo"
 Column │  Type   │ Modifiers
────────┼─────────┼───────────
 s      │ text    │
 i      │ integer │
Indexes:
    "foo_i_s_idx" btree (i, s)  -- reversed

postgres=# set enable_sort = false;
SET

postgres=# explain analyze select * from foo where s = 'a' or s = 'b'
order by i;
                                                       QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Index Only Scan using foo_i_s_idx on foo  (cost=0.15..68.75 rows=12 width=36) (actual time=0.004..0.004 rows=0 loops=1)
   Filter: ((s = 'a'::text) OR (s = 'b'::text))
   Heap Fetches: 0
 Planning time: 0.215 ms
 Execution time: 0.025 ms
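
To weigh this plan against the seqscan-and-sort alternative Tom
describes, one could re-run the same query with sorting re-enabled and
a larger work_mem, then compare the two EXPLAIN outputs. A minimal
sketch; the 64MB figure is an arbitrary assumption, not a
recommendation from this thread:

```sql
-- Undo the enable_sort override so the planner may pick a sort again.
RESET enable_sort;

-- Give the sort room to run in memory (assumed value; tune to taste).
SET work_mem = '64MB';

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM foo WHERE s = 'a' OR s = 'b' ORDER BY i;

-- Restore the session default afterwards.
RESET work_mem;
```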


