MIME-Version: 1.0
In-Reply-To: 
 <CAJ3Xv+hGjhFk_GSZzh2YqJFpSGStwmsNeM_9yLd=T_UHWdguSA@mail.gmail.com>
References: 
 <CAJ3Xv+hGjhFk_GSZzh2YqJFpSGStwmsNeM_9yLd=T_UHWdguSA@mail.gmail.com>
Date: Sun, 28 Feb 2016 23:06:35 +1300
Message-ID: 
 <CAKJS1f-zcuAMwDkucvO6iKGduGX8wn-4P6v34FC-UZtcFyC-EA@mail.gmail.com>
Subject: Re: Merge joins on index scans
From: David Rowley <david.rowley@2ndquadrant.com>
To: James Parks <james.parks@meraki.net>
Cc: postgres performance list <pgsql-performance@postgresql.org>
Content-Type: text/plain; charset=UTF-8
Precedence: bulk
Sender: pgsql-performance-owner@postgresql.org

On 27 February 2016 at 11:07, James Parks <james.parks@meraki.net> wrote:
>
> CREATE TABLE a (id bigint primary key, nonce bigint);
> CREATE TABLE b (id bigint primary key, a_id bigint not null);
> CREATE INDEX a_idx ON b (a_id);
>
> The query:
>
> SELECT b.* FROM b JOIN a ON b.a_id = a.id WHERE a.nonce = ? ORDER BY b.id
> ASC;
>
> (skip down to [1] and [2] to see the query performance)
>
> What I know:
>
> If you force the query planner to use a merge join on the above query, it
> takes 10+ minutes to complete using the data as per below. If you force the
> query planner to use a hash join on the same data, it takes ~200
> milliseconds.

I believe I know what is going on here, but can you please test;

SELECT b.* FROM b WHERE EXISTS (SELECT 1 FROM a ON b.a_id = a.id AND
a.nonce = ?) ORDER BY b.id ASC;

using the merge join plan.

If this performs much better then the problem is due to the merge join
mark/restore causing the join to have to transition through many
tuples which don't match the a.nonce = ? predicate. The mark and
restore is not required for the rewritten query, as this use a semi
join rather than a regular inner join. With the semi join the executor
knows that it's only meant to be matching a single tuple in "a", so
once the first match is found it can move to the next row in the outer
relation without having to restore the scan back to where it started
matching that inner row again.

If I'm right, to get around the problem you could; create index on a
(nonce, id);

If such an index is out of the question then a patch has been
submitted for review which should fix this problem in (hopefully)
either 9.6 or 9.7
https://commitfest.postgresql.org/9/129/
If you have a test environment handy, it would be nice if you could
test the patch on the current git head to see if this fixes your
problem. The findings would be quite interesting for me. Please note
this patch is for test environments only at this stage, not for
production use.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance