public inbox for [email protected]
help / color / mirror / Atom feedFrom: Jeff Janes <[email protected]>
To: Craig James <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Can't get two index scans
Date: Wed, 22 Jun 2016 11:36:31 -0700
Message-ID: <CAMkU=1xnuR8wTwMGu9aEnoe54t4C3wk_Bof9P4S=dcJc=0pNtA@mail.gmail.com> (raw)
In-Reply-To: <CAFwQ8reztx+Kw7pGoGTMyD2xt25_Z5Vrt5r4qB3AQ=ybG43QHA@mail.gmail.com>
References: <CAFwQ8reztx+Kw7pGoGTMyD2xt25_Z5Vrt5r4qB3AQ=ybG43QHA@mail.gmail.com>
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-performance>
On Wed, Jun 22, 2016 at 9:03 AM, Craig James <[email protected]> wrote:
> I'm working with a third-party plugin that does chemistry.
Out of personal/professional curiosity, which one are you using, if
that can be disclosed?
....
> Notice that it doesn't use the i_version_smiles index at all, but instead
> applies the very expensive filter |>| to all 1% of the database.
You have to tell the database that |>| is very expensive, by setting
the COST of the function which it invokes. You can get the name of
the function with:
select oprcode from pg_operator where oprname ='|>|' ;
(taking care for schema and overloading, etc.)
I would set the COST to at least 1000, probably more.
> So instead
> of getting a 100x speedup, we only get a 3x speedup, about 30x worse that
> what is theoretically possible.
>
> The production database is about 50x larger than this test database.
>
> Maybe I misunderstand what's possible with indexes, but it seems to me that
> it could first do the pk_version index scan, and then use the results of
> that to do a limited index-scan search using the i_version_smiles index. Is
> that not possible?
I don't think it can do that. What it can do is run each index scan
to completion as a bitmap index scan, and then AND the bitmaps
together.
You might be able to build a multiple column index on (smiles,
version_id) and have it do the right thing automatically. Whether that
is possible, and if so how effective it will actually be, would depend
on the implementation details of |>|. My gut feeling is that it would
not work well.
You could partition your data on version_id. Then it would keep a
separate smiles index on each partition, and would only consult those
indexes which can possibly contain (according to the CHECK
constraints) the version_ids of interest in the query.
Also, if you tune your system using benzene, you will be probably
arrive at a place not optimal for more realistic queries.
Cheers,
Jeff
--
Sent via pgsql-performance mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: Can't get two index scans
In-Reply-To: <CAMkU=1xnuR8wTwMGu9aEnoe54t4C3wk_Bof9P4S=dcJc=0pNtA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox