public inbox for [email protected]
help / color / mirror / Atom feedFrom: Mike Broers <[email protected]>
To: Tom Lane <[email protected]>
Cc: David Rowley <[email protected]>
Cc: postgres performance list <[email protected]>
Subject: Re: query of partitioned object doesnt use index in qa
Date: Fri, 15 Sep 2017 15:59:17 -0500
Message-ID: <CAB9893g-1fpvh=0snbe7qFJKfXEsn2YxR3ZWZ6-JxrMCyaZg3Q@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAB9893izHQaPTk1bGEDs8UTQUTKtpj1sk6PLyWrvXU-j0JBFaQ@mail.gmail.com>
<CAKJS1f-AkrKfLEsrb7ymZve_b3e9cTKUcEdeeeJkVWnOTVdPnA@mail.gmail.com>
<CAB9893hmTC-TMeFN8S91NWS_++3w2t5D0X7O-ogsZZ8zEyxv6w@mail.gmail.com>
<[email protected]>
List-Unsubscribe: <mailto:[email protected]?body=unsub%20pgsql-performance>
That makes a lot of sense, thanks for taking a look. An index like you
suggest would probably further improve the query. Is that suggestion
sidestepping the original problem that production is evaluating the
landing_id bit with the partition index and qa is sequence scanning instead?
AND exists (select 1 from t_sap where e.landing_id = t_sap.landing_id)) as
rankings;
Based on the difference in row estimate I am attempting an analyze with a
higher default_statistic_target (currently 100) to see if that helps.
On Fri, Sep 15, 2017 at 3:42 PM, Tom Lane <[email protected]> wrote:
> Mike Broers <[email protected]> writes:
> > If Im reading this correctly postgres thinks the partition will return
> 6.5
> > million matching rows but actually comes back with 162k. Is this a case
> > where something is wrong with the analyze job?
>
> You've got a lot of scans there that're using conditions like
>
> > │ -> Seq Scan on event__99999999 e_1
> (cost=0.00..2527828.05 rows=11383021 width=778) (actual
> time=25522.389..747238.885 rows=42 loops=1)
> > │ Filter: (((body ->> 'SID'::text) IS
> NOT NULL) AND (validation_status_code = 'P'::bpchar))
> > │ Rows Removed by Filter: 12172186
>
> While I'd expect the planner to be pretty solid on estimating the
> validation_status_code condition, it's not going to have any idea about
> that JSON field test. That's apparently very selective, but you're just
> getting a default estimate, which is not going to think that a NOT NULL
> test will exclude lots of rows.
>
> One thing you could consider doing about this is creating an index
> on (body ->> 'SID'::text), which would prompt ANALYZE to gather statistics
> about that expression. Even if the index weren't actually used in the
> plan, this might improve the estimates and the resulting planning choices
> enough to make it worth maintaining such an index.
>
> Or you could think about pulling that field out and storing it on its own.
> JSON columns are great for storing random unstructured data, but they are
> less great when you want to do relational-ish things on subfields.
>
> regards, tom lane
>
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: query of partitioned object doesnt use index in qa
In-Reply-To: <CAB9893g-1fpvh=0snbe7qFJKfXEsn2YxR3ZWZ6-JxrMCyaZg3Q@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox