public inbox for [email protected]  
help / color / mirror / Atom feed
From: Mike Broers <[email protected]>
To: Tom Lane <[email protected]>
Cc: David Rowley <[email protected]>
Cc: postgres performance list <[email protected]>
Subject: Re: query of partitioned object doesnt use index in qa
Date: Fri, 15 Sep 2017 15:59:17 -0500
Message-ID: <CAB9893g-1fpvh=0snbe7qFJKfXEsn2YxR3ZWZ6-JxrMCyaZg3Q@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAB9893izHQaPTk1bGEDs8UTQUTKtpj1sk6PLyWrvXU-j0JBFaQ@mail.gmail.com>
	<CAKJS1f-AkrKfLEsrb7ymZve_b3e9cTKUcEdeeeJkVWnOTVdPnA@mail.gmail.com>
	<CAB9893hmTC-TMeFN8S91NWS_++3w2t5D0X7O-ogsZZ8zEyxv6w@mail.gmail.com>
	<[email protected]>
List-Unsubscribe:  <mailto:[email protected]?body=unsub%20pgsql-performance>

That makes a lot of sense, thanks for taking a look.  An index like you
suggest would probably further improve the query.   Is that suggestion
sidestepping the original problem that production is evaluating the
landing_id bit with the partition index and qa is sequence scanning instead?

AND exists (select 1 from t_sap where e.landing_id = t_sap.landing_id)) as
rankings;

Based on the difference in row estimate I am attempting an analyze with a
higher default_statistic_target (currently 100) to see if that helps.




On Fri, Sep 15, 2017 at 3:42 PM, Tom Lane <[email protected]> wrote:

> Mike Broers <[email protected]> writes:
> > If Im reading this correctly postgres thinks the partition will return
> 6.5
> > million matching rows but actually comes back with 162k.  Is this a case
> > where something is wrong with the analyze job?
>
> You've got a lot of scans there that're using conditions like
>
> > │                           ->  Seq Scan on event__99999999 e_1
> (cost=0.00..2527828.05 rows=11383021 width=778) (actual
> time=25522.389..747238.885 rows=42 loops=1)
> > │                                 Filter: (((body ->> 'SID'::text) IS
> NOT NULL) AND (validation_status_code = 'P'::bpchar))
> > │                                 Rows Removed by Filter: 12172186
>
> While I'd expect the planner to be pretty solid on estimating the
> validation_status_code condition, it's not going to have any idea about
> that JSON field test.  That's apparently very selective, but you're just
> getting a default estimate, which is not going to think that a NOT NULL
> test will exclude lots of rows.
>
> One thing you could consider doing about this is creating an index
> on (body ->> 'SID'::text), which would prompt ANALYZE to gather statistics
> about that expression.  Even if the index weren't actually used in the
> plan, this might improve the estimates and the resulting planning choices
> enough to make it worth maintaining such an index.
>
> Or you could think about pulling that field out and storing it on its own.
> JSON columns are great for storing random unstructured data, but they are
> less great when you want to do relational-ish things on subfields.
>
>                         regards, tom lane
>


reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: query of partitioned object doesnt use index in qa
  In-Reply-To: <CAB9893g-1fpvh=0snbe7qFJKfXEsn2YxR3ZWZ6-JxrMCyaZg3Q@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox