Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dsxje-0001vm-NC for pgsql-performance@arkaria.postgresql.org; Fri, 15 Sep 2017 21:01:14 +0000 Received: from localhost ([127.0.0.1] helo=postgresql.org) by malur.postgresql.org with smtp (Exim 4.84_2) (envelope-from ) id 1dsxje-0003Rl-54 for pgsql-performance@arkaria.postgresql.org; Fri, 15 Sep 2017 21:01:14 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1dsxhr-0000EK-I4 for pgsql-performance@postgresql.org; Fri, 15 Sep 2017 20:59:23 +0000 Received: from mail-io0-x22f.google.com ([2607:f8b0:4001:c06::22f]) by magus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.84_2) (envelope-from ) id 1dsxho-0006zj-7C for pgsql-performance@postgresql.org; Fri, 15 Sep 2017 20:59:23 +0000 Received: by mail-io0-x22f.google.com with SMTP id d16so11368036ioj.3 for ; Fri, 15 Sep 2017 13:59:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=KIi1h6J46jvtGVYoye+YMvd/zlFMvgfEm77Q4lWJnI4=; b=ozRojQ2pTnTbskMeOGICI9oni6K6IxEK3OkK8MOCanegRC9Rs680mZB54Wgj+3tQNp G/UPVe90smSdIMSgM1ZLKmUmgQQOyI0WoEPbstyd8ehyQLvjNq/R1nHEh5OeQnsTTDqL j2ddYgw+zFCKJNDRHI+EA2o8h1MjJOB1CZ/lJTvFIecTS+j6buelMYOPjo0ewJHbrmKV q68OcF/Na6Vap999Olk5j2jmu1aafVXrNgrPRZr+vNaq3zpQ6iFlFrfXcIrgN2tL0d6i gxvSm+HLmwIL/mDo0uxW1pt3dsuNgKa9LEnjch2705uL3duXWin0wfAXghGO5tUBhjFy AGgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=KIi1h6J46jvtGVYoye+YMvd/zlFMvgfEm77Q4lWJnI4=; b=LZ7/21M8pEILHMfjZ0i194xjv7jh1k8TNPtnXNJZ1wcsSpHP2+c17Cg4jOT2euVwT2 HPeFCza43WLqJ+I9qDXfxzjh08J0/t1QSyBAK9/KYnpXqEdUQKDAsgoCdoimp60itGNB xy7gm90d0/nLB3BAciiXm3il9ai+8cxAjwcYmwSdB9AIhDwCvnA8nPPu+f+VLiQYYbCq Of3FimJuEi/gR+q5nRbEtbdMYRPH0Vli8AcpVo/JgCAyiyFHe8kSaZHzQxH8vireb2cI rySvQ1eLFI8y4OROMr6GeIkcaPf6ta2ZLHrUMZC9y0TEhx103SNZM9QhpC8LPZZCtKvx 09Jg== X-Gm-Message-State: AHPjjUjYnX+jHdpB/3RqrHu6udwe9yo2+HgNFGV7b23HchXVLylQz50E vJE7dlEBb1ZMfja9NnuUB0Rz6VO74zXOtUhWQtG8ed1E X-Google-Smtp-Source: ADKCNb5zzz+N2wnbeY9VcPl5Wucm6KAQWxNlmKZIqbbrMH80rPf8oeVVxTUby+84k94kAyauomZv2UeYiFweXcez2iA= X-Received: by 10.202.102.71 with SMTP id a68mr27767315oic.242.1505509158260; Fri, 15 Sep 2017 13:59:18 -0700 (PDT) MIME-Version: 1.0 Received: by 10.157.52.73 with HTTP; Fri, 15 Sep 2017 13:59:17 -0700 (PDT) In-Reply-To: <3138.1505508143@sss.pgh.pa.us> References: <3138.1505508143@sss.pgh.pa.us> From: Mike Broers Date: Fri, 15 Sep 2017 15:59:17 -0500 Message-ID: Subject: Re: query of partitioned object doesnt use index in qa To: Tom Lane Cc: David Rowley , postgres performance list Content-Type: multipart/alternative; boundary="001a1140b78495990b055940aab8" List-Archive: List-Help: List-ID: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: X-Mailing-List: pgsql-performance Precedence: bulk Sender: pgsql-performance-owner@postgresql.org --001a1140b78495990b055940aab8 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable That makes a lot of sense, thanks for taking a look. An index like you suggest would probably further improve the query. Is that suggestion sidestepping the original problem that production is evaluating the landing_id bit with the partition index and qa is sequence scanning instead= ? AND exists (select 1 from t_sap where e.landing_id =3D t_sap.landing_id)) a= s rankings; Based on the difference in row estimate I am attempting an analyze with a higher default_statistic_target (currently 100) to see if that helps. On Fri, Sep 15, 2017 at 3:42 PM, Tom Lane wrote: > Mike Broers writes: > > If Im reading this correctly postgres thinks the partition will return > 6.5 > > million matching rows but actually comes back with 162k. Is this a cas= e > > where something is wrong with the analyze job? > > You've got a lot of scans there that're using conditions like > > > =E2=94=82 -> Seq Scan on event__99999999 e_1 > (cost=3D0.00..2527828.05 rows=3D11383021 width=3D778) (actual > time=3D25522.389..747238.885 rows=3D42 loops=3D1) > > =E2=94=82 Filter: (((body ->> 'SID'::te= xt) IS > NOT NULL) AND (validation_status_code =3D 'P'::bpchar)) > > =E2=94=82 Rows Removed by Filter: 12172= 186 > > While I'd expect the planner to be pretty solid on estimating the > validation_status_code condition, it's not going to have any idea about > that JSON field test. That's apparently very selective, but you're just > getting a default estimate, which is not going to think that a NOT NULL > test will exclude lots of rows. > > One thing you could consider doing about this is creating an index > on (body ->> 'SID'::text), which would prompt ANALYZE to gather statistic= s > about that expression. Even if the index weren't actually used in the > plan, this might improve the estimates and the resulting planning choices > enough to make it worth maintaining such an index. > > Or you could think about pulling that field out and storing it on its own= . > JSON columns are great for storing random unstructured data, but they are > less great when you want to do relational-ish things on subfields. > > regards, tom lane > --001a1140b78495990b055940aab8 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
That makes a lot of sense, thanks for taking a look.=C2=A0= An index like you suggest would probably further improve the query. =C2=A0= Is that suggestion sidestepping the original problem that production is ev= aluating the landing_id bit with the partition index and qa is sequence sca= nning instead?

AND exists (select 1 from t_sap where e.l= anding_id =3D t_sap.landing_id)) as rankings;=C2=A0

Based on the difference in row estimate I am attempting an analyze with a= higher default_statistic_target (currently 100) to see if that helps.


<= /div>

On Fri= , Sep 15, 2017 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrot= e:
Mike Broers <mbroers@gmail.com> writes:
> If Im reading this correctly postgres thinks the partition will return= 6.5
> million matching rows but actually comes back with 162k.=C2=A0 Is this= a case
> where something is wrong with the analyze job?

You've got a lot of scans there that're using conditions lik= e

> =E2=94=82=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->=C2=A0 Seq Scan on event__999999= 99 e_1 (cost=3D0.00..2527828.05 rows=3D11383021 width=3D778) (actual time= =3D25522.389..747238.885 rows=3D42 loops=3D1)
> =E2=94=82=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Filter: (((body = ->> 'SID'::text) IS NOT NULL) AND (validation_status_code =3D= 'P'::bpchar))
> =E2=94=82=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Rows Removed by = Filter: 12172186

While I'd expect the planner to be pretty solid on estimating th= e
validation_status_code condition, it's not going to have any idea about=
that JSON field test.=C2=A0 That's apparently very selective, but you&#= 39;re just
getting a default estimate, which is not going to think that a NOT NULL
test will exclude lots of rows.

One thing you could consider doing about this is creating an index
on (body ->> 'SID'::text), which would prompt ANALYZE to gath= er statistics
about that expression.=C2=A0 Even if the index weren't actually used in= the
plan, this might improve the estimates and the resulting planning choices enough to make it worth maintaining such an index.

Or you could think about pulling that field out and storing it on its own.<= br> JSON columns are great for storing random unstructured data, but they are less great when you want to do relational-ish things on subfields.

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 regards, tom lane

--001a1140b78495990b055940aab8--