From: Daniel Blanch Bataller <daniel.blanch.bataller@gmail.com>
Message-Id: <C1FC53B7-DA51-4060-B1DE-6D6D14682DA3@gmail.com>
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_2F23939A-23CF-4AA1-BD68-583E552BE862"
Mime-Version: 1.0 (Mac OS X Mail 10.2 \(3259\))
Subject: Re: Slow query after 9.3 to 9.6 migration
Date: Thu, 5 Jan 2017 18:51:10 +0100
In-Reply-To: 
 <CAOGex3=0DB-R9V558CkeoSOuU4KG_RyNh5etzo85o43xGcuVvQ@mail.gmail.com>
Cc: postgres performance list <pgsql-performance@postgresql.org>
To: Flávio Henrique <yoshimit@gmail.com>
References: 
 <CAOGex3nXTRPZTD-KeoSwD=bj62hQrMK+6h30u09srV71sePqUA@mail.gmail.com>
 <CAHyXU0wDi9VjfGC8aQeLsBq4ncLVOKJ=1QR6iRq71U2HXQso4Q@mail.gmail.com>
 <CAOGex3=0DB-R9V558CkeoSOuU4KG_RyNh5etzo85o43xGcuVvQ@mail.gmail.com>
Precedence: bulk
Sender: pgsql-performance-owner@postgresql.org


--Apple-Mail=_2F23939A-23CF-4AA1-BD68-583E552BE862
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

Hi,

If the planner uses the index now that it has simply been recreated,
the old index was probably bloated, that is, it had grown so big that
a seq scan looked cheaper.

I've seen another case recently where PostgreSQL 9.6 wasn't using the
right index in a query. I was able to reproduce the issue by crafting
an index much bigger than it should have been.

Can you record the index size as it is now? Keep this info, and if the
problem happens again, check the index sizes and see whether they have
grown too much.

e.g. SELECT relname, relpages, reltuples FROM pg_class
     WHERE relname = 'index_name';

This should help confirm whether the problem is that the indexes are
growing too much for some reason.
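
A minimal sketch of how the size could be recorded over time. The
table index_size_history and the index name index_name are
hypothetical placeholders, not objects from your system. Note that
relpages and reltuples are estimates refreshed by VACUUM, ANALYZE and
CREATE INDEX, so the sketch also stores the on-disk size reported by
pg_relation_size():

```sql
-- Hypothetical history table; name and columns are only an illustration.
CREATE TABLE IF NOT EXISTS index_size_history (
    recorded_at timestamptz NOT NULL DEFAULT now(),
    relname     name        NOT NULL,
    relpages    integer     NOT NULL,
    reltuples   real        NOT NULL,
    size_bytes  bigint      NOT NULL
);

-- Take one snapshot; run it now and again whenever the query slows down.
INSERT INTO index_size_history (relname, relpages, reltuples, size_bytes)
SELECT c.relname, c.relpages, c.reltuples, pg_relation_size(c.oid)
FROM pg_class c
WHERE c.relkind = 'i'              -- indexes only
  AND c.relname = 'index_name';    -- placeholder index name

-- Compare snapshots side by side.
SELECT recorded_at, relname, relpages, reltuples,
       pg_size_pretty(size_bytes) AS size
FROM index_size_history
ORDER BY relname, recorded_at;
```

If the size keeps growing between snapshots while reltuples stays
roughly flat, that would point to bloat.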

Regards.

P.S. The other parameters don't seem to me to be the cause of the
problem.

> On 5 Jan 2017, at 17:51, Flávio Henrique <yoshimit@gmail.com> wrote:
>
> Hi all!
> Sorry for the delay (holidays).
>
> Well, the most expensive sequential scan was solved.
> I asked the db team to drop the index and recreate it, and guess what:
> now PostgreSQL is using it and the time dropped.
> (Thank you, @Gerardo Herzig!)
>
> I think there's still room for improvement, but the problem is not so
> crucial right now.
> I'll try to investigate every suggestion made here. Thank you all.
>
> @Daniel Blanch
> I'll make some tests with a materialized view. Thank you.
> > On systems side: ask them if they have not changed anything in
> > effective_cache_size and shared_buffers parameters, I presume they
> > haven't changed anything related to costs.
> In reply to your comment, I think they tuned the server:
> effective_cache_size = 196GB
> shared_buffers = 24GB (shouldn't this be higher?)
>
> @Kevin Grittner
> Sorry, I'm not sure whether autovacuum is aggressive enough, but
> here are my related settings:
> autovacuum                          |on
> autovacuum_analyze_scale_factor     |0.05
> autovacuum_analyze_threshold        |10
> autovacuum_freeze_max_age           |200000000
> autovacuum_max_workers              |3
> autovacuum_multixact_freeze_max_age |400000000
> autovacuum_naptime                  |15s
> autovacuum_vacuum_cost_delay        |10ms
> autovacuum_vacuum_cost_limit        |-1
> autovacuum_vacuum_scale_factor      |0.1
> autovacuum_vacuum_threshold         |10
> autovacuum_work_mem                 |-1
>
> @Merlin Moncure
> > Big gains (if any) are likely due to indexing strategy.
> > I do see some suspicious casting, for example:
> > Join Filter: ((four_charlie.delta_tango)::integer =
> > (six_quebec.golf_bravo)::integer)
> > Are you casting in the query or joining through dissimilar data types?
> No casts in the query. The joins are on the same data types.
>
> Thank you all for the answers. Happy 2017!
>
> Flávio Henrique
> --------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
> --------------------------------------------------------
>
> On Thu, Jan 5, 2017 at 12:40 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Dec 27, 2016 at 5:50 PM, Flávio Henrique <yoshimit@gmail.com> wrote:
> > Hi there, fellow experts!
> >
> > I need advice on a query that became slower after a 9.3 to 9.6
> > migration.
> >
> > First of all, I'm from the dev team.
> >
> > Before the migration, we (programmers) made some modifications to the
> > query, bringing its average time from 8s to 2-3s.
> >
> > As this query is the most executed on our system (it builds the user
> > panel to work), every bit that we can squeeze from it will be nice.
> >
> > Now, after the server migration to 9.6, we're experiencing bad times
> > with this query again.
> >
> > Unfortunately, I don't have the old query plan (9.3 version) to show
> > you, but in the current version (9.6) I can see some buffers written,
> > which tells me that something is wrong.
> >
> > Our server has 250GB of memory available, but the database team says
> > that they can't do anything to make this query better. I'm not sure,
> > as some buffers are written to disk.
> >
> > Any tip/help will be much appreciated (even from the query side).
> >
> > Thank you!
> >
> > The query plan: https://explain.depesz.com/s/5KMn
> >
> > Note: I tried to add an index on the kilo_victor table already, but
> > PostgreSQL still thinks that it is better to do a seq scan.
>
> Hard to provide more without the query or the 'old' plan. Here are
> some things you can try:
> *) Set effective_io_concurrency high. You have some heap scanning
> going on and this can sometimes help (but it should be marginal).
> *) See if you can get any juice out of parallel query.
> *) Try playing with enable_nestloop and enable_seqscan. These are
> Hail Mary passes but worth a shot.
>
> Run the query back to back with the same arguments in the same
> database session. Does performance improve?
>
> Big gains (if any) are likely due to indexing strategy.
> I do see some suspicious casting, for example:
>
> Join Filter: ((four_charlie.delta_tango)::integer =
> (six_quebec.golf_bravo)::integer)
>
> Are you casting in the query or joining through dissimilar data types?
> I suspect your database team might be incorrect.
>
> merlin
>


--Apple-Mail=_2F23939A-23CF-4AA1-BD68-583E552BE862--