MIME-Version: 1.0
References: <431223779.107307.1721971823963.ref@mail.yahoo.com>
 <431223779.107307.1721971823963@mail.yahoo.com> <CA+bJJbyvo2BW1=cgX_g3QgcZQDs=haBOwC8PShoE6Gh+LUGEaQ@mail.gmail.com>
In-Reply-To: <CA+bJJbyvo2BW1=cgX_g3QgcZQDs=haBOwC8PShoE6Gh+LUGEaQ@mail.gmail.com>
From: Fatih Sazan <fatihsazan01@gmail.com>
Date: Fri, 26 Jul 2024 12:40:53 +0300
Message-ID: <CABFFKA+KNfqS+K8Le_hK1uqrNWwS5ApR=YOQniZcdj_nH1WKBw@mail.gmail.com>
Subject: Re: Slow performance
To: "sivapostgres@yahoo.com" <sivapostgres@yahoo.com>
Cc: Postgresql General Group <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="000000000000698798061e234f63"
Archived-At: <https://www.postgresql.org/message-id/CABFFKA%2BKNfqS%2BK8Le_hK1uqrNWwS5ApR%3DYOQniZcdj_nH1WKBw%40mail.gmail.com>
Precedence: bulk

--000000000000698798061e234f63
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi Siva,

The pg_dump taken from client_db does not seem to have transferred all of
the data to client_test.

Comparing the two query plans, the row counts differ enormously. For
example, the plan for client_test shows 1 row in the cl_level table, while
the plan for client_db shows around 300,000 rows scanned.

I would suggest checking the row counts of the restored tables with
count(*) and comparing them against the originals.
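
A minimal sketch of that check, assuming Python and any DB-API connection. The function names row_counts and diff_counts are illustrative and not from this thread, and the in-memory sqlite3 databases only stand in for client_db and client_test so that the snippet runs on its own. Against PostgreSQL you would pass connections from a driver such as psycopg2 instead.

```python
import sqlite3


def row_counts(conn, tables):
    """Return {table: count(*)} for each table name, using a DB-API connection."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Double-quote the identifier; names come from a trusted list, not user input.
        quoted = '"' + table.replace('"', '""') + '"'
        cur.execute(f"SELECT count(*) FROM {quoted}")
        counts[table] = cur.fetchone()[0]
    return counts


def diff_counts(source, target):
    """Return {table: (source_count, target_count)} for tables whose counts differ."""
    return {
        t: (source[t], target.get(t))
        for t in source
        if source[t] != target.get(t)
    }


def make_demo_db(n_rows):
    """Hypothetical stand-in database with a cl_level table holding n_rows rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cl_level (id INTEGER)")
    conn.executemany("INSERT INTO cl_level VALUES (?)", [(i,) for i in range(n_rows)])
    return conn


tables = ["cl_level"]
client_db = row_counts(make_demo_db(5), tables)    # stands in for the original
client_test = row_counts(make_demo_db(1), tables)  # stands in for the restored copy
mismatches = diff_counts(client_db, client_test)
print(mismatches)
```

Any table that shows up in mismatches was not restored completely, or has changed in the source since the dump was taken.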


Francisco Olarte <folarte@peoplecall.com> wrote on Fri, 26 Jul 2024 at 10:55:

> Hello:
>
> On Fri, 26 Jul 2024 at 07:31, sivapostgres@yahoo.com
> <sivapostgres@yahoo.com> wrote:
> ...
> > Took backup (pg_dump) of first database (client_db) and restored the
> database as second database (client_test).
> ...
> > The query when run against DB1 takes around 7 min 32 seconds.
> > The same query when run against DB2 takes around 124 msec.
> > Same computer, same PG cluster, same query.
> > Why it takes so much time when run against DB1 (client_db)?
>
> Can be bad luck, but the usual suspect would be different databases.
>
> I assume db1 is quiescent on the tests ( as it seems the production
> database, no heavy querying concurrent with your tests ).
>
> Bear in mind restoring leaves the database similar to what a vacuum
> full will do, so it can differ a lot from the original.
>
> > Already executed vacuum against client_db database.
>
> You have already pointed that out, but IIRC you have not told us
> whether you have ANALYZEd either of the databases. This is important:
> bad stats in either of them can make the planner choose a bad plan.
>
> Also, did you run VACUUM VERBOSE? Were your tables well packed? ( Bad
> vacuuming can lead to huge tables with a lot of free space. I doubt
> this is your case, but everything has to be checked; we only know
> what you write us. )
>
> And now, not being an expert in tracing explain I see this in plan-db1:
> "              Join Filter: (((b.registrationnumber)::text =
> (p.registrationnumber)::text) AND ((c.subjectcode)::text =
> (p.subjectcode)::text) AND (a.semester = p.semester))"
> "              Rows Removed by Join Filter: 13614738"
> "              ->  Index Scan using
> ""cl_student_semester_subject_IX3"" on cl_student_semester_subject p
> (cost=0.55..8.57 rows=1 width=60) (actual time=0.033..55.702
> rows=41764 loops=1)"
> "                    Index Cond: (((companycode)::text = '100'::text)
> AND ((examheaderfk)::text =
> 'BA80952CFF8F4E1C3F9F44B62ED9BF37'::text))"
>
> Not an explain expert, but if I read correctly, an index scan expecting
> 1 row retrieves 41764, which hints at bad statistics ( or skewed data
> distribution and bad luck ).
>
> The plans are similar, but in the fast query
> cl_student_semester_subject is accessed using a different index:
>
> "              ->  Index Scan using
> ""cl_student_semester_subject_IX1"" on cl_student_semester_subject p
> (cost=0.42..3.09 rows=1 width=60) (actual time=0.010..0.010 rows=1
> loops=326)"
> "                    Index Cond: (((companycode)::text = '100'::text)
> AND ((subjectcode)::text = (a.subjectcode)::text) AND
> ((registrationnumber)::text = (a.registrationnumber)::text) AND
> (semester = a.semester))"
>
> Which seems much more selective and retrieves just what it needs.
>
> I would start by analyzing ( and, if not too costly, reindexing ) that
> table.
>
> Francisco Olarte.
>
>
>
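
The estimate-versus-actual comparison Francisco makes on the quoted plan lines can be automated. Below is a rough sketch, assuming Python and EXPLAIN ANALYZE text output in the format shown in this thread. The regex, the function name misestimates, and the 10x threshold are illustrative choices, not anything provided by PostgreSQL. Mail quote markers ("> ") and the surrounding double quotes would need to be stripped before feeding a pasted plan to it.

```python
import re

# Matches the "(cost=... rows=N width=W) (actual time=... rows=M loops=L)" part
# of a plan node in EXPLAIN ANALYZE text output.
NODE_RE = re.compile(
    r"\(cost=\d+\.\d+\.\.\d+\.\d+ rows=(?P<est>\d+) width=\d+\) "
    r"\(actual time=\d+\.\d+\.\.\d+\.\d+ rows=(?P<actual>\d+(?:\.\d+)?) "
    r"loops=(?P<loops>\d+)\)"
)


def misestimates(plan_text, threshold=10.0):
    """Return (estimated, actual, ratio) for nodes whose row estimate is off
    by at least `threshold` times in either direction."""
    # Plans pasted into mail are often hard-wrapped, so collapse all whitespace.
    flat = " ".join(plan_text.split())
    found = []
    for m in NODE_RE.finditer(flat):
        # Clamp to 1 so nodes estimated at or returning 0 rows do not divide by zero.
        est = max(float(m["est"]), 1.0)
        actual = max(float(m["actual"]), 1.0)
        ratio = max(actual / est, est / actual)
        if ratio >= threshold:
            found.append((est, actual, ratio))
    return found


# The two nodes quoted in this thread, with mail quoting removed.
slow_node = """Index Scan using "cl_student_semester_subject_IX3" on cl_student_semester_subject p
(cost=0.55..8.57 rows=1 width=60) (actual time=0.033..55.702
rows=41764 loops=1)"""
fast_node = """Index Scan using "cl_student_semester_subject_IX1" on cl_student_semester_subject p
(cost=0.42..3.09 rows=1 width=60) (actual time=0.010..0.010 rows=1
loops=326)"""

print(misestimates(slow_node))
print(misestimates(fast_node))
```

Both the estimated and the actual row counts in EXPLAIN ANALYZE are per loop, so they can be compared directly without multiplying by loops. Only the IX3 node from the slow plan is flagged, which is consistent with stale statistics on cl_student_semester_subject in client_db.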

--000000000000698798061e234f63--