MIME-Version: 1.0
References: <CAG-eXHJ+KbQ8_k-jKSGZU9V6HkLKU2Jqz7nYMYGhHuC-Zqm7qQ@mail.gmail.com>
 <CAGsyd8WqPEgoAkNO0Q7rpQpOWOZ-Z6wCM7xh5d6nXCxLH_GM_A@mail.gmail.com>
 <CAFL4M8EmboE4wXBHe2EMFcShxUAxXgWFa4TT-iVD2hJcHumetg@mail.gmail.com> <CAGsyd8X7U07UK8hjapwYBfbtK0KnMSxLtH6BFaxe1_i2=BR-+A@mail.gmail.com>
In-Reply-To: <CAGsyd8X7U07UK8hjapwYBfbtK0KnMSxLtH6BFaxe1_i2=BR-+A@mail.gmail.com>
From: ravi k <ravisql09@gmail.com>
Date: Sat, 9 Nov 2024 09:15:57 +0530
Message-ID: <CAFL4M8FuS1ivNARaNUjoSgdjec+KH0DLXSqVa11uy1Nscup2+w@mail.gmail.com>
Subject: Re: Performance Issue with Hash Partition Query Execution in
 PostgreSQL 16
To: David Mullineux <dmullx@gmail.com>
Cc: Ramakrishna m <ram.pgdb@gmail.com>, pgsql-general <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000005d80fc062672b5b6"
Archived-At: <https://www.postgresql.org/message-id/CAFL4M8FuS1ivNARaNUjoSgdjec%2BKH0DLXSqVa11uy1Nscup2%2Bw%40mail.gmail.com>
Precedence: bulk

--0000000000005d80fc062672b5b6
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Sorry, that was a typo. The bind variable is a bigint.

Thanks

On Fri, 8 Nov, 2024, 7:09 pm David Mullineux, <dmullx@gmail.com> wrote:

> Just spotted a potential problem. The indexed column is a bigint. Are
> you, in your prepared statement, passing a string or a bigint?
> I notice your plan is doing an implicit type conversion when you run it
> manually.
> Sometimes the wrong type will prevent the index from being used.
>
> On Fri, 8 Nov 2024, 03:07 ravi k, <ravisql09@gmail.com> wrote:
>
>> Hi,
>>
>> Thanks for the suggestions.
>>
>> A few more observations:
>>
>> 1) No sequential scans show up in pg_stat_user_tables (assuming the
>> stats are accurate in Postgres 16). If parameter sniffing were
>> happening, a sequential scan would be the more likely outcome, right?
>>
>> 2) No blocking or I/O issues during that time.
>>
>> 3) Even with the LIMIT clause, if the query touched all partitions it
>> should still have completed in milliseconds, as this is just one record.
>>
>> 4) We cannot enable auto_explain in prod, as it is expensive and with
>> our high TPS we may face latency issues. In the lower environments the
>> issue cannot be reproduced (it happens in roughly one case in a
>> million).
>>
>> This is a puzzle to us. If anyone has experienced something similar,
>> please share your experience.
>>
>> Regards,
>> Ravi
>>
>> On Thu, 7 Nov, 2024, 3:41 am David Mullineux, <dmullx@gmail.com> wrote:
>>
>>> It might be worth eliminating the use of cached plans here. Is your app
>>> using prepared statements at all?
>>> The point is that once a prepared query has been executed 5 times, the
>>> planner may switch to a generic plan and keep reusing it. This is a
>>> good trade-off, as it avoids costly planning time for repetitive
>>> queries. But if you are querying manually, a custom plan will be
>>> generated anew each time.
>>> A quick ANALYZE of the table should refresh the stats and invalidate
>>> any cached plans.
>>> This may not be your problem; it is just worth eliminating from the
>>> list of potential causes.
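
As an aside on the cached-plan point above, here is a minimal sketch in
Python of the custom-versus-generic plan heuristic. It is a simplified
model, not PostgreSQL's actual code: the class and field names are
hypothetical, the costs are made up, and the real planner also adds an
estimated planning cost to each custom plan, which is left out here. The
mode field mirrors the plan_cache_mode setting.

```python
from dataclasses import dataclass

# Simplified model of how a prepared statement picks between a custom
# plan (built for the actual parameter values) and a generic plan
# (parameter-independent, reused across executions).
CUSTOM_PLAN_TRIES = 5  # the first five executions always plan afresh


@dataclass
class PlanCache:
    generic_cost: float  # estimated cost of the generic plan
    mode: str = "auto"   # auto, force_custom_plan or force_generic_plan
    num_custom_plans: int = 0
    total_custom_cost: float = 0.0

    def choose(self, custom_cost: float) -> str:
        """Return "custom" or "generic" for the next execution."""
        if self.mode == "force_generic_plan":
            return "generic"
        # Short-circuiting keeps the division safe: it is only reached
        # once at least CUSTOM_PLAN_TRIES custom plans were recorded.
        use_custom = (
            self.mode == "force_custom_plan"
            or self.num_custom_plans < CUSTOM_PLAN_TRIES
            or self.generic_cost
            >= self.total_custom_cost / self.num_custom_plans
        )
        if use_custom:
            self.num_custom_plans += 1
            self.total_custom_cost += custom_cost
            return "custom"
        return "generic"


cache = PlanCache(generic_cost=8.0)
print([cache.choose(custom_cost=10.0) for _ in range(7)])
```

With a generic cost of 8 against custom plans costing 10, the first five
executions use custom plans and the later ones switch to the generic
plan. Setting plan_cache_mode to force_custom_plan for the affected
session is one way to rule this behaviour out.
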
>>>
>>> On Wed, 6 Nov 2024, 17:14 Ramakrishna m, <ram.pgdb@gmail.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> One of our queries, which retrieves a single record from a table with
>>>> 16 hash partitions, is taking more than 10 seconds to execute. In
>>>> contrast, when we run the same query manually, it completes within
>>>> milliseconds. This issue is exhausting the application pools. Are
>>>> there any known bugs in PostgreSQL 16 hash partitioning? Please find
>>>> the attached log, table, and execution plan.
>>>>
>>>> Size of each partition : 300GB
>>>> Index Size : 12GB
>>>>
>>>> Postgres Version : 16.x
>>>> Shared Buffers : 75 GB
>>>> Effective_cache_size : 175 GB
>>>> Work_mem : 4MB
>>>> Max_connections : 3000
>>>>
>>>> OS  : Ubuntu 22.04
>>>> Ram : 384 GB
>>>> CPU : 64
>>>>
>>>> Please let us know if you need any further information or if there are
>>>> additional details required.
>>>>
>>>>
>>>> Regards,
>>>> Ram.
>>>>
>>>

--0000000000005d80fc062672b5b6--