Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t9kBO-00BymS-8l for pgsql-general@arkaria.postgresql.org; Sat, 09 Nov 2024 12:03:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t9kBL-000tM7-Na for pgsql-general@arkaria.postgresql.org; Sat, 09 Nov 2024 12:03:12 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t9kBL-000tLa-2o for pgsql-general@lists.postgresql.org; Sat, 09 Nov 2024 12:03:11 +0000 Received: from mail-lj1-x22e.google.com ([2a00:1450:4864:20::22e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t9kBI-000xtw-Is for pgsql-general@lists.postgresql.org; Sat, 09 Nov 2024 12:03:10 +0000 Received: by mail-lj1-x22e.google.com with SMTP id 38308e7fff4ca-2fb5fa911aaso38524621fa.2 for ; Sat, 09 Nov 2024 04:03:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731153787; x=1731758587; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=vA2zR1zBRgZDNkOclg4lI2bQtGY6fsUSpeNRIyfZ+AY=; b=gE0L+lrtVCtizggIng4nNt+pj7Ba49OC6BwfOIv7cTDPAqwFqXVZc7BihQkhAKkPP/ UWOu6e9gGNCOIqxnFSbIwF/j49SHL8Q40ilKbZG1Auxqbtu/zOOWfKKD9smg+qapdgtV +CdJpJ1dvaJfySLBowEMe09Uph1WnK31gPO2822CtP1yoKXpZ+U48u1Ja6ANB/dcHkhy Nq7tdY3tgB94SvZl0IiNPU/c6XKqQ1FqDytU44v8jpqJB8jBmwxqm0Pns+6QOaD9x0fK fpNyx0IKq1A+polpUSvY9JEVpK1XnTYPd5JGhQobmdXyh8+L4Jk8/KeXnQJasl0o1teI Rqmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731153787; x=1731758587; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=vA2zR1zBRgZDNkOclg4lI2bQtGY6fsUSpeNRIyfZ+AY=; b=l88ruKPaw/pq6qpuz7xMpxIZhZeIIGVPOl3fo3VtWDse0G8OVsd2zv5yZx0yk9rdl8 BaDqSSvcKoKn5JliFwR14/rV10bY5y91+SBdzTXOiuMDwqFILx22dknHxeWZqc9TEpn1 CNNlVRdBDxzEQ6kltF+m5KN1V/cHHln8ulSa3IXaCY+s7R1Z0+d07PiKICBsXStT1h72 ewcoMfrLTAJyJ6DEK1TOR53N2y4NQAMozOF1QC1vvvLB4KAd4b8ufge7kyyCMGaW35CV IVKP/GdOeoeAfkD2Q4Jns4smjO0QFhhM2T+Gf6XqcOShJO2mXP9matWL37yOpKMbr2xQ vMrw== X-Forwarded-Encrypted: i=1; AJvYcCXTbGtyDlWT/iCBnyArb6qrWYisbnwI4YlJm6IYh7yXaFIwKpYk+bFNfU6z8BcTqF2oVz/xp6e9UCzKn0AB@lists.postgresql.org X-Gm-Message-State: AOJu0YynnfW7Kr54OfO8ASSAlksOPLdeBJDvuXy7z2JJYAg2KwTeDQ45 ehZcw+RZZbwyV9uplb1yhhTKfjCVHbttpeI2DhtrK6kPY0SeNq4Ypjl3bXJQbXzaVx1+TFYvXLO qVLB11HYoTEP7IZ748KvEm3nx3ZY= X-Google-Smtp-Source: AGHT+IHC3fBMFFI/twrs+gDNi2ILiOL6cCt54g41x8BSW9t8HETmYrhtlk+3O28nnyE74mphH9E0bkpcQBJQ3/hlN6k= X-Received: by 2002:a2e:bc86:0:b0:2fa:d604:e519 with SMTP id 38308e7fff4ca-2ff20187e23mr58061661fa.11.1731153786481; Sat, 09 Nov 2024 04:03:06 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: ravi k Date: Sat, 9 Nov 2024 17:32:54 +0530 Message-ID: Subject: Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16 To: David Mullineux Cc: Ramakrishna m , pgsql-general Content-Type: multipart/alternative; boundary="0000000000007eb336062679a677" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000007eb336062679a677 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks for the advice! I am planing to set session level! but before that one more observations noticed i.e One more table has same issue, which is having similar like hash partitions. And I scheduled manual analyze for all parent hash tables(thus all stats will update together). After this change I didn't noticed the issue, not sure does this addressed issue or not, just monitoring if this not works will set custom plan in session level. I have seen in SQL server parameter sniffing regularly but in postgres I never experienced. I am still wondering does this sniffing or not as from stats I didn't notice any sequence scan. Best, On Sat, 9 Nov, 2024, 3:40=E2=80=AFpm David Mullineux, wr= ote: > Thanks for correction. At this point I would be trying to modify > plan_cache_mode > for the session which uses the bond variable. alter it so that > plan_cache_mode=3Dforce_custom_plan > One hypothesis is that, a bad plan got cached for that SQL pattern. > Obviously, when you run it *manually* you are always getting a *custom* > plan as it's not a prepared statement. > > > > > On Sat, 9 Nov 2024, 03:46 ravi k, wrote: > >> Sorry, it was typo. Bind variable is bigint only. >> >> Thanks >> >> On Fri, 8 Nov, 2024, 7:09=E2=80=AFpm David Mullineux, = wrote: >> >>> Just spotted a potential problem. The indexed column is a bigint. Are >>> you, in your prepared statement passing a string or a big int ? >>> I notice your plan is doing an implicit type conversion when you run it >>> manually. >>> Sometimes the wrong type will make it not use the index. >>> >>> On Fri, 8 Nov 2024, 03:07 ravi k, wrote: >>> >>>> Hi , >>>> >>>> Thanks for the suggestions. >>>> >>>> Two more observations: >>>> >>>> 1) no sequence scan noticed from pg_stat_user_tables ( hope stats are >>>> accurate in postgres 16) if parameter sniffing happens the possibility= of >>>> going to sequence scan is more right. >>>> >>>> 2) no blockings or IO issue during the time. >>>> >>>> 3) even with limit clause if touch all partitions also it could have >>>> been completed in milliseconds as this is just one record. >>>> >>>> 4) auto_explain in prod we cannot enable as this is expensive and with >>>> high TPS we may face latency issues and lower environment this issue c= annot >>>> be reproduced,( this is happening out of Million one case) >>>> >>>> This looks puzzle to us, just in case anyone experianced pls share you= r >>>> experience. >>>> >>>> Regards, >>>> Ravi >>>> >>>> On Thu, 7 Nov, 2024, 3:41=E2=80=AFam David Mullineux, wrote: >>>> >>>>> It might be worth eliminating the use of cached plans here. Is your >>>>> app using prepared statements at all? >>>>> Point is that if the optimizer sees the same prepared query , 5 times= , >>>>> the it locks the plan that it found at that time. This is a good tra= de off >>>>> as it avoids costly planning-time for repetitive queries. But if you = are >>>>> manually querying, the a custom plan will be generated anew. >>>>> A quick analyze of the table should reset the stats and invalidate an= y >>>>> cached plans. >>>>> This may not be your problem just worth eliminating it from the list >>>>> of potential causes. >>>>> >>>>> On Wed, 6 Nov 2024, 17:14 Ramakrishna m, wrote: >>>>> >>>>>> Hi Team, >>>>>> >>>>>> One of the queries, which retrieves a single record from a table wit= h >>>>>> 16 hash partitions, is taking more than 10 seconds to execute. In co= ntrast, >>>>>> when we run the same query manually, it completes within millisecond= s. This >>>>>> issue is causing exhaustion of the application pools. Do we have any= bugs >>>>>> in postgrs16 hash partitions? Please find the attached log, table, a= nd >>>>>> execution plan. >>>>>> >>>>>> size of the each partitions : 300GB >>>>>> Index Size : 12GB >>>>>> >>>>>> Postgres Version : 16.x >>>>>> Shared Buffers : 75 GB >>>>>> Effective_cache : 175 GB >>>>>> Work _mem : 4MB >>>>>> Max_connections : 3000 >>>>>> >>>>>> OS : Ubuntu 22.04 >>>>>> Ram : 384 GB >>>>>> CPU : 64 >>>>>> >>>>>> Please let us know if you need any further information or if there >>>>>> are additional details required. >>>>>> >>>>>> >>>>>> Regards, >>>>>> Ram. >>>>>> >>>>> --0000000000007eb336062679a677 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for the advice!

I am planing to set session level!

but before that one more observations noticed i.e One= more table has same issue, which is having similar like hash partitions.

And I scheduled manual an= alyze for all parent hash tables(thus all stats will update together).

After this change I didn'= ;t noticed the issue, not sure does this addressed issue or not, just monit= oring if this not works will set custom plan in session level.

I have seen in SQL server=C2=A0 para= meter sniffing regularly but in postgres I never=C2=A0 experienced. I am st= ill wondering does this sniffing or not as from stats I didn't notice a= ny sequence scan.

Best,<= /div>







On Sat, 9 Nov, 2024, 3:40=E2=80=AFpm David Mullineux= , <dmullx@gmail.com> wrote:
Thanks for correct= ion. At this point I would be trying to modify=C2=A0
plan_= cache_mode
for the session which uses the bond varia= ble. alter it so that plan_cache_mode=3Dforce_custom_plan
One hypothesis is that,= a bad plan got cached for that SQL pattern. Obviously, when you run it = manually you are always getting a custom plan as it's not a = prepared statement.=C2=A0=C2=A0




On Sat, 9 Nov 2024, 03:46 ravi k, <ravisql09@gmail.com>= wrote:
Sorry, it= was typo. Bind variable is bigint only.

Thanks=C2=A0

On Fri, 8 Nov, 2024, 7:09=E2=80=AFpm David Mu= llineux, <dmullx@gmail.com> wrote:
Just spotted a potential problem. The = indexed column is a bigint. Are you, in your prepared statement passing a s= tring or a big int ?
I notice your plan is doing an implic= it type conversion when you run it manually.
Sometim= es the wrong type will make it not use the index.

On Fri, 8 Nov 2024, = 03:07 ravi k, <ravisql09@gmail.com> wrote:=
Hi ,

Thanks for the suggestions.

Two more observations:

1) no sequence scan noticed from pg_s= tat_user_tables ( hope stats are accurate in postgres 16) if parameter snif= fing happens the possibility of going to=C2=A0 sequence scan is more right.=

2) no blockings or IO i= ssue during the time.

3)= even with limit clause if touch all partitions also it could have been com= pleted in milliseconds as this is just one record.
<= br>
4) auto_explain in prod we cannot enable as this= is expensive and with high TPS we may face latency issues and lower enviro= nment this issue cannot be reproduced,( this is happening out of Million on= e case)

This looks puzzl= e to us, just in case anyone experianced pls share your experience.

Regards,
Ravi

On Thu, 7 Nov, 2024, 3:41=E2=80=AFam David Mullineux, <dmullx@gmail.com> wrote:
It might be worth eliminating the u= se of cached plans here. Is your app using prepared statements at all?=C2= =A0=C2=A0
Point is that if the optimizer sees the same pre= pared query , 5 times, the=C2=A0 it locks the plan that it found at that ti= me. This is a good trade off as it avoids costly planning-time for repetiti= ve queries. But if you are manually querying, the=C2=A0 a custom plan will = be generated=C2=A0 anew.
A quick analyze of the tabl= e should reset the stats and invalidate any cached plans.
This may not be your problem=C2=A0 just worth eliminating it from the= list of potential causes.

On Wed, 6 Nov 2024, 17:14 Ramakrishna m, &l= t;ram.pgdb@gmail.com> wr= ote:
Hi Team,=

One of the queries, which retrieves a = single record from a table with 16 hash partitions, is taking more than 10 = seconds to execute. In contrast, when we run the same query manually, it co= mpletes within milliseconds. This issue is causing exhaustion of the applic= ation pools.=C2=A0Do we have any bugs in postgrs16 hash partitions? Please = find the attached log, table, and execution plan.=C2=A0

size of the each partitions : 300GB=C2=A0
Index Size= : 12GB

Postgres = Version : 16.x
Shared Buffers : 75 GB
Effective_cache :=C2=A0 175 GB<= br>Work _mem : 4MB
Max_connections : 3000

OS=C2=A0 :=C2= =A0Ubuntu 22.04
Ram : 384 GB
CPU : 64=

Please let us know if= you need any further information or if there are additional details requir= ed.=C2=A0=C2=A0


Regards,
Ram.
--0000000000007eb336062679a677--