Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t3MMW-00C7tp-Ss for pgsql-general@arkaria.postgresql.org; Tue, 22 Oct 2024 21:24:21 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t3MMT-003PlM-KR for pgsql-general@arkaria.postgresql.org; Tue, 22 Oct 2024 21:24:17 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t3MMT-003PlD-2x for pgsql-general@lists.postgresql.org; Tue, 22 Oct 2024 21:24:17 +0000 Received: from mail-oa1-x31.google.com ([2001:4860:4864:20::31]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t3MMQ-002S6O-Eg for pgsql-general@lists.postgresql.org; Tue, 22 Oct 2024 21:24:16 +0000 Received: by mail-oa1-x31.google.com with SMTP id 586e51a60fabf-288b392b8daso2305116fac.2 for ; Tue, 22 Oct 2024 14:24:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1729632252; x=1730237052; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=e4/yYttNBjvCJEjnjF7CfdfP9Qxs62enhKflj2FN0WY=; b=iKLcL66Q/e+Iuk8vz8pPwiHe//x/x2e2rwp+k6FWpf3UUPrYj1XiKWb02ecSatdnF6 i7ZZBTcou7QqVhpBNYyWy6HKRq5SBckN6ndXy3h+YbY8nPmvFGIM5OzrnVsXUg9dX24z e31J8+QEufP/FMXmprH2a7UdQI4XKP9O/h/wmWu02DvD6tuPh8QyMrdsbWkzq+hKZO8s CvfwRtREaX0WxkebICBWcd1TbdUkiIN3RsQL1x/97LN+7BKEqKeRffnD0Gd5GbdxWguM 7EVAh/h5o6c457hQh+Xok1DG1Qv4niTZYUur+YRiWLfxtV+o3/6wSifOnGkgYToe8EFL Soyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729632252; x=1730237052; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=e4/yYttNBjvCJEjnjF7CfdfP9Qxs62enhKflj2FN0WY=; b=TCpO5tVSpeXx4Z4nI/JmEsltAsGYg8hrA32OckI3y9lm42vS3fHlKwwAbe9TMwMzay 0aSDYAHTfCtTlkJo7YwLXxvooK6T+3CcbNWkuT/TF5zF85ZXRhNlXnlh+0U7Hig9k+gr T9Aeu7SRfGlNb8egIhF5EyIwuZZbpyIb3Iwl3kZFHWmnczAwNxSueQSbwgBgjGGcOG6F SRsFaWyEWGaCvNK43A16qb61fcnVKHB4KXD7Y+dB4VwTjZoyQJeor4R3mSyObP0AH4dS mxFL/zWI5GhQiQd+7Yj/HMUwAmBY/ntoi44FWSWbWjzYI/js5CM9Xpt7MjFvqjKWHcn4 UYAw== X-Gm-Message-State: AOJu0YxEYm6KML6bQw3Cl4ZoenbV+1Icb44icABT3vujCZm/+msM3NkT PHMoZNpDaLBDame7AJbS2G9cvKNO38zQEU/CWiOOYXJScsGRTakWMb9/3462AP/UnFRkGgC1H5E oyreqCNR5V/VBEV68AH3BjFEWbCCcI4lH X-Google-Smtp-Source: AGHT+IHyK85EC0fGK2iY6BnQXzOFWqreHqKv3jA8Ym87/LCB5XpvQJBEdRNDwiOBWb/BUYG650XbaAbJ6bSbtWQK7Wc= X-Received: by 2002:a05:6870:82a0:b0:277:a43a:dac2 with SMTP id 586e51a60fabf-28ccb7bacb3mr682189fac.17.1729632252499; Tue, 22 Oct 2024 14:24:12 -0700 (PDT) MIME-Version: 1.0 References: <20241016213610.wbfqj3r7gg5ba6eh@hjp.at> In-Reply-To: From: yudhi s Date: Wed, 23 Oct 2024 02:53:59 +0530 Message-ID: Subject: Re: Query performance issue To: Greg Sabino Mullane , jross@openvistas.net Cc: pgsql-general@lists.postgresql.org, hjp-pgsql@hjp.at Content-Type: multipart/alternative; boundary="00000000000000a0ea0625176428" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000000a0ea0625176428 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Oct 23, 2024 at 2:06=E2=80=AFAM yudhi s wrote: > > > On Wed, Oct 23, 2024 at 12:32=E2=80=AFAM Greg Sabino Mullane > wrote: > >> To be frank, there is so much wrong with this query that it is hard to >> know where to start. But a few top items: >> >> * Make sure all of the tables involved have been analyzed. You might wan= t >> to bump default_statistics_target up and see if that helps. >> >> * As mentioned already, increase work_mem, as you have things spilling t= o >> disk (e.g. external merge Disk: 36280kB) >> >> * Don't use the "FROM table1, table2, table3" syntax but use "FROM table= 1 >> JOIN table2 ON (...) JOIN table3 ON (...) >> >> * Try not to use subselects. Things like WHERE x IN (SELECT ...) are >> expensive and hard to optimize. >> >> * You have useless GROUP BY clauses in there. Remove to simplify the que= ry >> >> * There is no LIMIT. Does the client really need all 135,214 rows? >> >> >> > I tried running the query by removing both the "group by" from the inner > subqueries (I think the initial thought was that they will give distinct > records to the outer query and will thus help), and added limit 500 at th= e > last and also set the work_mem to 2GB for that session before running the > query. But seeing the response increased to ~5 seconds (from ~3.1 seconds > earlier). Below I have updated the execution plan for the same at the las= t > section. > > https://gist.github.com/databasetech0073/746353a9e76d5e29b2fc6abdc80cdef8 > > Again , not able to clearly understand the third point you said below. Ca= n > you please clarify a bit more. Do you mean we should write it as exists > /not exists rather IN and NOT IN and that will improve the performance? I > hope the third point doesn't matter much as we have all equijoin used her= e. > Correct me if I'm wrong. > > *"Try not to use subselects. Things like WHERE x IN (SELECT ...) are > expensive and hard to optimize*." > > Additionally in the plan which mysql makes and showing the highest > response time, is it suffering because of differences of the speed of the > underlying IO/storage or is it just because of the optimization features > which are available in postgres and not there in mysql ? Trying to > understand if it can be identified from the execution plan itself. > Additionally, I see below differences, when I compare two specific lines of the plan which fetches data using the same index in both posgres and mysql, so does it point that the speed in mysql is slower as compared to postgres. And thus is it possible that the underlying infrastructure is playing a role here in the mysql slowness too and that might be the key one here to first address for mysql OR its any optimization feature which helps postgres to give it an edge in performance here over mysql? In mysql plan:- -> Index lookup on EX_STS using EX_STS_INDEX (AID=3Db3.AID, RC_ID=3Db3.RC_I= D, RC_VNB=3Db3.RC_VNB) (cost=3D0.43 rows=3D2) (actual time=3D0.014..0.021 row= s=3D2 loops=3D70904) VS In postgres plan:- -> Index Scan using EX_STS_INDEX on RCE_STS EX_STS (cost=3D0.42..0.82 rows= =3D1 width=3D424) (actual time=3D0.006..0.007 rows=3D2 loops=3D70904) ************** In mysql plan -> Covering index lookup on mns using M_INF_AID_index (AID=3D'XXXXXXXXXXXXXXXXXXX') (cost=3D9187.54 rows=3D72748) (actual time=3D0.058..19.637 rows=3D35980 loops=3D1) -> Filter: (RNS.ASID =3D 'XXXXXXXXXXXXXXXXXXX') (cost=3D43.50 rows=3D42) (actual time=3D0.042..0.218 rows=3D97 loops=3D1) VS In postgres plan -> Bitmap Index Scan on M_INF_AID_index (cost=3D0.00..406.98 rows=3D36074 width=3D0) (actual time=3D0.790..0.790 rows=3D35980 loops=3D1) Index Cond: ((AID)::text =3D 'XXXXXXXXXXXXXXXXXXX'::text) --00000000000000a0ea0625176428 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Wed, Oct 23, 2024 at 2:06=E2=80=AF= AM yudhi s <learnerdataba= se99@gmail.com> wrote:


On Wed, Oct 23, 2024 at= 12:32=E2=80=AFAM Greg Sabino Mullane <htamfids@gmail.com> wrote:
To be frank, ther= e is so much wrong with this query that it is hard to know where to start. = But a few top items:

* Make sure all of the tables involved have bee= n analyzed. You might want to bump default_statistics_target up and see if = that helps.

* As mentioned already, increase work_mem, as you have t= hings spilling to disk (e.g. external merge Disk: 36280kB)

* Don'= ;t use the "FROM table1, table2, table3" syntax but use "FRO= M table1 JOIN table2 ON (...) JOIN table3 ON (...)

* Try not to use = subselects. Things like WHERE x IN (SELECT ...) are expensive and hard to o= ptimize.

* You have useless GROUP BY clauses in there. Remove to si= mplify the query

* There is no LIMIT. Does the client really need al= l 135,214 rows?



I tried runnin= g the query by removing both the "group by" from the inner subque= ries (I think the initial thought was that they will give distinct records = to the outer query and will thus help), and added limit 500 at the last and= also set the work_mem to 2GB for that session before running the query. Bu= t seeing the response increased to ~5 seconds (from ~3.1 seconds earlier). = Below I have updated the execution plan for the same at the last section.

https://gist.github.com/databasetech0073/746353a9e76d= 5e29b2fc6abdc80cdef8

Again , not able to clearly understand the = third point you said below. Can you please clarify a bit more. Do you mean = we should write it as exists /not exists rather IN and NOT IN and that will= improve the performance? I hope the third point doesn't=C2=A0matter mu= ch as we have all equijoin used here. Correct me if I'm wrong.

<= div>"Try not to use subselects. Things like WHERE x IN (SELECT ...)= are expensive and hard to optimize."=C2=A0

Additionally in the plan which mysql makes and showing=C2=A0the highest = response time, is it suffering because of differences of the speed of the u= nderlying IO/storage or is it just because of the optimization features whi= ch are available in postgres and not there in mysql ? Trying to understand = if it can be identified from the execution plan itself.

Additionally, I see below differences, when I com= pare two specific lines of the plan which fetches data using the same index= in both posgres and mysql, so does it point that the speed in mysql is slo= wer as compared to postgres. And thus is it possible that the underlying in= frastructure is playing a role here in the mysql slowness too and that migh= t be the key one here to first address for mysql=C2=A0
OR its any optimization feature which helps postgres to give it= an edge in performance here over mysql?

In= mysql plan:-
-> Index lookup on EX_STS using EX_STS_INDEX (AID=3Db3.= AID, RC_ID=3Db3.RC_ID, RC_VNB=3Db3.RC_VNB) =C2=A0(cost=3D0.43 rows=3D2) (ac= tual time=3D0.014..0.021 rows=3D2 loops=3D70904)
VS
In postgres plan:= -
-> Index Scan using EX_STS_INDEX on RCE_STS EX_STS =C2=A0(cost=3D0.= 42..0.82 rows=3D1 width=3D424) (actual time=3D0.006..0.007 rows=3D2 loops= =3D70904)


**************

In m= ysql plan
-> Covering index lookup on mns using M_INF_AID_index (AID= =3D'XXXXXXXXXXXXXXXXXXX') =C2=A0(cost=3D9187.54 rows=3D72748) (actu= al time=3D0.058..19.637 rows=3D35980 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 -> Filter: (RNS.ASID =3D 'XXX= XXXXXXXXXXXXXXXX') =C2=A0(cost=3D43.50 rows=3D42) (actual time=3D0.042.= .0.218 rows=3D97 loops=3D1)
VS
In postgres plan
-> =C2=A0Bitmap= Index Scan on M_INF_AID_index =C2=A0(cost=3D0.00..406.98 rows=3D36074 widt= h=3D0) (actual time=3D0.790..0.790 rows=3D35980 loops=3D1)
<= font face=3D"monospace">=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 Index Cond: ((AID)::text = =3D 'XXXXXXXXXXXXXXXXXXX'::text)=C2=A0
--00000000000000a0ea0625176428--