MIME-Version: 1.0
References: <CAEzWdqd0SPkZMYNaAbERdgczkfQqLmNV5JBMmF-F9s7KjxJ0gw@mail.gmail.com>
 <fde67103-e00e-460e-b2ec-eb539809e998@aklaver.com> <CAEzWdqd6LAHs+FiFeJLqDTS-QBLq6+foE1-mgBC9AXVpFmVnZg@mail.gmail.com>
 <vecavrvgzoxkks66nw2gvt3vot5lwbcm7f65iopgjbw72v2lc6@qd5leh3coj7g>
 <CAEzWdqcAbi0GYp_K64oZTeUeN3YN7-eFQ2m2fZDRvmnJx5Lb5w@mail.gmail.com>
 <CANzqJaDf4kKc89e_9YGZ+BorPbViYgPZomo1ssQO9utOHeStCg@mail.gmail.com>
 <CAEzWdqc-2O8mWGdeDhnzKrp7-kwC99sqJ+ArWUS38WuHUKP-UQ@mail.gmail.com>
 <CANzqJaBucLq65V9OH_Ruah7S=g+5s-L8yFkjELdAerZszzcOXA@mail.gmail.com>
 <CAEzWdqd3dcJoDDaG8t1nsgrdy7Tw-EvD1zMCXHy_uOgkLFAZdQ@mail.gmail.com>
 <CANzqJaDHtROOVSEB_i6Lo+wFki-vNuQ6wDTNGT-hCWRsMBpvZQ@mail.gmail.com>
 <CAEzWdqddHgqsZCachpQMgZmRRRuTQ9HxvdP6=Lhr+TYtPL=w-A@mail.gmail.com> <CANzqJaDxOsp=QNqa5bK4JXKOt7uHfxNpcTPZGWuVUMpGjNUB4Q@mail.gmail.com>
In-Reply-To: <CANzqJaDxOsp=QNqa5bK4JXKOt7uHfxNpcTPZGWuVUMpGjNUB4Q@mail.gmail.com>
From: yudhi s <learnerdatabase99@gmail.com>
Date: Tue, 3 Feb 2026 14:56:14 +0530
Message-ID: <CAEzWdqc3hdnZbHTLd1kXxR3FUdxGpM7=Eea9c8Hp7QsF+HKMtg@mail.gmail.com>
Subject: Re: Top -N Query performance issue and high CPU usage
To: Ron Johnson <ronljohnsonjr@gmail.com>
Cc: "pgsql-generallists.postgresql.org" <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000008b60bb0649e808e1"
Archived-At: <https://www.postgresql.org/message-id/CAEzWdqc3hdnZbHTLd1kXxR3FUdxGpM7%3DEea9c8Hp7QsF%2BHKMtg%40mail.gmail.com>
Precedence: bulk

--0000000000008b60bb0649e808e1
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, Feb 3, 2026 at 4:50=E2=80=AFAM Ron Johnson <ronljohnsonjr@gmail.com=
> wrote:

> On Mon, Feb 2, 2026 at 3:43=E2=80=AFPM yudhi s <learnerdatabase99@gmail.c=
om>
> wrote:
>
>>
>> On Tue, Feb 3, 2026 at 1:01=E2=80=AFAM Ron Johnson <ronljohnsonjr@gmail.=
com>
>> wrote:
>>
>>> On Mon, Feb 2, 2026 at 1:39=E2=80=AFPM yudhi s <learnerdatabase99@gmail=
.com>
>>> wrote:
>>>
>>>> On Mon, Feb 2, 2026 at 8:57=E2=80=AFPM Ron Johnson <ronljohnsonjr@gmai=
l.com>
>>>> wrote:
>>>>
>>>>>
>>>>>> My apologies if I misunderstand the plan, but as far as I can see
>>>>>> it is spending ~140ms (147ms - 6ms), i.e. almost all of the time,
>>>>>> in the nested loop join below. So my question is: is there any way
>>>>>> to reduce the resource consumption or response time further here?
>>>>>> I hope my understanding is correct.
>>>>>>
>>>>>> -> Nested Loop (cost=3D266.53..1548099.38 rows=3D411215 width=3D20) =
(actual
>>>>>> time=3D*6.009..147.695* rows=3D1049 loops=3D1)
>>>>>> Join Filter: ((df.ent_id)::numeric =3D m.ent_id)
>>>>>> Rows Removed by Join Filter: 513436
>>>>>> Buffers: shared hit=3D1939
>>>>>>
>>>>>
>>>>> I don't see m.ent_id in the actual query.  Did you only paste a
>>>>> portion of the query?
>>>>>
>>>>> Also, casting in a JOIN typically brutalizes the ability to use an
>>>>> index.
>>>>>
>>>>>
>>>>> Thank you.
>>>> Actually, I tried executing only the first two CTEs, where the query
>>>> was spending most of its time, and the alias has changed.
>>>>
>>>
>>> We need to see everything, not just what you think is relevant.
>>>
>>>
>>>> Also, I changed the real table names before posting here; I hope
>>>> that is fine.
>>>> However, I verified the data type of the ent_id column: in "ent" it
>>>> is "int8" and in "txn_tbl" it is "numeric(12)". Do you mean that
>>>> this difference in data type is causing the high response time
>>>> during the nested loop join? My understanding was that it would be
>>>> cast internally without additional burden. I also tried creating an
>>>> index on "(df.ent_id)::numeric", but it still results in the same
>>>> plan and response time.
>>>>
>>>
>>> If you'd shown the "\d" table definitions like Adrian asked two days
>>> ago, we'd know what indexes are on each table, and not have to beg you =
to
>>> dispense dribs and drabs of information.
>>>
>>>
>> I am unable to run "\d" from the DBeaver SQL worksheet. However, I
>> have fetched the DDL for the three tables and their selected columns
>> used in the smaller version of the query, along with its plan, which
>> I recently updated.
>>
>> https://gist.github.com/databasetech0073/e4290b085f8f974e315fb41bdc47a1f=
3
>>
>> https://gist.github.com/databasetech0073/344df46c328e02b98961fab0cd22149=
2
>>
>
> Lines 30-32 are where most of the time and effort are taken.
>
> I can't be certain, but changing APP_schema.ent.ent_id from NUMERIC to
> int8 (with a CHECK constraint to, well, constrain it to 12 digits, if
> really necessary) is something I'd test.
>
> --
>


Thank you so much.

After making the data types of the ent_id column the same in both tables,
the plan now looks as below. The casting function is now removed, so it
must be helping to reduce CPU consumption to some extent, but I still see
~100ms spent in this step. Is there anything we can do to reduce the
response time further here, or is this the best we can get?

  ->  Nested Loop  (cost=3D262.77..1342550.91 rows=3D579149 width=3D20) (*a=
ctual
time=3D6.406..107.946* rows=3D1049 loops=3D1)
              Join Filter: (*df.ent_id =3D m.ent_id*)
              Rows Removed by Join Filter: 514648
              Buffers: shared hit=3D1972
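
To put rough numbers on that step, here is a small Python sketch of the
arithmetic. It only reuses the figures printed in the Nested Loop node
above. The function names are made up for illustration, and subtracting
the startup time from the total time is my assumption about how to read
the timings. Node times also include their child nodes, so the per-pair
figure is an upper bound, not a measurement.

```python
def pairs_checked(rows_kept, rows_removed):
    # Each candidate pair either passes the join filter or is removed.
    return rows_kept + rows_removed


def kept_fraction(rows_kept, rows_removed):
    # Share of candidate pairs that survive the join filter.
    return rows_kept / pairs_checked(rows_kept, rows_removed)


def micros_per_pair(rows_kept, rows_removed, start_ms, end_ms):
    # Elapsed node time (assumed: total minus startup) per candidate
    # pair, converted from milliseconds to microseconds.
    return (
        1000.0 * (end_ms - start_ms)
        / pairs_checked(rows_kept, rows_removed)
    )


# Figures copied from the Nested Loop node in the plan above.
print(pairs_checked(1049, 514648))
print(round(100 * kept_fraction(1049, 514648), 2))
print(round(micros_per_pair(1049, 514648, 6.406, 107.946), 3))
```

With these figures, 515697 pairs are checked to keep 1049 rows, so only
about 0.2% of the pairs pass the filter, at roughly 0.2 microseconds per
pair.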


I also see that a cast is applied in some other steps of the plan, for
example in the filter below. Here txn_tbl_type_nm is defined as
varchar(25), yet it is still cast to text. Can we do anything to avoid
these forced casts, since they must consume CPU cycles?

    AND txn_tbl_dcsn.txn_tbl_txn_sts_tx NOT IN ('STATUS_A','STATUS_B')
    WHERE txn_tbl.txn_tbl_type_nm IN ('TYPE1','TYPE2','TYPE3')

  ->  Index Scan Backward using txn_tbl_due_dt_idx on txn_tbl df
 (cost=3D0.43..115879.87 rows=3D1419195 width=3D20) (actual time=3D0.019..2=
0.377
rows=3D43727 loops=3D1)
Filter: *((txn_tbl_type_nm)::text =3D ANY ('{TYPE1,TYPE2,TYPE3}'::text[])*)
Rows Removed by Filter: 17
Buffers: shared hit=3D1839

The plan is at the link below.

https://gist.github.com/databasetech0073/558377c1939a9291e7b72b1cbac7c9f9

Regards
Yudhi

--0000000000008b60bb0649e808e1--