MIME-Version: 1.0
References: <CAEzWdqfUQuKtpqGAwf86dwkjPq2Kkeyj6Pw31GXr92YC8M2Y5g@mail.gmail.com>
 <A57336D3-37A0-413F-8D74-B41607359DC6@americanefficient.com> <CAEzWdqc1eTH43ok0xuv-kTrWeEjVxXc2rcEEAcz5FeG2HBoFWw@mail.gmail.com>
In-Reply-To: <CAEzWdqc1eTH43ok0xuv-kTrWeEjVxXc2rcEEAcz5FeG2HBoFWw@mail.gmail.com>
From: Greg Sabino Mullane <htamfids@gmail.com>
Date: Tue, 1 Oct 2024 08:00:07 -0400
Message-ID: <CAKAnmmJSfw92MDa6TDxJ3e63AE=1gaZFJinFdHbkFQh=KrA7BQ@mail.gmail.com>
Subject: Re: Suggestion for memory parameters
To: yudhi s <learnerdatabase99@gmail.com>
Cc: Philip Semanchuk <philip@americanefficient.com>, 
	pgsql-general <pgsql-general@lists.postgresql.org>
Content-Type: multipart/alternative; boundary="0000000000003fce0d0623691237"
Archived-At: <https://www.postgresql.org/message-id/CAKAnmmJSfw92MDa6TDxJ3e63AE%3D1gaZFJinFdHbkFQh%3DKrA7BQ%40mail.gmail.com>
Precedence: bulk

--0000000000003fce0d0623691237
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, Oct 1, 2024 at 2:52 AM yudhi s <learnerdatabase99@gmail.com> wrote:

> When I execute the query with explain (analyze, buffers), I see the section
> below in the plan having "sort method" information in three places
> each showing ~75MB size, which if combined is coming <250MB. So, does that
> mean it's enough to set the work_mem as ~250MB for these queries before
> they start?
>

work_mem is set per action (each sort or hash step gets its own allowance),
so you usually don't need to combine them. However, these are parallel
workers, so you probably need to account for the case in which no workers
are available. In that case you DO want to combine the values, but only for
parallel workers that are all doing the same action.
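
To make that arithmetic concrete, here is a rough sketch in Python. It is
only an illustration: the node label "sort_1" and the assumption that all
three "Sort Method" lines belong to the same sort step are made up, and only
the ~75MB figure comes from the plan described above.

```python
from collections import defaultdict


def work_mem_estimate_mb(sort_reports):
    """Return (largest single sort, largest per-node combined total) in MB.

    sort_reports: iterable of (node_label, size_mb) pairs, one per
    "Sort Method" line in the plan. Lines reported by parallel workers
    for the same sort step share a node_label (hypothetical labels).
    """
    per_node = defaultdict(float)
    largest_single = 0.0
    for node_label, size_mb in sort_reports:
        per_node[node_label] += size_mb          # combine same-action workers
        largest_single = max(largest_single, size_mb)
    largest_combined = max(per_node.values(), default=0.0)
    return largest_single, largest_combined


# Three ~75MB sorts, assumed here to be the same action in three processes.
reports = [("sort_1", 75.0), ("sort_1", 75.0), ("sort_1", 75.0)]
single, combined = work_mem_estimate_mb(reports)
print(f"per action: {single:.0f}MB, combined if no workers: {combined:.0f}MB")
```

The first number is what matters when the workers are available; the second
is the fallback case where one process ends up doing all of the sorting.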


> But yes, somehow this query finishes in a few seconds when I execute it
> using explain (analyze, buffers), while if I run it without explain it
> runs for ~10 minutes+. My expectation was that (explain analyze)
> should actually execute the query fully. Is my understanding correct here,
> and are the disk spilling stats I am seeing accurate enough to go
> with?
>

Running explain analyze does indeed run the actual query, but it also
throws away the output. It looks like your limit is set to 300,000 rows
(why!??), which could account for some or all of the time taken: those rows
have to be passed back, and your client has to process them. But it's hard
to say if that's the whole reason for the difference without more data. It
might help to see the query, but as a rule of thumb, don't use SELECT * and
keep your LIMIT sane - only pull back the columns and rows your application
absolutely needs.
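
For a feel of how much data that can be, here is a back-of-the-envelope
sketch in Python. The 300,000 rows come from the limit above; both row widths
are hypothetical numbers picked for illustration (the "width=" value in the
EXPLAIN output gives the planner's estimated average row width in bytes and
could be substituted).

```python
def result_set_mib(row_count, avg_row_width_bytes):
    """Approximate size of the rows sent to the client, in MiB."""
    return row_count * avg_row_width_bytes / (1024 * 1024)


ROWS = 300_000            # the LIMIT mentioned above
WIDE_ROW_BYTES = 2_000    # hypothetical width for SELECT *
NARROW_ROW_BYTES = 100    # hypothetical width for only the needed columns

print(f"SELECT *      : ~{result_set_mib(ROWS, WIDE_ROW_BYTES):.0f} MiB")
print(f"needed columns: ~{result_set_mib(ROWS, NARROW_ROW_BYTES):.0f} MiB")
```

This ignores protocol overhead and client-side processing, so treat it as a
lower bound on the work that explain analyze skips.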

Cheers,
Greg

--0000000000003fce0d0623691237--