Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1su4bM-003uQ9-SL for pgsql-general@arkaria.postgresql.org; Fri, 27 Sep 2024 06:37:17 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1su4bL-004Biw-Rq for pgsql-general@arkaria.postgresql.org; Fri, 27 Sep 2024 06:37:15 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1su4bL-004Bil-E8 for pgsql-general@lists.postgresql.org; Fri, 27 Sep 2024 06:37:15 +0000 Received: from mail-ot1-x329.google.com ([2607:f8b0:4864:20::329]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1su4bH-001JPy-7E for pgsql-general@lists.postgresql.org; Fri, 27 Sep 2024 06:37:14 +0000 Received: by mail-ot1-x329.google.com with SMTP id 46e09a7af769-710ddb52139so930900a34.1 for ; Thu, 26 Sep 2024 23:37:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727419030; x=1728023830; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=J8a8PkbmvUvebz+23LnFxwPMNxzQ4ZTtRWQC4D+9NeM=; b=Ua6Stv0k96PFIq988j55weRnZFsEr90DSUQDWohzfBmpWiyY4Pwd3I02cXMEDQ32E6 6vyi1DJ0U9ovmsaMCjcMOGlqFvO97PSmJv4p9kn10oQiUZm0AJ+EfOhKQO5QsFIgVDH5 tN2s+ekBDQq5JarBlBOAurXqTNgCeJmaOkOxcyPbCgoOlrAidYAT2Lw1mHHD9rJrWiMG zZrfGIrKsx7I7UuGxoW0dcrUJhYZSow4yXFgUGjhzjDqbw7H1PHbE3lvJzFecU7jqSKc z3UhmtYo+jv4kPAY5NoXeCALs1wvBcbIJVucP6AgAW18gbnF8KkHGJFHZPBBTedNheA1 dvHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727419030; x=1728023830; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=J8a8PkbmvUvebz+23LnFxwPMNxzQ4ZTtRWQC4D+9NeM=; b=MgvL8WzGuM4nexINght+jLg2ihiY3pX6uj/Cw7jHLMAZIfW/TAsM5UbajBREL1MGkJ ELLXXCaWwlPwRaP5lj6b808LgZGfLg524g+uRRR2FzsQPbKHV6465Ztb81A7Rf9K7QfR RaegBwnQSG8Ly8yOxVm4YaR0EsyXFy39hgi2lt8pjLXLM9+SJnW24RKM/BLX2EONk0xU TkunAw59jyi7ZHGlHVAP5mbD7DKv1P5Ay9Boavd6QAAKpDc/hREPrGsRPKm18uo9WtIB DEd59UkyDOT5fbKuO6kjWQo9ksgPMBMBtZ84Ed9XEpTyvqUfbbaSoLDkQAfmefj21GGg nIJw== X-Gm-Message-State: AOJu0Yz9dAcnAj97y3gkh7X9vERedv8JbZjkvO5PY/m6LXir9fAk2l5c AjsoCtsRh4g5JxoO9cWjJYi30NKFkWFJGl/d/Twrt5bceicmhJdzQ+iM30ERypPOEED+osV1MMH 2h0GP+iIAsHvzfx7WtF3YgUqAGt8= X-Google-Smtp-Source: AGHT+IFqFSu+CZN1BCltdFAWvPf5VnqvkNm+/2fMwZKsL9HmrN0XVvnRe4J1/NLPE7a6LrF9hwCIguPK8Nah+mWAe1Q= X-Received: by 2002:a05:6870:8993:b0:270:c1e:41ad with SMTP id 586e51a60fabf-28710ba72a0mr1889134fac.35.1727419030475; Thu, 26 Sep 2024 23:37:10 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: yudhi s Date: Fri, 27 Sep 2024 12:06:58 +0530 Message-ID: Subject: Re: Suggestion for memory parameters To: veem v Cc: pgsql-general Content-Type: multipart/alternative; boundary="000000000000b09d9f0623141560" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000b09d9f0623141560 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Sep 27, 2024 at 9:11=E2=80=AFAM veem v wrote: > > On Thu, 26 Sept 2024 at 16:33, yudhi s > wrote: > >> Hello All, >> >> In a RDS postgres we are seeing some select queries when running and >> doing sorting on 50 million rows(as its having order by clause in it) , = the >> significant portion of wait event is showing as "IO:BufFileWrite" and it >> runs for ~20minutes+. >> >> Going through the document in the link below, it states we should monito= r >> the "FreeLocalStorage" metric and when monitoring that, I see it showing= up >> to ~535GB as the max limit and when these queries run this goes down til= l >> 100GB. Note- (it's a R7g8xl instance) >> >> >> https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/apg-waits.i= obuffile.html >> >> We were thinking of bumping up the work_mem to a higher value in databas= e >> level , which is currently having size 4MB default. But we will also hav= e >> ~100 sessions running at same time and majority were from other >> applications which execute other single row "insert" queries and I hope >> that will not need high "work_mem" . And setting it at database level wi= ll >> consume 100 times that set work_mem value. So how to handle this situati= on? >> Or >> Is it fine to let it use "FreeLocalStorage" unless it goes till zero? >> >> Also I am confused between the local storage (which is showing as 535GB) >> vs the memory/RAM which is 256GB for this instance class with ~128TB max >> storage space restriction, how are these storage different, (mainly the >> 535GB space which it's showing vs the 128TB storage space restriction)? >> Appreciate your guidance. >> >> select query looks something as below with no Joins but just single tabl= e >> fetch:- >> >> Select.... >> from >> where >> order by column1, column2 LIMIT $b1 OFFSET $B2 ; >> >> >> > My 2 cents > I think you should set the work_mem on specific session level , if your > sorting queries are only from specific handful of sessions, as because > setting it up at database level will eat up your most of RAM(which you sa= id > is 256GB) and you said 100+ sessions getting spawned at any point in time= . > Thank you. When I checked pg_stat_statements for this query , and divided the temp_blk_read+temp_blk_written with the "calls", it came as ~1million which means ~7GB. So does that mean ~7GB of work_mem should be allocated for this query? --000000000000b09d9f0623141560 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Fri, Sep 27, 2024 at 9:11=E2=80=AF= AM veem v <veema0000@gmail.com> wrote:

Hello All,

In a RDS postgres we are= seeing some select queries when running and doing sorting on 50 million ro= ws(as its having order by clause in it) , the significant portion of wait e= vent is showing as "IO:BufFileWrite" and it runs for ~20minutes+.= =C2=A0

Going through the document in the link below, it states we s= hould monitor the "FreeLocalStorage" metric and when monitoring t= hat, I see it showing up to ~535GB as the max limit and when these queries = run this goes down till 100GB. Note-=C2=A0 (it's a R7g8xl instance)

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/apg-waits.io= buffile.html

We were thinking of bumping up the work_mem to a hi= gher value in database level , which is currently having size 4MB default. = But we will also have ~100 sessions running at same time and majority were = from other applications which execute other single row "insert" q= ueries and I hope that will not need high "work_mem" . And settin= g it at database level will consume 100 times that set work_mem value. So h= ow to handle this situation?
=C2=A0Or
=C2=A0Is it fine to let it use = "FreeLocalStorage" unless it goes till zero?

Also I am con= fused between the local storage (which is showing as 535GB) vs the memory/R= AM which is 256GB for this instance class with ~128TB max storage space res= triction, how are these storage different, (mainly the 535GB space which it= 's showing vs the 128TB storage space restriction)?=C2=A0 Appreciate=C2= =A0your guidance.

select query looks something as below with no Join= s but just single table fetch:-

Select....
from <table_name&g= t;
where
order by column1, column2 LIMIT $b1 OFFSET $B2 ;


My 2 ce= nts=C2=A0
I think you should=C2=A0set the work_mem on specific se= ssion level , if your sorting queries are only from specific handful of ses= sions, as because setting=C2=A0it up at database level will eat up your mos= t of RAM(which you said is 256GB) and you said 100+ sessions getting spawne= d at any point in time.

<= br>
Thank you.
When I checked pg_stat_statements for th= is query , and divided the temp_blk_read+temp_blk_written with the "ca= lls", it came as ~1million which means ~7GB. So does that mean ~7GB of= work_mem should be allocated for this query?
--000000000000b09d9f0623141560--