Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1swo6q-0068sV-HY for pgsql-general@arkaria.postgresql.org; Fri, 04 Oct 2024 19:37:05 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1swo6p-004xxS-SO for pgsql-general@arkaria.postgresql.org; Fri, 04 Oct 2024 19:37:03 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1swo6p-004xte-C9 for pgsql-general@lists.postgresql.org; Fri, 04 Oct 2024 19:37:03 +0000 Received: from mail-ua1-x933.google.com ([2607:f8b0:4864:20::933]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1swo6m-002V3G-On for pgsql-general@lists.postgresql.org; Fri, 04 Oct 2024 19:37:02 +0000 Received: by mail-ua1-x933.google.com with SMTP id a1e0cc1a2514c-84ea66d1785so742847241.3 for ; Fri, 04 Oct 2024 12:37:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728070619; x=1728675419; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=8rm1/an/7CQqqF6irKom4j0+APO0mrtJO1kxlQdhm9g=; b=CmXpJU00VyYNBSsLfMi2z7+kkMdFOFGVUTm+dPpOoy61zvcJARHsjzfIZtxVZvEfQf rTO7qtsLKYJ0SrLDTzYsnvS7ttco/qhY2eRjhEr0w5+nB8k/tc2iQjJxXCfZzH+jWh0D IRAmZk0iPCL+RLDP1gUgb/gK4AWgWqOvWvNWCwDuk90A7Hneshjx7UhqOnKlxZ9HUf/M P9IS6U/CoAV19jChC79Ukxq6UQH2hYGwfKuR+IIHVy6DuA4D/wBe7smYjThkPJX4mDtz rz0GWVFshpD0rQ/TMHTDFYI0UKzLOoQ7ia81ZsTyuUWunLqE99KvBOE3Gllh9HWkvH7r DiEA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728070619; x=1728675419; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=8rm1/an/7CQqqF6irKom4j0+APO0mrtJO1kxlQdhm9g=; b=jvezs64PrqdWaqk9bc8Af7osGjxZKYJvuOfjsySCThP/71ThV+OJoP7DwuE4iuwtRI s12SYWaJnSmmDtBzW6uUXOk80r0p6NU5ziSCBcKk0X0wnxcDx0AxzlKE02Re4KJRdC+A AiQ6r7ZCz818cItmdF6G5pznveKK+plP7p4NP6WJ8H7qaa9C5V3VTOSLgMrNyADxRgvl vECCiXaLWYLCLX/umNXNjBwnIhoDt5PR58+hr99p7IPcVMFWbkqcPmmvOIBrH51B54sB 6UwBVEca78NQWhXIZc8VjoKRM3fiROYp3NuEk1MFnb5ONDbhXSDXURmwRsF24Srj1c7F 2Lfw== X-Forwarded-Encrypted: i=1; AJvYcCVgetXwCCwk27RY3C+Co/9jVBOCq8YACEQH2Sf6EXZnTcECG1toAaU3W8hncFbxXq6D5d/5uDZsG93zsY/B@lists.postgresql.org X-Gm-Message-State: AOJu0YzeavuiqZNdgWpDaQ9+oG06ZxiLapRF58qn1GKWxwqN9biHWtlC LVfNimg+Okj4XiS6pEnWF/SZu7n2Y59QKvoF6WS7h1cLPPPbD7eQADvDqsgV+OMwHbLB+qxNxzF iDzzxzBpFH7c0lzHqocI0+gU/u78= X-Google-Smtp-Source: AGHT+IEgO7MSxgT/HS+0pz3di+4CbmAYgRn8XyD19xLB5Ml654xntqMproLw0RhegoYycgvO78+FGtQyH+cOAizBUsc= X-Received: by 2002:a05:6102:5111:b0:498:ef8d:8368 with SMTP id ada2fe7eead31-4a405782f37mr3700371137.13.1728070618770; Fri, 04 Oct 2024 12:36:58 -0700 (PDT) MIME-Version: 1.0 References: <3133182a-35c4-4a81-be60-8ab227c0fbc0@aklaver.com> In-Reply-To: From: sud Date: Sat, 5 Oct 2024 01:06:47 +0530 Message-ID: Subject: Re: Question on session timeout To: Greg Sabino Mullane Cc: Adrian Klaver , pgsql-general Content-Type: multipart/alternative; boundary="00000000000060f4470623abcb90" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000060f4470623abcb90 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Oct 1, 2024 at 5:45=E2=80=AFPM Greg Sabino Mullane wrote: > On Tue, Oct 1, 2024 at 1:57=E2=80=AFAM sud wrote: > >> *Where are you getting the ~2000 count from?* >> Seeing this in the "performance insights" dashboard and also its matchin= g >> when I query the count of sessions from pg_stat_activity. >> > > So I'm guessing this is perhaps RDS or Aurora? Stating that up front can > be helpful. > > >> As you described, a long time open transaction with a session state as >> "idle" will be threatening as that will cause locking >> > > No, idle is fine, "idle in transaction" is bad. :) > > Is it correct to assume the session in pg_stat_activity with very old >> XACT_START are the one which are part of long running >> > > > > You need to look at the "state" column as your primary bit of information= . > Second most important is how long something has been in that state, which > you can find with now() - state_change. The best way to learn all of this > is to open a few concurrent sessions in psql and experiment. > > >> We have max_connections set as 5000. >> > > That's quite high. But if you never reach that high, it doesn't matter a > whole lot. > > "Database connection" touching ~2000 then coming down till 200. And we se= e >> uneven spikes in those, it seems to be matching with the pattern , when = we >> have some errors occurring during the insert queries which are submitted= by >> the Java application to insert the data into the tables. >> > > (What sort of errors?) 2000 is high. Clearly, you are not pooling > connections, or not pooling them well. If you are using plain Postgres, > look into setting up pgbouncer. If using something managed (e.g. RDS) loo= k > into their particular pooling solution. Or fix your application-level > pooling. > > Thanks Greg. It's a third party app and the application team confirmed they are using connection pooling at their side. But as you mentioned, the number of connections *"2000 is high"* . But , isn't it possible because they may be having a max connection pool size limit set as ~2000 which is why we see that many connections during peak window. So in that case is it advisable to reduce the number of Max connections, because we have a number of cores -32 for this instance. And yes it's RDS. The errors which we were seeing were related to the data bit not related to connections. --00000000000060f4470623abcb90 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Tue, Oct 1, 2024 at 5:45=E2=80=AFPM Gr= eg Sabino Mullane <htamfids@gmail.= com> wrote:
On Tue, Oct 1, 2024 at 1:57=E2=80=AFAM= sud <suds1434@g= mail.com> wrote:
Where are you getting the ~2000 =C2=A0count from?
Seeing this i= n the "performance insights" dashboard and also its matching when= I query the count of sessions from pg_stat_activity.

So I'm guessing this is perhaps RDS or Auror= a? Stating that up front can be helpful.
=C2=A0
As you described, a long time open transaction with a session state = as "idle" will be threatening as that will cause locking

No, idle is fine, "idle in transac= tion" is bad. :)=C2=A0

Is it co= rrect to assume the session in pg_stat_activity with very old XACT_START ar= e the one which are part of long running

<= /div>
<snip rest of question>

You need t= o look at the "state" column as your primary bit of information. = Second most important is how long something has been in that state, which y= ou can find with now() - state_change. The best way to learn all of this is= to open a few concurrent sessions in psql and experiment.
=C2=A0=
We have max_connections set as 5000.

That's quite high. But if you never = reach that high, it doesn't matter a whole lot.

"Database connection" touching ~2000 then coming= down till 200. And we see uneven spikes in those, it seems to be matching = with the pattern , when we have some errors occurring during the insert que= ries which are submitted by the Java application to insert the data into th= e tables.

(What sort of errors?= ) 2000 is high.=C2=A0Clearly, you are not pooling connections, or not pooli= ng them well. If you are using plain Postgres, look into setting up pgbounc= er. If using something managed (e.g. RDS) look into their particular poolin= g solution. Or fix your application-level pooling.


Thanks Greg.
It's a = third party=C2=A0app and the application team confirmed they are using conn= ection pooling at their side. But as you mentioned, the number of connectio= ns "2000 is high"=C2=A0. But , isn't it possible becau= se they may be having a max connection pool size limit set as ~2000 which i= s why we see that many connections during peak window. So in that case is i= t advisable to reduce the number of Max connections, because we have a numb= er of cores -32 for this instance.=C2=A0

And yes i= t's RDS. The errors which we were seeing were related to the data bit n= ot related to connections.
--00000000000060f4470623abcb90--