Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tAaoV-00GcBX-Vp for pgsql-admin@arkaria.postgresql.org; Mon, 11 Nov 2024 20:15:07 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tAaoT-000EB3-DZ for pgsql-admin@arkaria.postgresql.org; Mon, 11 Nov 2024 20:15:05 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tAaoT-000EAu-11 for pgsql-admin@lists.postgresql.org; Mon, 11 Nov 2024 20:15:05 +0000 Received: from mail-pj1-x1032.google.com ([2607:f8b0:4864:20::1032]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tAaoP-001Pu2-Ua for pgsql-admin@lists.postgresql.org; Mon, 11 Nov 2024 20:15:04 +0000 Received: by mail-pj1-x1032.google.com with SMTP id 98e67ed59e1d1-2e3fca72a41so3916862a91.1 for ; Mon, 11 Nov 2024 12:15:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731356101; x=1731960901; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=wiuEOd7JcsZhrXz26egajkGrkQtYcyi90SQRp97kBNg=; b=bM2SxrMJB8ioPoOc4OWe490ZEFHr+K3H++XvpOzI7Rq9SM15PE9QmwYF/M5HLTRDEi H5ZKGQ+OddLP/fIvOVwDCpNy9pDNphCzoMhjdvWjgyv6aLa0FXn3HSVpTyCAHrpuJguC lj9cpHhXP4Ix119MD4nifgXjavm/3KEFn3XT8xgAC1hyBUj0w3fqcb778JNMilgk/KKh sPHkmWqQLluhhyuZVqmYvN6DEFmSUXDIIcyOdsJAiA8cj5D06r0MlA0v+HVOQDyaaO1F O0WXube6cY04HM1ZBEMqZlUhIrZ3j+KiDdfL9/5IAb6mnAMmmatg5sMcUaB/08UP/8o8 OhAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731356101; x=1731960901; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=wiuEOd7JcsZhrXz26egajkGrkQtYcyi90SQRp97kBNg=; b=RXEPaO3YCkT6nz5CNK8RIIBXXXpX3DTyhoUi3NyvyVjeiKemtSOg41EVfMd6mQXT/7 OZHKaejRPZ6nxvNLA+KTI6JeAEObcaCmWg9Ed9RPjIBUr76VM7YQeiQMsaOch7g4j5NZ K/aL5Lra+hMBz3fVjqbgyGRG2ovgKXqHGYj1UD7mbFwRR38HCOEfyXbiJMvMQoMHVKTM 04yiOmdgQ48EZ6nGkZA11+XtgahUog/W7bSfVUnlZrdXX7MzhubbxorYuJ5bSbkiwPEM jdIWKOqDZbLCQdfbWatWFOLSW3Y5WGMOnZEHd4I3tUNVkQC9b8Ap7jZ5XwSy/a/ry9ab 0ktw== X-Forwarded-Encrypted: i=1; AJvYcCWT4JhAkMmaw413aJkgWyWT2WPc/TrG5nTbjA/UntqA2nYj/30QPz2CEAPxAeyG58K8RxG1HfZj0ls0Fg==@lists.postgresql.org X-Gm-Message-State: AOJu0YyQPQ3B1g12TSt4OeQq9KCLblXGg+VuZ0a7e44xnr2IzJkOJ7OK LeQVWQvifxEQYNEMdCsCIRTJB4MoiHIEDoO64GpU+IHurnKlMNkhlf03APtgsowcqvVuN1zQooG yoDz8vPJCiuMVoBlcZbH0hmBlwyo= X-Google-Smtp-Source: AGHT+IGjK1L+ibVap8erJ8hpxxNnhneTJfbxF45hv6cqm4VRHaeScFvrCUeZlQTy1NZ2FOZNz3C5kETkzFBlQ9e3pj0= X-Received: by 2002:a17:90b:1d01:b0:2e2:cd2f:b919 with SMTP id 98e67ed59e1d1-2e9b1741152mr19771313a91.28.1731356101142; Mon, 11 Nov 2024 12:15:01 -0800 (PST) MIME-Version: 1.0 References: <216310.1731300338@sss.pgh.pa.us> <025001db3454$0dbc19d0$29344d70$@msym.fr> In-Reply-To: <025001db3454$0dbc19d0$29344d70$@msym.fr> From: Edwin UY Date: Tue, 12 Nov 2024 09:14:24 +1300 Message-ID: Subject: Re: How to check if session is a hung thread/session To: msalais@msym.fr Cc: Tom Lane , "David G. Johnston" , pgsql-admin@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000006373170626a8c188" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000006373170626a8c188 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Michel, Thanks for the explanation. Didn't know about this default setting. It doesn't happen all the time. It is reported in the log every 2-3 hours and once or two at most during that time but it doesn't necessarily affect the whole application. This table is a Mule flow sort of table where there's 3 flows that check on this table for things to process and then continue on further. 2 of them goes fine and only one of them is failing intermittently. [25751]:DETAIL: Process holding the lock: 22297. Wait queue: 25751. [25751]:CONTEXT: while locking tuple (1,17) in relation "[table_name]" [25751]:STATEMENT: [select SQL statement] for update [25751]:LOG: process 25751 acquired ShareLock on transaction 114953443 after 4756.967 ms On Tue, Nov 12, 2024 at 5:09=E2=80=AFAM wrote: > Hi, > > > > The log you expose doesn=E2=80=99t mean a dead lock! It comes from settin= g > deadlock_timeout parameter. Its value is 1 second by default. When a > session waits for a lock more than this threshold then a test to see if w= e > are in a deadlock condition is triggered. It is simply that we are waitin= g > to obtain a lock for longer that the value of deadlock_timeout. Any way i= t > could be interpreted as a performance problem when we are frequently > waiting more than 1 sec for a lock=E2=80=A6 > > > > Regards > > > > *Michel SALAIS* > > *De :* Edwin UY > *Envoy=C3=A9 :* lundi 11 novembre 2024 07:45 > *=C3=80 :* Tom Lane > *Cc :* David G. Johnston ; > pgsql-admin@lists.postgresql.org > *Objet :* Re: How to check if session is a hung thread/session > > > > Thanks Tom, > > > > OK, I've decided to 'painfully' look at the PostgreSQL RDS logs and it is > showing something like below. > > There seems to be a locking/deadlock issue of some sort somewhere. > > I have checked the other days' log prior to the patching and these appear > to be a 'normal' occurrence for this database and it wasn't affecting the > application nonetheless. > > After the patching, it starts affecting the application. Not sure what > else I can check on the Aurora PostgreSQL RDS end. I may request them to > restart the app server. > > > > [25751]:LOG: process 25751 still waiting for ShareLock on transaction > 114953443 after 1000.054 ms > > [25751]:DETAIL: Process holding the lock: 22297. Wait queue: 25751. > > [25751]:CONTEXT: while locking tuple (1,17) in relation "[table_name]" > > [25751]:STATEMENT: [SQL_STATEMENT] for update > [25751]:LOG: process 25751 acquired ShareLock on transaction 114953443 > after 4756.967 ms > [25751]:CONTEXT: while locking tuple (1,17) in relation " [table_name] " > > > > > > On Mon, Nov 11, 2024 at 5:45=E2=80=AFPM Tom Lane wrot= e: > > Edwin UY writes: > > I thought it could be the backend has sent something back to the client > but > > it never received it and it just kept on doing the same at some > intervals. > > Your pg_stat_activity output shows the backend is idle, meaning it's > waiting for a client command. While the session has been around for > days, we can see the last client command was about 47 minutes ago, > from your "now() - pg_stat_activity.query_start AS duration" column. > I see no reason to think there is anything interesting here at all, > except for a client that is sitting doing nothing for long periods. > > regards, tom lane > > --0000000000006373170626a8c188 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Michel,

= Thanks for the explanation. Didn't know about this default setting.
It doesn't happen all the time. It is reported in the= log every 2-3 hours and once or two at most during that time but it doesn&= #39;t necessarily affect the whole application.

Th= is table is a Mule flow sort of table where there's 3 flows that check = on this table for things to process and then continue on further. 2 of them= goes fine and only one of them is failing intermittently.
[25751]:DETAI= L: =C2=A0Process holding the lock: 22297. Wait queue: 25751.
[25751]:CON= TEXT: =C2=A0while locking tuple (1,17) in relation "[table_name]"=
[25751]:STATEME= NT:=C2=A0 [select SQL statement] for update
[25751]:LOG: =C2=A0process 2= 5751 acquired ShareLock on transaction 114953443 after 4756.967 ms

On Tue, Nov 12, 2024 at 5:09=E2=80=AFAM <msalais@msym.fr> wrote:

Hi,

=C2=A0

The log you expose doesn=E2=80=99t mean a dea= d lock! It comes from setting deadlock_timeout parameter. Its value is 1 se= cond by default. When a session waits for a lock more than this threshold t= hen a test to see if we are in a deadlock condition is triggered. It is sim= ply that we are waiting to obtain a lock for longer that the value of deadl= ock_timeout. Any way it could be interpreted as a performance problem when = we are frequently waiting more than 1 sec for a lock=E2=80=A6=

=C2=A0=

Regards

=C2=A0

Michel SALAIS

De=C2=A0: Edwin = UY <edwin.uy@gma= il.com>
Envoy=C3=A9=C2=A0: lundi 11 novembre 2024 07:45=C3=80=C2=A0: Tom Lane <tgl@sss.pgh.pa.us>
Cc=C2=A0: David G. John= ston <da= vid.g.johnston@gmail.com>; pgsql-admin@lists.postgresql.org
Obj= et=C2=A0: Re: How to check if session is a hung thread/session

= =C2=A0

Thanks T= om,

=C2=A0=

OK, I've decided to 'painfull= y' look at the PostgreSQL RDS logs and it is showing something like bel= ow.

There seems to be a= locking/deadlock issue of some sort somewhere.

I have checked the other days' log prior to th= e patching and these appear to be a 'normal' occurrence for this da= tabase and it wasn't affecting the application nonetheless.

After the patching, it starts affec= ting the application. Not sure what else I can check on the Aurora PostgreS= QL RDS end. I may request them to restart the app server.

=

=C2=A0

[25751= ]:LOG: =C2=A0process 25751 still waiting for ShareLock on transaction 11495= 3443 after 1000.054 ms

[25751]:DETAIL: = =C2=A0Process holding the lock: 22297. Wait queue: 25751.<= /u>

[25751]:CONTEXT: =C2=A0while locking tuple (1,17) in rela= tion "[table_name]"

[25751]:STATEME= NT:=C2=A0 [SQL_STATEMENT] for update
[25751]:LOG: =C2=A0process 25751 ac= quired ShareLock on transaction 114953443 after 4756.967 ms
[25751]:CONT= EXT: =C2=A0while locking tuple (1,17) in relation "
[table_name] "

<= p class=3D"MsoNormal">=C2=A0

=C2=A0

On Mon, Nov = 11, 2024 at 5:45=E2=80=AFPM Tom Lane <tgl@sss.pgh.pa.us> wrote:

<= p class=3D"MsoNormal">Edwin UY <edwin.uy@gmail.com> writes:
> I thought it cou= ld be the backend has sent something back to the client but
> it neve= r received it and it just kept on doing the same at some intervals.

= Your pg_stat_activity output shows the backend is idle, meaning it'swaiting for a client command.=C2=A0 While the session has been around for<= br>days, we can see the last client command was about 47 minutes ago,
fr= om your "now() - pg_stat_activity.query_start AS duration" column= .
I see no reason to think there is anything interesting here at all,except for a client that is sitting doing nothing for long periods.
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 regards, tom lane

--0000000000006373170626a8c188--