Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tAKkf-00F7l4-FN for pgsql-admin@arkaria.postgresql.org; Mon, 11 Nov 2024 03:06:04 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tAKkb-00AhOH-Cb for pgsql-admin@arkaria.postgresql.org; Mon, 11 Nov 2024 03:06:01 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tAKka-00AhO9-RK for pgsql-admin@lists.postgresql.org; Mon, 11 Nov 2024 03:06:01 +0000 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1tAKkY-001E28-Nv for pgsql-admin@lists.postgresql.org; Mon, 11 Nov 2024 03:06:00 +0000 Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-20c693b68f5so40839375ad.1 for ; Sun, 10 Nov 2024 19:05:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731294358; x=1731899158; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=vE8cAwMOtOe5fdQIT9EXJvk5J7f/L6YIZ6eE9j+OCyg=; b=WOZT6KJDloEkR8OadniDulaakxXnIrxFiL/XgF81GCXg8OkuO9f7TGFhx1SfLl5q4l iOxDFpsBppTeXBoA1IN7OlcqNyhjPF2yQG0BHtsGi8rG4ub532v/jJXkq0mYeBpncL1w 9lf8zlOyfWlBilssbYLRCYw6y8FMOQsjaHKAGR5GR23r+2lHpEwUrsuSmo+MB9wW41hI ptXA3Sep6BEz0ujenCcyaBlAZilq9MmYQALKdcPo+oroWjSLP36kNAwAUWERgPcyf4HI GDBhQ7pwNY5ys2287ZN6w+8JtgDsDqEp+4E1IVW/eWZTRsFCNOCf/fPn7Om1DKg/r9ap HQIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731294358; x=1731899158; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=vE8cAwMOtOe5fdQIT9EXJvk5J7f/L6YIZ6eE9j+OCyg=; b=RcaEtQVKgZiMS/jSJ8lF/c2EkoC1t46ke2vjMpUUYV7HexUVdYHzf73okF6fsDjskN gzXGb7JQ+sEdffsS1G54rydUNzyfomG4RUGputs2kf6mA0XkndnaspPeBe/W7YAAjrrU 20p45Yw4aOrKzKRoZ6Em6J4m6pEFNs7+Nrr18eQG3iCGdYJrAi3YpU666GnBnf27baF+ StnogQ7oIivZG/UHUrqQ58OGdVzrzp/xSS/OlcXxVGu+q4R0MK72TctlcxeHKJPmP9Ky oJ9DbF7dw2k94W2JPi1+nCYDLx8nEWLF9IqM/0wNLO5ML2FTHsEgg0Ux6gGoggN/BBwS gKKg== X-Gm-Message-State: AOJu0YyMUwtXVXXqoAuXS53BEL2tDyAZRfq3bcV4qYbJl89etHv/4Ya7 mbiVYnr7bcOdG90mKTYdWxb4I/cp8Eh1chNF3ZGpLzsPq47eP7oCikp+b/kMVdS1eJAugsRel5o cJHORYR+NQfzzTUBEMifpauNVTjg= X-Google-Smtp-Source: AGHT+IF207wrifLiGLKCWxtVcqeZD2n3z9apaFyfbs1IdrkfgVIkMxrRSVlBZ0Sc/PpqHdghFlDtbXvXX/8dKSDrdXQ= X-Received: by 2002:a17:903:1d1:b0:20c:f261:2516 with SMTP id d9443c01a7336-211834f3842mr157438355ad.8.1731294357553; Sun, 10 Nov 2024 19:05:57 -0800 (PST) MIME-Version: 1.0 References: <6af4a50d5a942117f2c76add61c37d2a55810a3f.camel@cybertec.at> In-Reply-To: <6af4a50d5a942117f2c76add61c37d2a55810a3f.camel@cybertec.at> From: Edwin UY Date: Mon, 11 Nov 2024 16:05:21 +1300 Message-ID: Subject: Re: How to check if session is a hung thread/session To: Laurenz Albe Cc: "pgsql-admin@lists.postgresql.org" Content-Type: multipart/alternative; boundary="0000000000002f0bac06269a617d" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000002f0bac06269a617d Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi Laurenz Thanks for your reply. When you say "You should look at the "state_change" column for the session to see how long it has been idle." Do you mean compare the time between state_change whether they change or that? Will have to script that somehow. On Sun, Nov 10, 2024 at 6:29=E2=80=AFPM Laurenz Albe wrote: > On Sun, 2024-11-10 at 10:25 +1300, Edwin UY wrote: > > I am not sure how to explain this, but I believe this is something > usually referred to in the JDBC world as a hung thread. > > > > This is an Aurora RDS PostgreSQL database that was patched in a suppose= d > to be zero downtime kind of thing. > > After the patching, I have a user complaining that he has a job that > normally gets run in the OS cron session that has been failing. > > > > Checking pg_stat_activity > > > > select pid as process_id, > > usename as username, > > datname as database_name, > > client_addr as client_address, > > application_name, > > now() - pg_stat_activity.query_start AS duration, > > backend_start, > > backend_start at time zone 'Pacific/Auckland' as > backend_start_localtime, > > state, > > state_change, > > state_change at time zone 'Pacific/Auckland' as > state_change_localtime > > from pg_stat_activity > > where usename =3D '[blah]' > > order by username asc > > > > I have the output as below: > > > > process_id | username | database_name | client_address | > application_name | duration | backend_start | > backend_start_localtime | state | state_change | > state_change_localtime > > > ------------+-----------+---------------+----------------+---------------= ---------+-----------------+-------------------------------+---------------= -------------+-------+-------------------------------+---------------------= ------- > > 31947 | [blah] | [blah] | [blah] | PostgreSQL JDBC > Driver | 00:47:21.838892 | 2024-11-06 10:44:53.309388+00 | 2024-11-06 > 23:44:53.309388 | idle | 2024-11-08 02:00:06.005173+00 | 2024-11-08 > 15:00:06.005173 > > (1 row) > > > > From the pg_stat_output, the backend_start has not changed for several > days since the patching, so I am 'guessing' > > it may have gone lost/rogue already. Is there any way to check that thi= s > is the case before I kill it. > > The state_change date is getting updated though, I don't know if this i= s > proof that it is not a hung thread. > > You should look at the "state_change" column for the session to see how > long it has been idle. > > At any rate, an idle session is not hanging - at least not in the > database. Perhaps you got the > wrong session, or perhaps the client has got a problem. > > I'd say that killing the session won't resolve any hanging thread on the > client side. > Still, if you kill it, you'd at worst cause an error on the client side. > > Yours, > Laurenz Albe > --0000000000002f0bac06269a617d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Laurenz

Thanks for your reply.=C2=A0

When you say "Yo= u should look at the "state_change" column for the session to see= how long it has been idle."

Do you mean comp= are the time between state_change whether they change or that? Will have to= script that somehow.


On Sun, Nov 10, 2024 at 6:29= =E2=80=AFPM Laurenz Albe <la= urenz.albe@cybertec.at> wrote:
On Sun, 2024-11-10 at 10:25 +1300, Edwin UY wrote: > I am not sure how to explain this, but I believe this is something usu= ally referred to in the JDBC world as a hung thread.
>
> This is an Aurora RDS PostgreSQL database that was patched in a suppos= ed to be zero downtime kind of thing.
> After the patching, I have a user complaining that he has a job that n= ormally gets run in the OS cron session that has been failing.
>
> Checking pg_stat_activity
>
> select pid as process_id,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0usename as username,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0datname as database_name,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0client_addr as client_address,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0application_name,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0now() - pg_stat_activity.query_start AS dur= ation,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0backend_start,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0backend_start at time zone 'Pacific/Auc= kland' as backend_start_localtime,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0state,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0state_change,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0state_change at time zone 'Pacific/Auck= land' as state_change_localtime
> from pg_stat_activity
> where usename =3D '[blah]'
> order by username asc
>
> I have the output as below:
>
> =C2=A0process_id | username =C2=A0| database_name | client_address | = =C2=A0 =C2=A0application_name =C2=A0 =C2=A0| =C2=A0 =C2=A0duration =C2=A0 = =C2=A0 | =C2=A0 =C2=A0 =C2=A0 =C2=A0 backend_start =C2=A0 =C2=A0 =C2=A0 =C2= =A0 | =C2=A0backend_start_localtime =C2=A0 | state | =C2=A0 =C2=A0 =C2=A0 = =C2=A0 state_change =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| =C2=A0 state_change= _localtime
> ------------+-----------+---------------+----------------+------------= ------------+-----------------+-------------------------------+------------= ----------------+-------+-------------------------------+------------------= ----------
> =C2=A0 =C2=A0 =C2=A0 31947 | [blah] | [blah] =C2=A0 =C2=A0 =C2=A0 =C2= =A0 | [blah] =C2=A0=C2=A0 | PostgreSQL JDBC Driver | 00:47:21.838892 | 2024= -11-06 10:44:53.309388+00 | 2024-11-06 23:44:53.309388 | idle =C2=A0| 2024-= 11-08 02:00:06.005173+00 | 2024-11-08 15:00:06.005173
> (1 row)
>
> From the pg_stat_output, the backend_start has not changed for several= days since the patching, so I am 'guessing'
> it may have gone lost/rogue already. Is there any way to check that th= is is the case before I kill it.
> The state_change date is getting updated though, I don't know if t= his is proof that it is not a hung thread.

You should look at the "state_change" column for the session to s= ee how long it has been idle.

At any rate, an idle session is not hanging - at least not in the database.= =C2=A0 Perhaps you got the
wrong session, or perhaps the client has got a problem.

I'd say that killing the session won't resolve any hanging thread o= n the client side.
Still, if you kill it, you'd at worst cause an error on the client side= .

Yours,
Laurenz Albe
--0000000000002f0bac06269a617d--