From: Achilleas Mantzios
To: Tom Lane
Cc: pgsql-admin, itdev@gatewaynet.com
Date: Sun, 1 Jun 2025 08:32:06 +0300
Subject: Re: PostgreSQL 16.6 , query stuck with STAT Ssl, wait_event_type : IPC , wait_event : ParallelFinish
Message-ID: <6be8d715-a9a7-4be8-8ebc-8b6bdb98da2e@cloud.gatewaynet.com>
In-Reply-To: <3049794.1748751598@sss.pgh.pa.us>

On 1/6/25 07:19, Tom Lane wrote:
> Achilleas Mantzios <a.mantzios@cloud.gatewaynet.com> writes:
>>> a query is stuck with the above, it seems it waits for parallel worker
>>> to finish, however , there are no parallel works running :
> You didn't explain the subject about "STAT Ssl", but if you mean
> that that was what ps was showing for the backend process, there's
> something very wrong there.  According to "man ps", the "l" means
>
>                l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads
>                     do)

Yes, sorry, I didn't include this info. You are spot on: this is the output of ps aux.

> which is something that a Postgres backend should never be
> (in existing releases anyway).  So I'm speculating that
> the process somehow became multi-threaded and then some
> wakeup signal went to the wrong thread.
>
> We've had issues with perl or python introducing multi-threading
> because of plperl or plpython functions doing things they
> probably shouldn't.  Do you have any of those in your system?

Yes, we have only two Perl functions, which I'd be happy to get rid of:

postgres@[local]/dynacom=# select p.proname, l.lanname from pg_language l, pg_proc p where p.prolang=l.oid and l.lanname ~* '.*perl.*';
 proname  | lanname
----------+---------
 basename | plperlu
 filetype | plperlu
(2 rows)

Neither is used by the app; they are just two utility functions that help us batch-insert some attachments, guess the MIME type, etc. However, the calling client is written in Perl and based on libpg-perl (not DBI); it is basically a descendant of DBMirror.pl, which we are still using.

The strange thing is that we have been running pgsql 16.* since November and our own version of DBMirror since 2005 (and PostgreSQL since 2001), and we have never had this problem before (at least as far as I know).
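
As a rough illustration of the ps check discussed above, here is a minimal Python sketch that decodes a STAT value such as "Ssl" and pulls the thread count out of the text of /proc/<pid>/status. It assumes Linux; the flag table is transcribed from the procps "man ps" page, and the function names (decode_stat, is_multithreaded, thread_count) are hypothetical helpers made up for this example, not part of PostgreSQL or any tool mentioned in this thread.

```python
"""Sketch: decode a Linux ps STAT string and read a thread count."""

# Primary state letters, as listed in the Linux procps "man ps" page.
STATES = {
    "D": "uninterruptible sleep (usually IO)",
    "I": "idle kernel thread",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "T": "stopped by job control signal",
    "t": "stopped by debugger during tracing",
    "X": "dead",
    "Z": "defunct (zombie)",
}

# Modifier flags that may follow the state letter (BSD-style STAT column).
FLAGS = {
    "<": "high-priority",
    "N": "low-priority",
    "L": "has pages locked into memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "in the foreground process group",
}


def decode_stat(stat):
    """Return (state, [flag descriptions]) for a STAT value such as 'Ssl'."""
    state = STATES.get(stat[:1], "unknown")
    flags = [FLAGS.get(ch, "unknown flag " + repr(ch)) for ch in stat[1:]]
    return state, flags


def is_multithreaded(stat):
    """True if the modifier part of STAT carries the 'l' flag."""
    return "l" in stat[1:]


def thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status.

    Returns None if no such line is present.
    """
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    return None


if __name__ == "__main__":
    print(decode_stat("Ssl"))
    print(is_multithreaded("Ssl"), is_multithreaded("Ss"))
```

To try thread_count on a live backend, pass it the contents of /proc/<pid>/status for a pid taken from pg_stat_activity. Going by Tom's comment that a backend should never be multi-threaded, a healthy backend would be expected to report 1.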


>
> 			regards, tom lane