public inbox for [email protected]  
From: Maxim Boguk <[email protected]>
To: [email protected]
To: [email protected]
Subject: Re: BUG #19505: Some weird spikes postgresql processes in database (up to 200k sometime) without apparent reasons.
Date: Mon, 22 Jun 2026 23:22:53 +0300
Message-ID: <CAK-MWwRVb7Lz14uJNeiggM8O15Y=QLRny9evxec2Pquu5+DwBg@mail.gmail.com> (raw)
In-Reply-To: <CAK-MWwR6vm=0yg9wTovhsTUcSRnHJrBZ2X-d-KHBtyRM0j=qaw@mail.gmail.com>
References: <[email protected]>
	<CAK-MWwR6vm=0yg9wTovhsTUcSRnHJrBZ2X-d-KHBtyRM0j=qaw@mail.gmail.com>

On Tue, Jun 2, 2026 at 9:51 PM Maxim Boguk <[email protected]> wrote:

>
>
> On Tue, Jun 2, 2026 at 9:37 PM PG Bug reporting form <
> [email protected]> wrote:
>
>> The following bug has been logged on the website:
>>
>> Bug reference:      19505
>> Logged by:          Maxim Boguk
>> Email address:      [email protected]
>> PostgreSQL version: 18.4
>> Operating system:   Ubuntu 24.04.4 LTS
>> Description:
>>
>> I started investigating this issue after finding that the PostgreSQL process
>> count on my replica sometimes jumps to 200k+ (with max_connections=1000
>> and fewer than 100 real connections most of the time).
>> Somehow a single query (seemingly random, but always heavy/analytical) spawns
>> thousands of threads and tens of thousands of parallel workers.
>>
>> After some logging I caught one snapshot (ps -u postgres -L -o
>> pid,tid,ppid,lstart,args -ww 2 ) with 39257 processes:
>>
>> [postgres@db ~/tmp]$ zcat ps-L-2026-06-02_17-40-22.gz | wc -l
>> 39257
>>
>> Main content is:
>> PID     TID      PPID   StartTime                command
>> 2158552 2158552  948705 Tue Jun  2 17:40:17 2026 postgres: 18/main: background_shared db [local] SELECT
>>
>> Then:
>> The same PID but 1620 different TIDS.
>> PID     TID      PPID   StartTime                command
>> #main process
>> 2158557 2158557  948705 Tue Jun  2 17:40:18 2026 postgres: 18/main: background_shared db [local] SELECT
>> #1620 threads
>> 2158557 2158607  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: background_shared db [local] SELECT
>> 2158557 2158608  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: background_shared db [local] SELECT
>> 2158557 2158609  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: background_shared db [local] SELECT
>>
>> Then, 37571 rows!!! of:
>> PID     TID      PPID   StartTime                command
>> 2158579 2159176  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>> 2158579 2159179  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>> 2158579 2159183  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>> 2158579 2159196  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>> 2158579 2159198  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>> 2158579 2159202  948705 Tue Jun  2 17:40:20 2026 postgres: 18/main: parallel worker for PID 2158557
>>
>> I double-checked the query (it was logged in the database log): when run
>> manually, it uses 6 worker processes and completes without any issues.
>>
>> Related db configuration:
>> max_connections = 1000
>> max_worker_processes = 128              # (change requires restart)
>> max_parallel_workers_per_gather = 16    # limited by max_parallel_workers
>> max_parallel_workers = 64
>> io_method = io_uring                    # worker, io_uring, sync
>> io_max_concurrency = -1         # Max number of IOs that one process
>> jit = on (usual suspect in case of weird things going on)
>>
>> Given that this happens roughly 1-10 times per hour (and leads to short
>> load average spikes of up to 10000), it seriously affects the replica's
>> performance.
>>
>> No external/non-standard/C extensions other than pgq and postgis are loaded
>> into the database.
>>
>> I can look for any additional information and perform any local research,
>> but currently I'm out of ideas about what my next steps should be.
>>
>> PS: it seems that the issue can be triggered by different queries, not
>> by one particular query.
>
>

Update: the issue was triggered by an unconstrained spawn of helper threads
with io_method=io_uring (thousands to tens of thousands of "iou-wrk-****"
helper threads per bitmap scan).
Switching to io_method=worker fixed the problem.
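
For anyone who wants to check a live backend for the same symptom, here is a
minimal sketch that counts threads by name for one process by reading
/proc/<pid>/task/*/comm on Linux. The function names and the proc_root
parameter are hypothetical helpers introduced only for this illustration, and
the "iou-wrk" prefix is taken from the thread names mentioned above.

```python
import os
from collections import Counter


def count_thread_names(pid, proc_root="/proc"):
    """Return a Counter mapping thread name (comm) to thread count for one PID."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    names = Counter()
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as comm_file:
                names[comm_file.read().strip()] += 1
        except OSError:
            # The thread exited between listdir() and open(); skip it.
            continue
    return names


def count_io_uring_workers(pid, proc_root="/proc"):
    """Count threads of one PID whose name starts with "iou-wrk"."""
    names = count_thread_names(pid, proc_root)
    return sum(n for name, n in names.items() if name.startswith("iou-wrk"))
```

Calling count_io_uring_workers() with a backend PID while a heavy query is
running should show whether the helper thread count is growing without bound.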

It seems io_uring has some unexpected issues with unconstrained thread spawning.
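
A saved snapshot like the one in the original report can also be summarised
per PID. The sketch below assumes the gzipped output of
"ps -u postgres -L -o pid,tid,ppid,lstart,args -ww", one thread per row; the
function names are hypothetical and only illustrate the counting.

```python
import gzip
from collections import Counter


def threads_per_pid(lines):
    """Count ps -L rows (one row per thread) for each PID."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Data rows start with a numeric PID and TID; the header and any
        # wrapped continuation lines do not, so they are skipped.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            counts[int(fields[0])] += 1
    return counts


def busiest_pids(path, top=5):
    """Return the PIDs with the most threads in a gzipped ps -L snapshot."""
    with gzip.open(path, "rt") as snapshot:
        return threads_per_pid(snapshot).most_common(top)
```

Running busiest_pids() on a snapshot file lists the processes that own the
most threads, which makes it easy to see whether a single backend or its
parallel workers account for the spike.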


-- 
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

