Subject: BUG #19373: One backend hanging in AioIoUringExecution blocking other backends
To: pgsql-bugs@lists.postgresql.org
From: PG Bug reporting form
Cc: michael.kroell@gmail.com
Reply-To: michael.kroell@gmail.com, pgsql-bugs@lists.postgresql.org
Date: Fri, 09 Jan 2026 10:14:28 +0000
Message-ID: <19373-aac0a0ee0aac6a8b@postgresql.org>

The following bug has been logged on the website:

Bug reference:      19373
Logged by:          Michael Kröll
Email address:      michael.kroell@gmail.com
PostgreSQL version: 18.1
Operating system:   Linux 6.1.0-41-amd64 #1 SMP PREEMPT_DYNAMIC Debian
Description:

We upgraded to Pg18 with ``io_method=io_uring`` early last December and things were running smoothly until early last Sunday, when one of our simple SELECT queries got stuck. This query is triggered a couple of thousand times a day and usually runs for only milliseconds. It hung for almost 24h without visible activity until I manually killed the backend (with ``kill -9``).

The query looked like this in the backend:

| pid     | leader_pid | state_change                  | wait_event_type | wait_event          | state  |
|---------|------------|-------------------------------|-----------------|---------------------|--------|
| 2034811 |            | 2026-01-04 07:18:27.158077+01 | IO              | AioIoUringExecution | active |
| 3497711 | 2034811    | 2026-01-04 07:18:27.182794+01 | IPC             | MessageQueueSend    | active |
| 3497712 | 2034811    | 2026-01-04 07:18:27.184025+01 | IPC             | MessageQueueSend    | active |

and the leader PID appeared to be waiting:

```bash
~ # strace -p 2034811
strace: Process 2034811 attached
io_uring_enter(20, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8

[root@host] 2026-01-05 07:58:11 ~ # ltrace -p 2034811
io_uring_wait_cqes(0x7f3af3ea9e10, 0x7fff2bb25e00, 1, 0
```

Even though there was a *global* statement_timeout=61s configured, backends accessing the same table were hanging with ``LWLock AioUringCompletion``.

Restarting the cluster did not go through until the hanging leader PID was ``SIGKILL``ed.

Nothing in the journal, the Pg log, or the ring buffer hinted at anything unusual around the query_start of the problematic backend.

Did anyone experience similar issues? Is that a kernel/io_uring issue or something which Pg should/could handle?

```
Pg 18.1 (Debian 18.1-1.pgdg12+2)
Linux 6.1.0-41-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.158-1 (2025-11-09) x86_64 GNU/Linux
```