From: surya poondla <[email protected]>
To: Laurenz Albe <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Pierre Forstmann <[email protected]>
Cc: Si, Evan <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: BUG #19369: Not documented that io_uring on kernel versions between 5.1 and below 5.6 does not work
Date: Thu, 15 Jan 2026 12:27:23 -0800
Message-ID: <CAOVWO5q5hoQvLM_xY-yGDyn7kv-s8Dg7uTcTfqqqPHVjFYKJ1Q@mail.gmail.com>
In-Reply-To: <CAOVWO5ovWe76XqJozO7rbOGXWLx_2VMBC=dnArZb4T-o9B6OrA@mail.gmail.com>
References: <[email protected]>
<CAOVWO5ozUNcLLB=sXdjqyd54SYL8W-u636XYHzpUROx-p8rtFA@mail.gmail.com>
<CAOVWO5qufByqYyUS87X80-T=XDdMTaBNjqc9LNrXcCqYc8A5rQ@mail.gmail.com>
<[email protected]>
<rjo26nijemix22gllbpknliioyndrcu2mg3ojdduy5tencve5w@5hmjportc75t>
<CAOVWO5p5z10mbsjyzWfv0m4gdGcAjtnat991qLWKYRJEc5sAew@mail.gmail.com>
<[email protected]>
<hg7mmunuylerqg2zeqvjufu5iatcq5unvuxany6lknrtxprr3v@6wa2mpid3oww>
<[email protected]>
<CAOVWO5ovWe76XqJozO7rbOGXWLx_2VMBC=dnArZb4T-o9B6OrA@mail.gmail.com>
>
> Sure Andres, I am working on a patch which emits a useful error message
> too.
>
Hi All,
I prepared a patch that detects unsupported io_uring operations early,
during PostgreSQL startup, before any I/O is attempted.
The patch focuses on:
1. A clear error message at startup instead of a cryptic EINVAL during
queries.
2. Immediate failure with actionable hints (upgrade the kernel or use
io_method=worker).
3. Preventing PostgreSQL from starting in a broken state.
Before the patch:
1. PostgreSQL started successfully
2. Connection attempts failed with EINVAL errors
After the patch:
1. PostgreSQL refuses to start
2. Clear error message that looks like:
"FATAL: kernel does not support required io_uring operations"
"DETAIL: The kernel supports io_uring but lacks one or more of the
required opcodes (IORING_OP_READ, IORING_OP_WRITE, IORING_OP_READV,
IORING_OP_WRITEV). This typically occurs on Linux kernels older than 5.6."
"HINT: Either upgrade your kernel to version 5.6 or newer, or
use io_method=worker"
Modified files in the patch:
1. configure.ac: Added io_uring_opcode_supported to AC_CHECK_FUNCS.
2. meson.build: Added the corresponding function check for the Meson build.
3. src/backend/storage/aio/method_io_uring.c: Added
is_uring_read_write_unsupported(), which is called from
pgaio_uring_shmem_init() and reports a clear error with details and hints.
4. src/backend/storage/aio/README.md and
src/backend/utils/misc/postgresql.conf.sample: State Linux 5.6+ as the
minimum kernel for io_uring.
I tested the patch on an Ubuntu server with Linux kernel 5.4.0-216-generic.
With io_uring enabled, postgres does not start, which is the expected
behavior.
The existing error handling for kernels < 5.1 (ENOSYS) is preserved.
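For illustration, the version-based fallback that the patch uses when
io_uring_opcode_supported() is not available can be sketched as a standalone
helper. This is a minimal sketch, not the patch code itself: the function
names release_in_broken_uring_range() and
running_kernel_in_broken_uring_range() are hypothetical, and the 5.1 to 5.5
range is the one discussed in this thread. Vendor backports can make a
release string misleading, so this is only a fallback heuristic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper (not part of the patch): returns true when a kernel
 * release string such as "5.4.0-216-generic" falls in the 5.1..5.5 range,
 * where io_uring exists but IORING_OP_READ/IORING_OP_WRITE do not.
 */
bool
release_in_broken_uring_range(const char *release)
{
	int			major = 0;
	int			minor = 0;

	/* An unparseable release string is treated as "not known to be broken". */
	if (release == NULL || sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;

	return major == 5 && minor >= 1 && minor <= 5;
}

/*
 * Hypothetical wrapper: applies the check above to the running kernel.
 * Returns false if uname() fails, leaving error reporting to later code.
 */
bool
running_kernel_in_broken_uring_range(void)
{
	struct utsname uts;

	if (uname(&uts) != 0)
		return false;

	return release_in_broken_uring_range(uts.release);
}
```

Kernels older than 5.1 deliberately return false here, since that case is
already reported through the existing ENOSYS path.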
Regards,
Surya Poondla
Attachments:
[application/octet-stream] v2-0001-Document-correct-kernel-requirements-for-io_uring.patch (7.6K, 3-v2-0001-Document-correct-kernel-requirements-for-io_uring.patch)
From c4342f7117df0c7ab420155d8e2726a5f0a8dd6e Mon Sep 17 00:00:00 2001
From: spoondla <[email protected]>
Date: Mon, 12 Jan 2026 15:29:48 -0800
Subject: [PATCH v2] Document correct kernel requirements for io_uring Add
startup-time kernel version check for io_uring
While io_uring was introduced in Linux 5.1, PostgreSQL requires kernel
version 5.6 or newer due to the io_uring operations it relies on.
Earlier kernels may appear to support io_uring but can fail at runtime.
Updated the internal AIO documentation and the sample configuration file
to state the correct minimum kernel requirement.
---
configure.ac | 2 +-
meson.build | 5 +
src/backend/storage/aio/README.md | 8 +-
src/backend/storage/aio/method_io_uring.c | 108 ++++++++++++++++++
src/backend/utils/misc/postgresql.conf.sample | 2 +-
5 files changed, 122 insertions(+), 3 deletions(-)
diff --git a/configure.ac b/configure.ac
index 145197e6bd6..f056966c25b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1417,7 +1417,7 @@ fi
if test "$with_liburing" = yes; then
_LIBS="$LIBS"
LIBS="$LIBURING_LIBS $LIBS"
- AC_CHECK_FUNCS([io_uring_queue_init_mem])
+ AC_CHECK_FUNCS([io_uring_queue_init_mem io_uring_opcode_supported])
LIBS="$_LIBS"
fi
diff --git a/meson.build b/meson.build
index 555c94796c6..c4a5271a18b 100644
--- a/meson.build
+++ b/meson.build
@@ -1040,6 +1040,11 @@ if liburing.found()
cdata.set('HAVE_IO_URING_QUEUE_INIT_MEM', 1)
endif
+ if cc.has_function('io_uring_opcode_supported',
+ dependencies: liburing, args: test_c_args)
+ cdata.set('HAVE_IO_URING_OPCODE_SUPPORTED', 1)
+ endif
+
endif
diff --git a/src/backend/storage/aio/README.md b/src/backend/storage/aio/README.md
index 72ae3b3737d..c40a6ce16cf 100644
--- a/src/backend/storage/aio/README.md
+++ b/src/backend/storage/aio/README.md
@@ -256,10 +256,16 @@ synchronous manner.
#### io_uring
-`io_method=io_uring` is available on Linux 5.1+. In contrast to worker mode it
+`io_method=io_uring` is available on Linux 5.6+. In contrast to worker mode it
dispatches all IO from within the process, lowering context switch rate /
latency.
+While io_uring was introduced in Linux kernel 5.1, the operations required by
+PostgreSQL (IORING_OP_READ and IORING_OP_WRITE opcodes for non-vectored I/O)
+are only available starting with Linux kernel 5.6. Attempting to use io_uring
+on kernels between 5.1 and 5.5 will result in runtime errors (EINVAL) when
+connections are established.
+
### AIO Handles
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index af58c6118ac..2853c92eb15 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -30,6 +30,7 @@
#ifdef IOMETHOD_IO_URING_ENABLED
#include <sys/mman.h>
+#include <sys/utsname.h>
#include <unistd.h>
#include <liburing.h>
@@ -225,6 +226,96 @@ pgaio_uring_check_capabilities(void)
pgaio_uring_caps.checked = true;
}
+/*
+ * Check if the kernel supports the required io_uring operations.
+ *
+ * PostgreSQL requires four io_uring opcodes:
+ * - IORING_OP_READ and IORING_OP_WRITE (added in kernel 5.6)
+ * - IORING_OP_READV and IORING_OP_WRITEV (added in kernel 5.1)
+ *
+ * While io_uring was introduced in Linux 5.1 with vectored operations,
+ * the non-vectored READ/WRITE opcodes weren't added until 5.6. Since
+ * PostgreSQL uses all four, we need kernel 5.6+.
+ *
+ * Rather than checking kernel version (which is unreliable due to vendor
+ * backports), we probe for actual opcode support when possible.
+ *
+ * Returns true if any required opcode is NOT supported.
+ */
+static bool
+is_uring_read_write_unsupported(void)
+{
+ struct io_uring test_ring;
+ struct io_uring_params p = {0};
+ int ret;
+ bool unsupported = false;
+
+ /* Create a temporary ring to probe capabilities */
+ ret = io_uring_queue_init(2, &test_ring, 0);
+ if (ret < 0)
+ {
+ /*
+ * If we can't even create a ring, let the normal initialization path
+ * handle the error with appropriate messages.
+ */
+ return false;
+ }
+
+#ifdef HAVE_IO_URING_OPCODE_SUPPORTED
+ /*
+ * Use io_uring_opcode_supported() if available (liburing 2.1+).
+ * This directly queries the kernel for opcode support.
+ *
+ * PostgreSQL uses both single-buffer (READ/WRITE) and vectored
+ * (READV/WRITEV) operations. READV/WRITEV were added in kernel 5.1,
+ * but READ/WRITE were added in kernel 5.6. Check for all four to
+ * ensure complete support.
+ */
+ if (!io_uring_opcode_supported(&test_ring, IORING_OP_READ) ||
+ !io_uring_opcode_supported(&test_ring, IORING_OP_WRITE) ||
+ !io_uring_opcode_supported(&test_ring, IORING_OP_READV) ||
+ !io_uring_opcode_supported(&test_ring, IORING_OP_WRITEV))
+ {
+ unsupported = true;
+ }
+#else
+ /*
+ * Fallback for liburing without io_uring_opcode_supported(): compare
+ * the running kernel's release against the range known to lack
+ * IORING_OP_READ/IORING_OP_WRITE. Less reliable than probing opcodes.
+ */
+ {
+ struct io_uring_sqe *sqe;
+
+ sqe = io_uring_get_sqe(&test_ring);
+ if (sqe)
+ {
+ /*
+ * Prepare a dummy read operation. On kernels without
+ * IORING_OP_READ support, this will be accepted here but fail
+ * with EINVAL when submitted. We'd need to actually submit to
+ * detect, but that requires a valid fd. The version check is a
+ * reasonable fallback.
+ */
+ struct utsname uts;
+ int major,
+ minor;
+
+ if (uname(&uts) == 0 &&
+ sscanf(uts.release, "%d.%d", &major, &minor) == 2)
+ {
+ /* Known problematic kernel range */
+ if (major == 5 && minor >= 1 && minor <= 5)
+ unsupported = true;
+ }
+ }
+ }
+#endif
+
+ io_uring_queue_exit(&test_ring);
+ return unsupported;
+}
+
/*
* Memory for all PgAioUringContext instances
*/
@@ -284,6 +375,23 @@ pgaio_uring_shmem_init(bool first_time)
size_t ring_mem_remain = 0;
char *ring_mem_next = 0;
+ /*
+ * Check if the kernel supports the required io_uring operations before
+ * attempting full initialization. Kernels without all required opcodes
+ * (IORING_OP_READ, WRITE, READV, WRITEV) will cause runtime EINVAL errors.
+ */
+ if (is_uring_read_write_unsupported())
+ {
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("kernel does not support required io_uring operations"),
+ errdetail("The kernel supports io_uring but lacks one or more of the "
+ "required opcodes (IORING_OP_READ, IORING_OP_WRITE, "
+ "IORING_OP_READV, IORING_OP_WRITEV). "
+ "This typically occurs on Linux kernels older than 5.6."),
+ errhint("Either upgrade your kernel to version 5.6 or newer, or use io_method=worker."));
+ }
+
/*
* We allocate memory for all PgAioUringContext instances and, if
* supported, the memory required for each of the io_uring instances, in
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..1648f4be207 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -204,7 +204,7 @@
# (change requires restart)
#io_combine_limit = 128kB # usually 1-128 blocks (depends on OS)
-#io_method = worker # worker, io_uring, sync
+#io_method = worker # worker, io_uring (Linux 5.6+), sync
# (change requires restart)
#io_max_concurrency = -1 # Max number of IOs that one process
# can execute simultaneously
--
2.39.5 (Apple Git-154)