From: surya poondla <[email protected]>
To: Laurenz Albe <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Pierre Forstmann <[email protected]>
Cc: Si, Evan <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: BUG #19369: Not documented that io_uring on kernel versions between 5.1 and below 5.6 does not work
Date: Thu, 15 Jan 2026 12:27:23 -0800
Message-ID: <CAOVWO5q5hoQvLM_xY-yGDyn7kv-s8Dg7uTcTfqqqPHVjFYKJ1Q@mail.gmail.com>
In-Reply-To: <CAOVWO5ovWe76XqJozO7rbOGXWLx_2VMBC=dnArZb4T-o9B6OrA@mail.gmail.com>
References: <[email protected]>
	<CAOVWO5ozUNcLLB=sXdjqyd54SYL8W-u636XYHzpUROx-p8rtFA@mail.gmail.com>
	<CAOVWO5qufByqYyUS87X80-T=XDdMTaBNjqc9LNrXcCqYc8A5rQ@mail.gmail.com>
	<[email protected]>
	<rjo26nijemix22gllbpknliioyndrcu2mg3ojdduy5tencve5w@5hmjportc75t>
	<CAOVWO5p5z10mbsjyzWfv0m4gdGcAjtnat991qLWKYRJEc5sAew@mail.gmail.com>
	<[email protected]>
	<hg7mmunuylerqg2zeqvjufu5iatcq5unvuxany6lknrtxprr3v@6wa2mpid3oww>
	<[email protected]>
	<CAOVWO5ovWe76XqJozO7rbOGXWLx_2VMBC=dnArZb4T-o9B6OrA@mail.gmail.com>

>
> Sure Andres, I am working on a patch which emits a useful error message
> too.
>


Hi All,


I prepared a patch that detects unsupported io_uring operations early,
during PostgreSQL startup and before any I/O is attempted.
The patch aims to:

  1. Report a clear error message at startup instead of a cryptic EINVAL
during queries.

  2. Fail immediately with actionable hints (upgrade the kernel or use
io_method=worker).

  3. Prevent PostgreSQL from starting in a broken state.


Before the patch:

  1. PostgreSQL started successfully

  2. Connection attempts failed with EINVAL errors


After the patch:

  1. PostgreSQL refuses to start

  2. Clear error message that looks like:

    "FATAL: kernel does not support required io_uring operations"
    "DETAIL:  The kernel supports io_uring but lacks one or more of the
required opcodes (IORING_OP_READ, IORING_OP_WRITE, IORING_OP_READV,
IORING_OP_WRITEV). This typically occurs on Linux kernels older than 5.6."
    "HINT: Either upgrade your kernel to version 5.6 or newer, or use
io_method=worker"


Files modified by the patch:

  1. configure.ac: Added io_uring_opcode_supported to AC_CHECK_FUNCS

  2. meson.build: Added the corresponding function check for the Meson build

  3. src/backend/storage/aio/method_io_uring.c: Added
is_uring_read_write_unsupported(), which is called from
pgaio_uring_shmem_init() and reports a clear error with details and hints.

  4. src/backend/storage/aio/README.md and
src/backend/utils/misc/postgresql.conf.sample: Updated to state Linux 5.6+
as the minimum kernel version for io_uring.


I tested the patch on an Ubuntu server running Linux kernel
5.4.0-216-generic. With io_uring enabled, PostgreSQL refuses to start,
which is the expected behavior.


The existing error handling for kernels < 5.1 (ENOSYS) is preserved.
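
To make the three kernel ranges discussed in this thread concrete, here is a
minimal sketch of how a release string maps onto them. The enum and the
function name are hypothetical and are not part of the patch; the patch
prefers probing for opcodes because vendor backports make version numbers
unreliable, so this only illustrates the version-based fallback.

```c
#include <stdio.h>

/* Hypothetical classification, for illustration only (not in the patch). */
typedef enum
{
	URING_KERNEL_UNKNOWN,		/* release string could not be parsed */
	URING_KERNEL_ABSENT,		/* older than 5.1: no io_uring (ENOSYS) */
	URING_KERNEL_NO_READ_WRITE, /* 5.1 to 5.5: no IORING_OP_READ/WRITE */
	URING_KERNEL_OK				/* 5.6 or newer */
} UringKernelClass;

/*
 * Classify a kernel release string such as the one uname(2) reports in
 * utsname.release, e.g. "5.4.0-216-generic". Only the leading
 * "major.minor" part is examined.
 */
static UringKernelClass
classify_kernel_release(const char *release)
{
	int			major;
	int			minor;

	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return URING_KERNEL_UNKNOWN;

	if (major < 5 || (major == 5 && minor < 1))
		return URING_KERNEL_ABSENT;

	if (major == 5 && minor <= 5)
		return URING_KERNEL_NO_READ_WRITE;

	return URING_KERNEL_OK;
}
```

With this sketch, the 5.4.0-216-generic kernel I tested on falls into the
middle class, which is the case the new error message targets.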


Regards,
Surya Poondla


Attachments:

  [application/octet-stream] v2-0001-Document-correct-kernel-requirements-for-io_uring.patch (7.6K, 3-v2-0001-Document-correct-kernel-requirements-for-io_uring.patch)
From c4342f7117df0c7ab420155d8e2726a5f0a8dd6e Mon Sep 17 00:00:00 2001
From: spoondla <[email protected]>
Date: Mon, 12 Jan 2026 15:29:48 -0800
Subject: [PATCH v2] Document correct kernel requirements for io_uring Add
 startup-time kernel version check for io_uring

While io_uring was introduced in Linux 5.1, PostgreSQL requires kernel
version 5.6 or newer due to the io_uring operations it relies on.
Earlier kernels may appear to support io_uring but can fail at runtime.

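Update the internal AIO documentation and the sample configuration file
to state the correct minimum kernel requirement, and add a startup-time
check that reports a clear error when the required io_uring opcodes are
missing.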
---
 configure.ac                                  |   2 +-
 meson.build                                   |   5 +
 src/backend/storage/aio/README.md             |   8 +-
 src/backend/storage/aio/method_io_uring.c     | 108 ++++++++++++++++++
 src/backend/utils/misc/postgresql.conf.sample |   2 +-
 5 files changed, 122 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index 145197e6bd6..f056966c25b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1417,7 +1417,7 @@ fi
 if test "$with_liburing" = yes; then
   _LIBS="$LIBS"
   LIBS="$LIBURING_LIBS $LIBS"
-  AC_CHECK_FUNCS([io_uring_queue_init_mem])
+  AC_CHECK_FUNCS([io_uring_queue_init_mem io_uring_opcode_supported])
   LIBS="$_LIBS"
 fi
 
diff --git a/meson.build b/meson.build
index 555c94796c6..c4a5271a18b 100644
--- a/meson.build
+++ b/meson.build
@@ -1040,6 +1040,11 @@ if liburing.found()
     cdata.set('HAVE_IO_URING_QUEUE_INIT_MEM', 1)
   endif
 
+  if cc.has_function('io_uring_opcode_supported',
+      dependencies: liburing, args: test_c_args)
+    cdata.set('HAVE_IO_URING_OPCODE_SUPPORTED', 1)
+  endif
+
 endif
 
 
diff --git a/src/backend/storage/aio/README.md b/src/backend/storage/aio/README.md
index 72ae3b3737d..c40a6ce16cf 100644
--- a/src/backend/storage/aio/README.md
+++ b/src/backend/storage/aio/README.md
@@ -256,10 +256,16 @@ synchronous manner.
 
 #### io_uring
 
-`io_method=io_uring` is available on Linux 5.1+. In contrast to worker mode it
+`io_method=io_uring` is available on Linux 5.6+. In contrast to worker mode it
 dispatches all IO from within the process, lowering context switch rate /
 latency.
 
+While io_uring was introduced in Linux kernel 5.1, the operations required by
+PostgreSQL (IORING_OP_READ and IORING_OP_WRITE opcodes for non-vectored I/O)
+are only available starting with Linux kernel 5.6. Attempting to use io_uring
+on kernels between 5.1 and 5.5 will result in runtime errors (EINVAL) when
+connections are established.
+
 
 ### AIO Handles
 
diff --git a/src/backend/storage/aio/method_io_uring.c b/src/backend/storage/aio/method_io_uring.c
index af58c6118ac..2853c92eb15 100644
--- a/src/backend/storage/aio/method_io_uring.c
+++ b/src/backend/storage/aio/method_io_uring.c
@@ -30,6 +30,7 @@
 #ifdef IOMETHOD_IO_URING_ENABLED
 
 #include <sys/mman.h>
+#include <sys/utsname.h>
 #include <unistd.h>
 
 #include <liburing.h>
@@ -225,6 +226,96 @@ pgaio_uring_check_capabilities(void)
 	pgaio_uring_caps.checked = true;
 }
 
+/*
+ * Check if the kernel supports the required io_uring operations.
+ *
+ * PostgreSQL requires four io_uring opcodes:
+ *   - IORING_OP_READ and IORING_OP_WRITE (added in kernel 5.6)
+ *   - IORING_OP_READV and IORING_OP_WRITEV (added in kernel 5.1)
+ *
+ * While io_uring was introduced in Linux 5.1 with vectored operations,
+ * the non-vectored READ/WRITE opcodes weren't added until 5.6. Since
+ * PostgreSQL uses all four, we need kernel 5.6+.
+ *
+ * Rather than checking kernel version (which is unreliable due to vendor
+ * backports), we probe for actual opcode support when possible.
+ *
+ * Returns true if any required opcode is NOT supported.
+ */
+static bool
+is_uring_read_write_unsupported(void)
+{
+	struct io_uring test_ring;
+	struct io_uring_params p = {0};
+	int			ret;
+	bool		unsupported = false;
+
+	/* Create a temporary ring to probe capabilities */
+	ret = io_uring_queue_init(2, &test_ring, 0);
+	if (ret < 0)
+	{
+		/*
+		 * If we can't even create a ring, let the normal initialization path
+		 * handle the error with appropriate messages.
+		 */
+		return false;
+	}
+
+#ifdef HAVE_IO_URING_OPCODE_SUPPORTED
+	/*
+	 * io_uring_opcode_supported() takes a probe rather than a ring, so fetch
+	 * one first. Probing relies on IORING_REGISTER_PROBE (kernel 5.6+), so
+	 * a NULL probe is treated as missing READ/WRITE support as well.
+	 */
+	{
+		struct io_uring_probe *probe = io_uring_get_probe_ring(&test_ring);
+
+		unsupported = probe == NULL ||
+			!io_uring_opcode_supported(probe, IORING_OP_READ) ||
+			!io_uring_opcode_supported(probe, IORING_OP_WRITE) ||
+			!io_uring_opcode_supported(probe, IORING_OP_READV) ||
+			!io_uring_opcode_supported(probe, IORING_OP_WRITEV);
+		if (probe != NULL)
+			io_uring_free_probe(probe);
+	}
+#else
+	/*
+	 * Fallback: Try to probe by checking if we can prepare read operations.
+	 * Kernels without IORING_OP_READ support will fail later, but at least
+	 * we tried. This is less reliable but works with older liburing.
+	 */
+	{
+		struct io_uring_sqe *sqe;
+
+		sqe = io_uring_get_sqe(&test_ring);
+		if (sqe)
+		{
+			/*
+			 * Prepare a dummy read operation. On kernels without
+			 * IORING_OP_READ support, this will be accepted here but fail
+			 * with EINVAL when submitted. We'd need to actually submit to
+			 * detect, but that requires a valid fd. The version check is a
+			 * reasonable fallback.
+			 */
+			struct utsname uts;
+			int			major,
+						minor;
+
+			if (uname(&uts) == 0 &&
+				sscanf(uts.release, "%d.%d", &major, &minor) == 2)
+			{
+				/* Known problematic kernel range */
+				if (major == 5 && minor >= 1 && minor <= 5)
+					unsupported = true;
+			}
+		}
+	}
+#endif
+
+	io_uring_queue_exit(&test_ring);
+	return unsupported;
+}
+
 /*
  * Memory for all PgAioUringContext instances
  */
@@ -284,6 +375,23 @@ pgaio_uring_shmem_init(bool first_time)
 	size_t		ring_mem_remain = 0;
 	char	   *ring_mem_next = 0;
 
+	/*
+	 * Check if the kernel supports the required io_uring operations before
+	 * attempting full initialization. Kernels without all required opcodes
+	 * (IORING_OP_READ, WRITE, READV, WRITEV) will cause runtime EINVAL errors.
+	 */
+	if (is_uring_read_write_unsupported())
+	{
+		ereport(ERROR,
+				errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				errmsg("kernel does not support required io_uring operations"),
+				errdetail("The kernel supports io_uring but lacks one or more of the "
+						  "required opcodes (IORING_OP_READ, IORING_OP_WRITE, "
+						  "IORING_OP_READV, IORING_OP_WRITEV). "
+						  "This typically occurs on Linux kernels older than 5.6."),
+				errhint("Either upgrade your kernel to version 5.6 or newer, or use io_method=worker."));
+	}
+
 	/*
 	 * We allocate memory for all PgAioUringContext instances and, if
 	 * supported, the memory required for each of the io_uring instances, in
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index dc9e2255f8a..1648f4be207 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -204,7 +204,7 @@
                                         # (change requires restart)
 #io_combine_limit = 128kB               # usually 1-128 blocks (depends on OS)
 
-#io_method = worker                     # worker, io_uring, sync
+#io_method = worker                     # worker, io_uring (Linux 5.6+), sync
                                         # (change requires restart)
 #io_max_concurrency = -1                # Max number of IOs that one process
                                         # can execute simultaneously
-- 
2.39.5 (Apple Git-154)
