From: Christoph Berg <[email protected]>
To: Tomas Vondra <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: [email protected]
Subject: Re: failed NUMA pages inquiry status: Operation not permitted
Date: Thu, 11 Dec 2025 13:29:14 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>

Re: Tomas Vondra
> >> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
> >> attached patch. It still calls numa_available(), so that we don't
> >> silently miss future libnuma changes.
> >>
> >> Can you check this makes it work inside the docker container?
> > 
> > Yes your patch works. (Sorry I meant to test earlier, but RL...)
> 
> Thanks. I've pushed the fix (and backpatched to 18).

It looks like we are not done here yet :(

postgresql-18 is failing here intermittently with this diff:

12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out	2025-11-10 21:52:06.000000000 +0000
12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out	2025-12-11 11:20:22.618989603 +0000
12:20:24 @@ -6,8 +6,4 @@
12:20:24  -- switch to superuser
12:20:24  \c -
12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
12:20:24 - ok
12:20:24 -----
12:20:24 - t
12:20:24 -(1 row)
12:20:24 -
12:20:24 +ERROR:  invalid NUMA node id outside of allowed range [0, 0]: -2

That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.

I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
questing/amd64, where libnuma should take care of this itself, without
the extra patch in PG. There was another case on bullseye/amd64 which
has the old libnuma.

It's been frequent enough that it killed 4 out of the 10 builds
currently visible on
https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
(Though to be fair, only one distribution/arch combination was failing
for each of them.)

There is also one instance of it in
https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/

I currently have no idea what's happening.
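
One observation on the error text, offered as a guess rather than a diagnosis:
the -2 has the shape of a negated errno, which is how move_pages(2) reports
per-page failures in its status array (errno 2 is ENOENT, documented there as
"the page is not present"). The sketch below decodes such a value. It is
illustrative only: describe_numa_status is a hypothetical helper, not PostgreSQL
code, and whether the -2 above really originates from that status array is an
assumption.

```python
import errno
import os


def describe_numa_status(status: int) -> str:
    """Describe one entry of a move_pages(2)-style status array.

    Hypothetical helper for illustration: non-negative values are taken
    to be NUMA node ids, negative values are taken to be negated errno codes.
    """
    if status >= 0:
        return f"node {status}"
    code = -status
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "unknown errno")
    return f"error {name} ({os.strerror(code)})"


if __name__ == "__main__":
    # The value from the regression diff above.
    print(describe_numa_status(-2))
```

If that reading is right, the view would be seeing a per-page error code where
it expects a node id, which would fit the "outside of allowed range [0, 0]"
wording. That is still speculation.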

Christoph




