From: Tomas Vondra <[email protected]>
To: Christoph Berg <[email protected]>
Cc: Jakub Wartak <[email protected]>
Cc: [email protected]
Subject: Re: failed NUMA pages inquiry status: Operation not permitted
Date: Thu, 11 Dec 2025 13:46:54 +0100
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
On 12/11/25 13:29, Christoph Berg wrote:
> Re: Tomas Vondra
>>>> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
>>>> attached patch. It still calls numa_available(), so that we don't
>>>> silently miss future libnuma changes.
>>>>
>>>> Can you check this makes it work inside the docker container?
>>>
>>> Yes your patch works. (Sorry I meant to test earlier, but RL...)
>>
>> Thanks. I've pushed the fix (and backpatched to 18).
>
> It looks like we are not done here yet :(
>
> postgresql-18 is failing here intermittently with this diff:
>
> 12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out 2025-11-10 21:52:06.000000000 +0000
> 12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out 2025-12-11 11:20:22.618989603 +0000
> 12:20:24 @@ -6,8 +6,4 @@
> 12:20:24 -- switch to superuser
> 12:20:24 \c -
> 12:20:24 SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
> 12:20:24 - ok
> 12:20:24 -----
> 12:20:24 - t
> 12:20:24 -(1 row)
> 12:20:24 -
> 12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>
> That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.
>
> I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
> questing/amd64, where libnuma should take care of this itself, without
> the extra patch in PG. There was another case on bullseye/amd64 which
> has the old libnuma.
>
> It's been frequent enough so it killed 4 out of the 10 builds
> currently visible on
> https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
> (Though to be fair, only one distribution/arch combination was failing
> for each of them.)
>
> There is also one instance of it in
> https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/
>
> I currently have no idea what's happening.
>
Hmmm, strange. -2 is ENOENT, which should mean this:

    -ENOENT
        The page is not present.

But what does "not present" mean in this context? And why would that be
only intermittent? Presumably this is still running in Docker, so maybe
it's another weird consequence of that?
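
To make the status decoding concrete, here is a minimal sketch of how one entry of the status array that move_pages(2) fills in query mode could be interpreted: a value in the allowed range is a node id, and a negative value is a negated errno. The helper name classify_status and its max_node parameter are hypothetical and only for illustration; this is not the PostgreSQL code, which (judging by the error message above) simply rejects anything outside [0, max_node].

```python
import errno
import os


def classify_status(status: int, max_node: int) -> str:
    """Interpret one per-page status value (hypothetical helper).

    Assumption: values in [0, max_node] are NUMA node ids and
    negative values are negated errno codes, e.g. -2 for ENOENT.
    """
    if 0 <= status <= max_node:
        return f"node {status}"
    if status < 0:
        code = -status
        # errno.errorcode maps numeric codes to symbolic names
        name = errno.errorcode.get(code, "unknown errno")
        return f"error {name}: {os.strerror(code)}"
    return f"node id {status} outside [0, {max_node}]"


if __name__ == "__main__":
    # The value from the failing regression test, on a single-node system
    print(classify_status(-2, 0))
    print(classify_status(0, 0))
```

With max_node = 0, the -2 from the failing test falls in the negative branch and decodes to ENOENT rather than to a node id, which is why a plain range check reports it as an invalid node.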
regards
--
Tomas Vondra