Date: Thu, 11 Dec 2025 13:29:14 +0100
From: Christoph Berg
To: Tomas Vondra
Cc: Jakub Wartak, pgsql-hackers@lists.postgresql.org
Subject: Re: failed NUMA pages inquiry status: Operation not permitted
In-Reply-To: <54329add-59b6-4c08-96f0-a025a7804174@vondra.me>

Re: Tomas Vondra
> >> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
> >> attached patch. It still calls numa_available(), so that we don't
> >> silently miss future libnuma changes.
> >>
> >> Can you check this makes it work inside the docker container?
> >
> > Yes your patch works. (Sorry I meant to test earlier, but RL...)
>
> Thanks. I've pushed the fix (and backpatched to 18).

It looks like we are not done here yet :(

postgresql-18 is failing here intermittently with this diff:

12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out 2025-11-10 21:52:06.000000000 +0000
12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out 2025-12-11 11:20:22.618989603 +0000
12:20:24 @@ -6,8 +6,4 @@
12:20:24  -- switch to superuser
12:20:24  \c -
12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
12:20:24 - ok
12:20:24 -----
12:20:24 - t
12:20:24 -(1 row)
12:20:24 -
12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2

That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.

I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
questing/amd64, where libnuma should take care of this itself, without
the extra patch in PG. There was another case on bullseye/amd64 which
has the old libnuma.

It's been frequent enough so it killed 4 out of the 10 builds
currently visible on
https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
(Though to be fair, only one distribution/arch combination was failing
for each of them.)

There is also one instance of it in
https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/

I currently have no idea what's happening.

Christoph