Message-ID: <4ff9578d-1de2-45c1-98c4-29caf99334ff@vondra.me>
Date: Thu, 11 Dec 2025 13:46:54 +0100
Subject: Re: failed NUMA pages inquiry status: Operation not permitted
From: Tomas Vondra
To: Christoph Berg
Cc: Jakub Wartak , pgsql-hackers@lists.postgresql.org
References: <7bbc582b-cc70-4a6f-bbf2-b5fd9b13a867@vondra.me> <54329add-59b6-4c08-96f0-a025a7804174@vondra.me>

On 12/11/25 13:29, Christoph Berg wrote:
> Re: Tomas Vondra
>>>> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
>>>> attached patch. It still calls numa_available(), so that we don't
>>>> silently miss future libnuma changes.
>>>>
>>>> Can you check this makes it work inside the docker container?
>>>
>>> Yes your patch works. (Sorry I meant to test earlier, but RL...)
>>
>> Thanks. I've pushed the fix (and backpatched to 18).
>
> It looks like we are not done here yet :(
>
> postgresql-18 is failing here intermittently with this diff:
>
> 12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out 2025-11-10 21:52:06.000000000 +0000
> 12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out 2025-12-11 11:20:22.618989603 +0000
> 12:20:24 @@ -6,8 +6,4 @@
> 12:20:24  -- switch to superuser
> 12:20:24  \c -
> 12:20:24  SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
> 12:20:24 - ok
> 12:20:24 -----
> 12:20:24 - t
> 12:20:24 -(1 row)
> 12:20:24 -
> 12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>
> That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.
>
> I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
> questing/amd64, where libnuma should take care of this itself, without
> the extra patch in PG. There was another case on bullseye/amd64 which
> has the old libnuma.
>
> It's been frequent enough so it killed 4 out of the 10 builds
> currently visible on
> https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
> (Though to be fair, only one distribution/arch combination was failing
> for each of them.)
>
> There is also one instance of it in
> https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/
>
> I currently have no idea what's happening.
>

Hmmm, strange. -2 is ENOENT, which should mean this:

    -ENOENT
        The page is not present.

But what does "not present" mean in this context? And why would that be
only intermittent? Presumably this is still running in Docker, so maybe
it's another weird consequence of that?


regards

-- 
Tomas Vondra
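
To make the status interpretation above concrete, here is a minimal sketch of how one entry of the status array filled in by move_pages(2) could be classified. It is not the PostgreSQL code and not a proposed fix. The enum, the function name classify_page_status() and the max_node parameter are hypothetical names used only for illustration. The only thing it relies on is the documented convention that a non-negative status is a NUMA node id and a negative status is a negated errno value.

```c
#include <errno.h>

/*
 * Hypothetical classification of a single status value, following the
 * move_pages(2) convention: >= 0 is a node id, < 0 is -errno.
 */
typedef enum
{
    PAGE_ON_NODE,        /* status is a node id within [0, max_node] */
    PAGE_OUT_OF_RANGE,   /* status is non-negative but above max_node */
    PAGE_NOT_PRESENT,    /* status is -ENOENT ("The page is not present") */
    PAGE_OTHER_ERROR     /* any other negated errno, e.g. -EFAULT */
} page_status_kind;

static page_status_kind
classify_page_status(int status, int max_node)
{
    if (status >= 0)
        return (status <= max_node) ? PAGE_ON_NODE : PAGE_OUT_OF_RANGE;

    /* Negative values are errno codes, not node ids. */
    if (status == -ENOENT)
        return PAGE_NOT_PRESENT;

    return PAGE_OTHER_ERROR;
}
```

With max_node = 0, as in the reported range [0, 0], a status of -2 falls into PAGE_NOT_PRESENT rather than being an out-of-range node id. Whether such pages should be skipped, reported separately, or still treated as an error is a separate design question that this sketch does not answer.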