From: Tomas Vondra
To: Christoph Berg
Cc: Jakub Wartak, pgsql-hackers@lists.postgresql.org
Subject: Re: failed NUMA pages inquiry status: Operation not permitted
Date: Tue, 16 Dec 2025 16:17:51 +0100
Message-ID: <183fe9ab-6010-4cca-b648-1deca332ce2a@vondra.me>
References: <7bbc582b-cc70-4a6f-bbf2-b5fd9b13a867@vondra.me> <54329add-59b6-4c08-96f0-a025a7804174@vondra.me> <4ff9578d-1de2-45c1-98c4-29caf99334ff@vondra.me>

On 12/16/25 15:48, Christoph Berg wrote:
> Re: To Tomas Vondra
>> I've managed to reproduce it once, running this loop on
>> 18-as-of-today.
>> It errored out after a few 100 iterations:
>>
>> while psql -c 'SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa'; do :; done
>>
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
>> 2025-12-16 11:49:35.982 UTC [621807] myon@postgres STATEMENT: SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa
>>
>> That was on the apt.pg.o amd64 build machine while a few things were
>> just building. Maybe ENOENT "The page is not present" means something
>> was just swapped out because the machine was under heavy load.
>
> I played a bit more with it.
>
> * It seems to trigger only once for a running cluster. The next one
>   needs a restart
> * If it doesn't trigger within the first 30s, it probably never will
> * It seems easier to trigger on a system that is under load (I started
>   a few pgmodeler compile runs in parallel (C++))
>
> But none of that answers the "why".
>

Hmmm, so this is interesting. I tried this on my workstation (with a
single NUMA node), and I see this:

1) right after opening a connection, I get this

  test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
   numa_node | count
  -----------+-------
           0 |   290
          -2 | 32478
  (2 rows)

2) but a select from pg_shmem_allocations_numa works fine

  test=# select numa_node, count(*) from pg_shmem_allocations_numa group by 1;
   numa_node | count
  -----------+-------
           0 |    72
  (1 row)

3) and if I repeat the pg_buffercache_numa query, it now works

  test=# select numa_node, count(*) from pg_buffercache_numa group by 1;
   numa_node | count
  -----------+-------
           0 | 32768
  (1 row)

That's a bit strange. I have no idea why this is happening. If I
reconnect, I start getting the failures again.


regards

-- 
Tomas Vondra
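
PS: A small sketch for reading the numbers in 1) to 3). It assumes the
negative numa_node values are the raw per-page status from move_pages(2),
that is a negated errno, so -2 would be -ENOENT ("The page is not
present", as quoted above). That mapping is an assumption, not something
confirmed in this thread, and the errno numbering is the Linux one. The
helper names describe_numa_status and summarize are made up for
illustration. Note that 290 + 32478 = 32768, the same total as in 3).

```python
import errno
import os


def describe_numa_status(value: int) -> str:
    """Label a numa_node value; negative values are assumed to be -errno."""
    if value >= 0:
        return f"node {value}"
    code = -value
    # errno.errorcode maps numbers to symbolic names on the local platform.
    name = errno.errorcode.get(code, "unknown errno")
    return f"-{name} ({os.strerror(code)})"


def summarize(rows):
    """rows: (numa_node, count) pairs as returned by the GROUP BY queries.

    Returns a dict mapping a readable label to the fraction of all pages.
    """
    rows = list(rows)
    total = sum(count for _, count in rows)
    return {describe_numa_status(node): count / total for node, count in rows}


if __name__ == "__main__":
    # Counts copied from step 1) above.
    first_query = [(0, 290), (-2, 32478)]
    for label, fraction in summarize(first_query).items():
        print(f"{label}: {fraction:.1%}")
```

If that reading is right, almost all buffers in 1) report "not present"
from the point of view of the freshly connected backend, and only a
small fraction resolve to node 0.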