Content-Type: multipart/mixed; boundary="------------k93Wv7BdXkBaEMlI9jjs835l"
Date: Tue, 24 Jun 2025 22:32:25 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
To: 
Christoph Berg
Cc: Bertrand Drouvot, Andres Freund, Tomas Vondra, pgsql-hackers@lists.postgresql.org
References: <6342f601-77de-4ee0-8c2a-3deb50ceac5b@vondra.me> <8649a4e3-c60d-4f37-aa6f-e7e7c14c581e@vondra.me> <8961c087-e49b-4b16-9437-31331625215c@vondra.me>
Content-Language: en-US
From: Tomas Vondra
Precedence: bulk

This is a multi-part message in MIME format.
--------------k93Wv7BdXkBaEMlI9jjs835l
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

On 6/24/25 17:30, Christoph Berg wrote:
> Re: Tomas Vondra
>> If it's a reliable fix, then I guess we can do it like this. But won't
>> that be a performance penalty on everyone? Or does the system split the
>> array into 16-element chunks anyway, so this makes no difference?
>
> There's still the overhead of the syscall itself.
> But no idea how costly it is to have this 16-step loop in user or kernel
> space.
>
> We could claim that on 32-bit systems, shared_buffers would be smaller
> anyway, so there the overhead isn't that big. And the step size should
> be larger (if at all) on 64-bit.
>
>> Anyway, maybe we should start by reporting this to the kernel people. Do
>> you want me to do that, or shall one of you take care of that? I suppose
>> that'd be better, as you already wrote a fix / know the code better.
>
> Submitted: https://marc.info/?l=linux-mm&m=175077821909222&w=2
>

Thanks! Now we wait ...

Attached is a minor tweak of the valgrind suppression rules, to add the
two places touching the memory. I was hoping I could add a single rule
for pg_numa_touch_mem_if_required, but that does not work - it's a
macro, not a function. So I had to add one rule for each of the two
functions querying NUMA. That's a bit disappointing, because it means
the rules will hide all other failures (of Memcheck:Addr8 type) in those
functions.

Perhaps it'd be better to turn pg_numa_touch_mem_if_required into a
proper (inlined) function, at least with USE_VALGRIND defined. Something
like the v2 patch - needs more testing to ensure the inlined function
doesn't break the touching or something silly like that.
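
To make the macro-versus-function point concrete, here is a minimal
standalone sketch (not the v2 patch itself). The names touch_mem_macro,
touch_mem_func, touch_first_word_macro and touch_all_pages are
hypothetical, and uint64_t stands in for PostgreSQL's uint64. The macro
expands into whatever function calls it, so there is no frame of its own
to name in a suppression rule; the function form has a name that a
single rule could in principle match, assuming valgrind still sees it
after inlining.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Macro form: expands in the caller, so the read is attributed to the
 * calling function. The caller supplies the variable receiving the value.
 */
#define touch_mem_macro(ro_volatile_var, ptr) \
	ro_volatile_var = *(volatile uint64_t *) (ptr)

/*
 * Function form (hypothetical name): the volatile read keeps the compiler
 * from optimizing the access away, and the function has its own name.
 */
static inline void
touch_mem_func(char *ptr)
{
	volatile uint64_t ro_volatile_var;

	ro_volatile_var = *(volatile uint64_t *) ptr;
	(void) ro_volatile_var;		/* silence "set but not used" warnings */
}

/* Uses the macro form; returns the 8 bytes read at ptr. */
static uint64_t
touch_first_word_macro(char *ptr)
{
	uint64_t	value;

	touch_mem_macro(value, ptr);
	return value;
}

/*
 * Touch one 8-byte word at the start of every page in [start, start + len).
 * Assumes start is 8-byte aligned and len is a multiple of page_size.
 * Returns the number of pages touched.
 */
static size_t
touch_all_pages(char *start, size_t len, size_t page_size)
{
	size_t		touched = 0;

	for (size_t off = 0; off < len; off += page_size)
	{
		touch_mem_func(start + off);
		touched++;
	}
	return touched;
}
```

Calling touch_all_pages() on a four-page buffer with a 4096-byte page
size touches four words, one per page. Whether the function form still
page-faults reliably once inlined is exactly the part that needs testing.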
regards -- Tomas Vondra --------------k93Wv7BdXkBaEMlI9jjs835l Content-Type: text/x-patch; charset=UTF-8; name="fix-valgrind-for-numa.patch" Content-Disposition: attachment; filename="fix-valgrind-for-numa.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL3NyYy90b29scy92YWxncmluZC5zdXBwIGIvc3JjL3Rvb2xzL3ZhbGdy aW5kLnN1cHAKaW5kZXggN2VhNDY0YzgwOTQuLjM2YmYzMjUzZjc2IDEwMDY0NAotLS0gYS9z cmMvdG9vbHMvdmFsZ3JpbmQuc3VwcAorKysgYi9zcmMvdG9vbHMvdmFsZ3JpbmQuc3VwcApA QCAtMTgwLDMgKzE4MCwyMiBAQAogICAgTWVtY2hlY2s6Q29uZAogICAgZnVuOlB5T2JqZWN0 X1JlYWxsb2MKIH0KKworIyBRdWVyeWluZyBOVU1BIG5vZGUgZm9yIHNoYXJlZCBtZW1vcnkg cmVxdWlyZXMgdG91Y2hpbmcgdGhlIG1lbW9yeSBzbworIyB0aGF0IGl0IGdldHMgYWxsb2Nh dGVkIGluIHRoZSBwcm9jZXNzLiBCdXQgd2UnbGwgdG91Y2ggbWVtb3J5IGJhY2tpbmcKKyMg YnVmZmVycywgYnV0IHRoYXQgbWVtb3J5IG1heSBiZSBtYXJrZWQgYXMgbm9hY2Nlc3MgZm9y IGJ1ZmZlcnMgdGhhdAorIyBhcmUgbm90IHBpbm5lZC4gU28ganVzdCBpZ25vcmUgdGhhdCwg d2UncmUgbm90IHJlYWxseSBhY2Nlc3NpbmcgdGhlCisjIGJ1ZmZlcnMsIGZvciBib3RoIHBs YWNlcyBxdWVyeWluZyB0aGUgTlVNQSBzdGF0dXMuCit7CisgICBwZ19idWZmZXJjYWNoZV9u dW1hX3BhZ2VzCisgICBNZW1jaGVjazpBZGRyOAorICAgZnVuOnBnX2J1ZmZlcmNhY2hlX251 bWFfcGFnZXMKKyAgIGZ1bjpFeGVjTWFrZVRhYmxlRnVuY3Rpb25SZXN1bHQKK30KKworewor ICAgcGdfZ2V0X3NobWVtX2FsbG9jYXRpb25zX251bWEKKyAgIE1lbWNoZWNrOkFkZHI4Cisg ICBmdW46cGdfZ2V0X3NobWVtX2FsbG9jYXRpb25zX251bWEKKyAgIGZ1bjpFeGVjTWFrZVRh YmxlRnVuY3Rpb25SZXN1bHQKK30K --------------k93Wv7BdXkBaEMlI9jjs835l Content-Type: text/x-patch; charset=UTF-8; name="fix-valgrind-for-numa-v2.patch" Content-Disposition: attachment; filename="fix-valgrind-for-numa-v2.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL3NyYy9pbmNsdWRlL3BvcnQvcGdfbnVtYS5oIGIvc3JjL2luY2x1ZGUv cG9ydC9wZ19udW1hLmgKaW5kZXggNDBmMWQzMjRkY2YuLjNiOWE1YjQyODk4IDEwMDY0NAot LS0gYS9zcmMvaW5jbHVkZS9wb3J0L3BnX251bWEuaAorKysgYi9zcmMvaW5jbHVkZS9wb3J0 L3BnX251bWEuaApAQCAtMjQsOSArMjQsMjIgQEAgZXh0ZXJuIFBHRExMSU1QT1JUIGludCBw Z19udW1hX2dldF9tYXhfbm9kZSh2b2lkKTsKICAqIFRoaXMgaXMgcmVxdWlyZWQgb24gTGlu 
dXgsIGJlZm9yZSBwZ19udW1hX3F1ZXJ5X3BhZ2VzKCkgYXMgd2UKICAqIG5lZWQgdG8gcGFn ZS1mYXVsdCBiZWZvcmUgbW92ZV9wYWdlcygyKSBzeXNjYWxsIHJldHVybnMgdmFsaWQgcmVz dWx0cy4KICAqLworI2lmZGVmIFVTRV9WQUxHUklORAorCitzdGF0aWMgaW5saW5lIHZvaWQK K3BnX251bWFfdG91Y2hfbWVtX2lmX3JlcXVpcmVkKHVpbnQ2NCB0bXAsIGNoYXIgKnB0cikK K3sKKwl2b2xhdGlsZSB1aW50NjQgcm9fdm9sYXRpbGVfdmFyIHBnX2F0dHJpYnV0ZV91bnVz ZWQoKTsKKwlyb192b2xhdGlsZV92YXIgPSAqKHZvbGF0aWxlIHVpbnQ2NCAqKSBwdHI7Cit9 CisKKyNlbHNlCisKICNkZWZpbmUgcGdfbnVtYV90b3VjaF9tZW1faWZfcmVxdWlyZWQocm9f dm9sYXRpbGVfdmFyLCBwdHIpIFwKIAlyb192b2xhdGlsZV92YXIgPSAqKHZvbGF0aWxlIHVp bnQ2NCAqKSBwdHIKIAorI2VuZGlmCisKICNlbHNlCiAKICNkZWZpbmUgcGdfbnVtYV90b3Vj aF9tZW1faWZfcmVxdWlyZWQocm9fdm9sYXRpbGVfdmFyLCBwdHIpIFwKZGlmZiAtLWdpdCBh L3NyYy90b29scy92YWxncmluZC5zdXBwIGIvc3JjL3Rvb2xzL3ZhbGdyaW5kLnN1cHAKaW5k ZXggN2VhNDY0YzgwOTQuLjZiOWE4OTk4ZjgyIDEwMDY0NAotLS0gYS9zcmMvdG9vbHMvdmFs Z3JpbmQuc3VwcAorKysgYi9zcmMvdG9vbHMvdmFsZ3JpbmQuc3VwcApAQCAtMTgwLDMgKzE4 MCwxNCBAQAogICAgTWVtY2hlY2s6Q29uZAogICAgZnVuOlB5T2JqZWN0X1JlYWxsb2MKIH0K KworIyBRdWVyeWluZyBOVU1BIG5vZGUgZm9yIHNoYXJlZCBtZW1vcnkgcmVxdWlyZXMgdG91 Y2hpbmcgdGhlIG1lbW9yeSBzbworIyB0aGF0IGl0IGdldHMgYWxsb2NhdGVkIGluIHRoZSBw cm9jZXNzLiBCdXQgd2UnbGwgdG91Y2ggbWVtb3J5IGJhY2tpbmcKKyMgYnVmZmVycywgYnV0 IHRoYXQgbWVtb3J5IG1heSBiZSBtYXJrZWQgYXMgbm9hY2Nlc3MgZm9yIGJ1ZmZlcnMgdGhh dAorIyBhcmUgbm90IHBpbm5lZC4gU28ganVzdCBpZ25vcmUgdGhhdCwgd2UncmUgbm90IHJl YWxseSBhY2Nlc3NpbmcgdGhlCisjIGJ1ZmZlcnMsIGZvciBhbGwgcGxhY2VzIHF1ZXJ5aW5n IHRoZSBOVU1BIHN0YXR1cy4KK3sKKyAgIHBnX251bWFfdG91Y2hfbWVtX2lmX3JlcXVpcmVk CisgICBNZW1jaGVjazpBZGRyOAorICAgZnVuOnBnX251bWFfdG91Y2hfbWVtX2lmX3JlcXVp cmVkCit9Cg== --------------k93Wv7BdXkBaEMlI9jjs835l--