From: Tomas Vondra <[email protected]>
To: Christoph Berg <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: [email protected]
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
Date: Tue, 24 Jun 2025 03:43:19 +0200
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <g3mywoeo7jmh6rci7epx2ishowgz65q2j7ek3c5f3lxcmvuktg@ler2fsv4szmn>
<[email protected]>
<[email protected]>
<kl4zd72eeaex7zcicpuvpsuslrs5nfvmab7xzt4jnvcjvd6mxw@tcp64c55qkpj>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
On 6/23/25 23:47, Tomas Vondra wrote:
> ...
>
> Or maybe the 32-bit chroot on 64-bit host matters and confuses some
> calculation.
>
I think it's likely something like this. I noticed that if I modify
pg_buffercache_numa_pages() to query the addresses one by one, it works.
When I increase the number of addresses queried per call, it stops
working somewhere between 16k and 17k items.
It may be a coincidence, but I suspect it's related to sizeof(void *)
being 8 in the kernel, but only 4 in the chroot. So the userspace
passes an array of 4-byte items, but the kernel interprets that as an
array of 8-byte items. That is, we call
    long move_pages(int pid, unsigned long count, void *pages[.count],
                    const int nodes[.count], int status[.count], int flags);
Which (I assume) just passes the parameters to the kernel, and the
kernel will interpret them according to its own pointer size.
If this is what's happening, I'm not sure what to do about it ...
FWIW while looking into this, I tried running this under valgrind (on a
regular 64-bit system, not in the chroot), and I get this report:
==65065== Invalid read of size 8
==65065==    at 0x113B0EBE: pg_buffercache_numa_pages (pg_buffercache_pages.c:380)
==65065==    by 0x6B539D: ExecMakeTableFunctionResult (execSRF.c:234)
==65065==    by 0x6CEB7E: FunctionNext (nodeFunctionscan.c:94)
==65065==    by 0x6B6ACA: ExecScanFetch (execScan.h:126)
==65065==    by 0x6B6B31: ExecScanExtended (execScan.h:170)
==65065==    by 0x6B6C9D: ExecScan (execScan.c:59)
==65065==    by 0x6CEF0F: ExecFunctionScan (nodeFunctionscan.c:269)
==65065==    by 0x6B29FA: ExecProcNodeFirst (execProcnode.c:469)
==65065==    by 0x6A6F56: ExecProcNode (executor.h:313)
==65065==    by 0x6A9533: ExecutePlan (execMain.c:1679)
==65065==    by 0x6A7422: standard_ExecutorRun (execMain.c:367)
==65065==    by 0x6A7330: ExecutorRun (execMain.c:304)
==65065==    by 0x934EF0: PortalRunSelect (pquery.c:921)
==65065==    by 0x934BD8: PortalRun (pquery.c:765)
==65065==    by 0x92E4CD: exec_simple_query (postgres.c:1273)
==65065==    by 0x93301E: PostgresMain (postgres.c:4766)
==65065==    by 0x92A88B: BackendMain (backend_startup.c:124)
==65065==    by 0x85A7C7: postmaster_child_launch (launch_backend.c:290)
==65065==    by 0x860111: BackendStartup (postmaster.c:3580)
==65065==    by 0x85DE6F: ServerLoop (postmaster.c:1702)
==65065==  Address 0x7b6c000 is in a rw- anonymous segment
This fails here (on the pg_numa_touch_mem_if_required call):
    for (char *ptr = startptr; ptr < endptr; ptr += os_page_size)
    {
        os_page_ptrs[idx++] = ptr;
        /* Only need to touch memory once per backend process */
        if (firstNumaTouch)
            pg_numa_touch_mem_if_required(touch, ptr);
    }
0x7b6c000 is the very first pointer, and it's the only pointer that
triggers this warning. At first I thought there was something wrong with
how we align the pointer using TYPEALIGN_DOWN(), but then I noticed it's
actually the pointer returned by BufferGetBlock(1).
So I'm a bit puzzled by this, and I'm not sure it's related to the other
issue at all (it probably is not).
It's a bit too late here, I'll continue investigating this tomorrow.
--
Tomas Vondra