public inbox for [email protected]
help / color / mirror / Atom feedFrom: Christoph Berg <[email protected]>
To: Bertrand Drouvot <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Tomas Vondra <[email protected]>
Cc: [email protected]
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
Date: Tue, 24 Jun 2025 16:41:33 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <aFqnM/[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<aFqHoNXQ/[email protected]>
<[email protected]>
<aFqnM/[email protected]>
Re: Bertrand Drouvot
> Yes, I think compat_uptr_t usage is missing in do_pages_stat() (while it's used
> in do_pages_move()).
I was also reading the kernel source around that place but you spotted
the problem before me. This patch resolves the issue here:
diff --git a/mm/migrate.c b/mm/migrate.c
index 8cf0f9c9599..542c81ec3ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2444,7 +2444,13 @@ static int do_pages_stat(struct mm_struct *mm, unsigned long nr_pages,
if (copy_to_user(status, chunk_status, chunk_nr * sizeof(*status)))
break;
- pages += chunk_nr;
+ if (in_compat_syscall()) {
+ compat_uptr_t __user *pages32 = (compat_uptr_t __user *)pages;
+
+ pages32 += chunk_nr;
+ pages = (const void __user * __user *) pages32;
+ } else
+ pages += chunk_nr;
status += chunk_nr;
nr_pages -= chunk_nr;
}
> Having a chunk size <= DO_PAGES_STAT_CHUNK_NR ensures we are not affected
> by the wrong pointer arithmetic.
Good idea. Buggy kernels will be around for some time.
> + #define NUMA_QUERY_CHUNK_SIZE 16 /* has to be <= DO_PAGES_STAT_CHUNK_NR (do_pages_stat())*/
> +
> + for (uint64 chunk_start = 0; chunk_start < shm_ent_page_count; chunk_start += NUMA_QUERY_CHUNK_SIZE) {
Perhaps optimize it to this:
#if sizeof(void *) == 4
#define NUMA_QUERY_CHUNK_SIZE 16 /* has to be <= DO_PAGES_STAT_CHUNK_NR (do_pages_stat())*/
#else
#define NUMA_QUERY_CHUNK_SIZE 1024
#endif
... or some other bigger number.
The loop could also include CHECK_FOR_INTERRUPTS();
> > I don't have much
> > experience with the kernel code, so don't want to rely too much on my
> > interpretation of it.
>
> I don't have that much experience too but I think the issue is in do_pages_stat()
> and that "pages += chunk_nr" should be advanced by sizeof(compat_uptr_t) instead.
Me neither, but I'll try submit this fix.
Christoph
view thread (83+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox