From: Heikki Linnakangas <[email protected]>
To: Ashutosh Bapat <[email protected]>
To: pgsql-hackers <[email protected]>
To: Andres Freund <[email protected]>
Cc: [email protected]
Subject: Re: Better shared data structure management and resizable shared data structures
Date: Fri, 13 Feb 2026 14:03:12 +0200
Message-ID: <[email protected]>
In-Reply-To: <CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com>
References: <CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com>
On 13/02/2026 13:47, Ashutosh Bapat wrote:
> `man madvise` has this
> MADV_REMOVE (since Linux 2.6.16)
> Free up a given range of pages and its associated
> backing store. This is equivalent to punching a
> hole in the corresponding byte range of the backing
> store (see fallocate(2)). Subsequent accesses
> in the specified address range will see bytes containing zero.
>
> The specified address range must be mapped shared
> and writable. This flag cannot be applied to
> locked pages, Huge TLB pages, or VM_PFNMAP pages.
>
> In the initial implementation, only tmpfs(5) was
> supported MADV_REMOVE; but since Linux 3.5, any
> filesystem which supports the fallocate(2)
> FALLOC_FL_PUNCH_HOLE mode also supports MADV_REMOVE.
> Hugetlbfs fails with the error EINVAL and other
> filesystems fail with the error EOPNOTSUPP.
>
> It says the flag can not be applied to Huge TLB pages. We won't be
> able to make resizable shared memory structures allocated with huge
> pages. That seems like a serious restriction.
Per https://man7.org/linux/man-pages/man2/madvise.2.html:
MADV_REMOVE (since Linux 2.6.16)
...
Support for the Huge TLB filesystem was added in Linux
v4.3.
> I may be misunderstanding something, but it seems like this is useful
> to free already allocated memory, not necessarily allocate more
> memory. I don't understand how a user would start with a larger
> reserved address space with only small portions of that space being
> backed by memory.
Hmm, I guess you'll need to use MAP_NORESERVE in the first mmap() call
to reserve address space for the maximum size, and then
madvise(MADV_POPULATE_WRITE) using the initial size. Later,
madvise(MADV_REMOVE) to shrink, and madvise(MADV_POPULATE_WRITE) to grow
again.
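
A rough sketch of that sequence, for illustration only. The helper names
(shm_reserve, shm_populate, shm_release) are made up for this example and
are not from any patch in this thread. It assumes Linux, a shared anonymous
mapping without huge pages, and page-aligned offsets and lengths. The
fallback for kernels that lack MADV_POPULATE_WRITE (added in Linux 5.14) is
also an assumption of the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Older libc headers may lack this constant; 23 is the Linux UAPI value. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

static size_t
page_size(void)
{
    return (size_t) sysconf(_SC_PAGESIZE);
}

/*
 * Hypothetical helper: reserve max_size bytes of address space without
 * committing memory for it. Returns NULL on failure.
 */
static void *
shm_reserve(size_t max_size)
{
    void *p = mmap(NULL, max_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: back [offset, offset + len) with memory.
 * Used both for the initial size and for growing later.
 */
static int
shm_populate(void *base, size_t offset, size_t len)
{
    char   *start = (char *) base + offset;
    size_t  page = page_size();

    assert(offset % page == 0 && len % page == 0);

    if (madvise(start, len, MADV_POPULATE_WRITE) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;              /* e.g. ENOMEM: could not allocate */

    /*
     * Assumed fallback for kernels without MADV_POPULATE_WRITE:
     * write-fault each page by hand. Unlike madvise(), this gets
     * SIGBUS instead of an error code if memory runs out.
     */
    for (size_t i = 0; i < len; i += page)
    {
        volatile char *c = start + i;

        *c = *c;
    }
    return 0;
}

/*
 * Hypothetical helper: give the memory behind [offset, offset + len)
 * back to the kernel. The address range stays mapped and reads as
 * zeros if it is touched again.
 */
static int
shm_release(void *base, size_t offset, size_t len)
{
    size_t  page = page_size();

    assert(offset % page == 0 && len % page == 0);

    return madvise((char *) base + offset, len, MADV_REMOVE);
}
```

So startup would be shm_reserve(max_size) followed by
shm_populate(base, 0, initial_size); shrinking is shm_release() on the
tail, and growing is shm_populate() on the same range again.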
- Heikki