From: Ashutosh Bapat <[email protected]>
To: Andres Freund <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: [email protected]
Subject: Re: Better shared data structure management and resizable shared data structures
Date: Tue, 17 Feb 2026 17:06:24 +0530
Message-ID: <CAExHW5uEK+eeG7e2g6uWh7POrFpfp+dqfaa=_3miMN17zgeaJw@mail.gmail.com> (raw)
In-Reply-To: <mlsruptoxgm2nqtdfyfsowjklzxl5zltsjb3y5bmywtigm474l@5tsonk4t3kia>
References: <CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com>
	<[email protected]>
	<CAExHW5s9Vp+-vJi020UJ+otyccBBo7eT1g6bttdRKL6HAvscyQ@mail.gmail.com>
	<mlsruptoxgm2nqtdfyfsowjklzxl5zltsjb3y5bmywtigm474l@5tsonk4t3kia>

On Mon, Feb 16, 2026 at 11:26 PM Andres Freund <[email protected]> wrote:
>
> I think we *do* want the MADV_POPULATE_WRITE, at least when using huge pages,
> because otherwise you'll get a SIGBUS when accessing the memory if there is no
> huge page available anymore.
>

Ok.

Jakub's experiments [1] showed that fallocate()ing shared memory slows
down postmaster start on a slow machine. I suppose the same applies to
MADV_POPULATE_WRITE. We don't populate the memory today even in the
case of huge pages, so the SIGBUS problem already exists.

If we perform MADV_POPULATE_WRITE, do we want it only for resizable
shared memory structures or for all the structures in shared memory?
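
For reference, a minimal sketch of what prefaulting with MADV_POPULATE_WRITE could look like on an anonymous shared mapping. It assumes Linux 5.14 or later. `map_and_populate` is a hypothetical helper, not something from the attached patches, and MAP_HUGETLB is left out so that the sketch runs on machines without reserved huge pages.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* value from the Linux UAPI headers (5.14+) */
#endif

/*
 * Hypothetical helper: map 'len' bytes of anonymous shared memory and ask
 * the kernel to fault in every page writably right away.
 *
 * Returns the mapping, or NULL if mmap() fails.  *populated is set to 1 if
 * the prefault succeeded and to 0 if the kernel refused it (for example
 * EINVAL on kernels that do not know MADV_POPULATE_WRITE).
 */
static void *
map_and_populate(size_t len, int *populated)
{
	void	   *p;

	assert(populated != NULL);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	*populated = (madvise(p, len, MADV_POPULATE_WRITE) == 0);
	return p;
}
```

Called on the whole segment, this pays the page-fault cost at startup; called only on the resizable structures, it would limit that cost to the regions that are later shrunk and regrown. Which of the two is wanted is the open question above.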

On Mon, Feb 16, 2026 at 11:02 PM Heikki Linnakangas <[email protected]> wrote:
>
> On 16/02/2026 16:52, Ashutosh Bapat wrote:
> > 2. to use madvise() the address needs to be backed by a file, so
> > memfd_create is a must.
>
> It seems to work fine for anonymous mmapped memory here. See attached
> test program.

On Mon, Feb 16, 2026 at 11:26 PM Andres Freund <[email protected]> wrote:
> > 2. to use madvise() the address needs to be backed by a file, so
> > memfd_create is a must.
>
> I am quite sure that that is not true.  I hacked this up with today's
> postgres, and the madvise works with the mmap() backed allocation from
> sysv_shmem.c, which is anonymous.
>
> What made you conclude that that is the case?
>

You are right. I was misled by the following sentence in `man
madvise`: "but since Linux 3.5, any filesystem which supports the
fallocate(2) FALLOC_FL_PUNCH_HOLE mode also supports MADV_REMOVE.
Filesystems which do not support MADV_REMOVE fail with the error
EOPNOTSUPP." In a subsequent experiment I dropped MAP_ANONYMOUS from
mmap() and then called madvise(), which obviously didn't work. My bad.

In the attached patches, I have got rid of memfd_create. That
simplifies the code.
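
A small sketch of the behaviour confirmed above, assuming Linux: MADV_REMOVE on a MAP_SHARED | MAP_ANONYMOUS mapping, with no memfd_create involved. This is not the test program attached upthread; `remove_reads_back_zero` is a hypothetical name used only for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Fill an anonymous shared mapping, release its second half with
 * MADV_REMOVE, and check what is read back afterwards.
 *
 * Returns 1 if the released half reads as zeros while the first half is
 * intact, 0 if not, and -1 if mmap() or madvise() fails.
 */
static int
remove_reads_back_zero(void)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	size_t		len;
	size_t		i;
	int			ok = 1;
	unsigned char *p;

	assert(pagesz > 0);
	len = 8 * (size_t) pagesz;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAB, len);

	/* Both the start address and the length are page aligned here. */
	if (madvise(p + len / 2, len / 2, MADV_REMOVE) != 0)
	{
		munmap(p, len);
		return -1;
	}

	for (i = 0; i < len / 2; i++)
		if (p[i] != 0xAB)
			ok = 0;				/* kept half must be untouched */
	for (i = len / 2; i < len; i++)
		if (p[i] != 0)
			ok = 0;				/* released half must read as zeros */

	munmap(p, len);
	return ok;
}
```

The read of the released half allocates fresh zero pages again, which is the "silent" reallocation discussed further down.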


>
> > 4. the address and length passed to madvise needs to be page aligned,
> > but that passed to fallocate() needn't be. `man fallocate` says
> > "Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux
> > 2.6.38) in mode deallocates space (i.e., creates a hole) in the byte
> > range starting at offset and continuing for len bytes. Within the
> > specified range, partial filesystem blocks are zeroed, and whole
> > filesystem blocks are removed from the file.". It seems to be
> > automatically taking care of the page size. So using fallocate()
> > simplifies logic. Further `man madvise` says "but since Linux 3.5, any
> > filesystem which supports the fallocate(2) FALLOC_FL_PUNCH_HOLE mode
> > also supports MADV_REMOVE." fallocate with FALLOC_FL_PUNCH_HOLE is
> > guaranteed to be available on a system which supports MADV_REMOVE.
>
> I think it makes no sense to support resizing below page size
> granularity. What's the point of doing that?
>

No point really. But we cannot stop extensions from specifying a
maximum size smaller than the page size. They wouldn't know what page
size the underlying machine will have, especially with huge pages,
which come in a wide range of sizes. Even in the case of shared
buffers, a given value of max_shared_buffers may cause the buffer
blocks to span pages while the other structures fit within a page.

In the attached patches, a resizable structure whose max_size is
smaller than the page size is treated as a fixed structure with size =
max_size. A request to resize such a structure simply updates the
metadata without any madvise() call. Only structures whose max_size >
page_size are treated as truly resizable and use madvise().

You bring up another interesting point. If a resizable structure has a
maximum size larger than the page size, but is allocated such that its
first part lies on a page shared with the preceding structure and its
last part on a page shared with the following structure, those pages
can never be freed. Per the logic in the attached patches, all the
fixed (and pseudo-resizable) structures are packed together. The
resizable structures start on a page boundary and their max_sizes are
rounded up to a multiple of the page size. That way we can release
pages whenever a structure shrinks by more than a page.
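
The arithmetic behind that layout can be sketched as follows. The function names are hypothetical and do not come from the patches; the sketch assumes a power-of-two page size and a structure that starts on a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align; align must be a power of two. */
static size_t
align_up(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/*
 * A structure whose max_size does not exceed one page is treated as fixed;
 * resizing it only updates metadata.
 */
static int
is_truly_resizable(size_t max_size, size_t page_size)
{
	return max_size > page_size;
}

/*
 * Bytes that could be released when a page-aligned structure shrinks from
 * old_size to new_size: only whole pages beyond the new end qualify.
 */
static size_t
releasable_bytes(size_t old_size, size_t new_size, size_t page_size)
{
	size_t		old_end = align_up(old_size, page_size);
	size_t		new_end = align_up(new_size, page_size);

	return old_end > new_end ? old_end - new_end : 0;
}
```

For example, with 4096-byte pages, shrinking from three pages to one page plus 100 bytes leaves two pages in use and one page releasable.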

>
> > Using fallocate() (or madvise()) to free memory, we don't need
> > multiple segments. So much less code churn compared to the multiple
> > mappings approach. However, there is one drawback. In the multiple
> > mapping approach access beyond the current size of the structure would
> > result in segfault or bus error. But in the fallocate/madvise approach
> > such an access does not cause a crash. A write beyond the pages that
> > fit the current size of the structure causes more memory to be
> > allocated silently. A read returns 0s. So, there's a possibility that
> > bugs in size calculations might go unnoticed. I think that's how it
> > works even today, access in the yet un-allocated part of the shared
> > memory will simply go unnoticed.
>
> If that's something you care about, you can mprotect(PROT_NONE) the relevant
> regions.

I am fine with letting go of this protection as part of getting rid
of multiple segments, if we all agree to do so.

I could be wrong, but mprotect() needs to be executed in every backend
where the memory is mapped, and a new backend then needs to inherit
the protection from the postmaster. That makes resizing complex, since
it has to touch every backend. So avoiding mprotect() is better.
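
A small sketch of the per-process nature of mprotect(), assuming ordinary fork() semantics on Linux: a child revokes access to a shared mapping, and the parent's view of the same memory stays writable. `parent_unaffected_by_child_mprotect` is a hypothetical name used only for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent can still write to a shared mapping after a
 * forked child has applied mprotect(PROT_NONE) to it, 0 if the value
 * written does not read back, and -1 on any setup failure.
 */
static int
parent_unaffected_by_child_mprotect(void)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	char	   *p;
	pid_t		pid;
	int			status;
	int			result;

	assert(pagesz > 0);

	p = mmap(NULL, (size_t) pagesz, PROT_READ | PROT_WRITE,
			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	pid = fork();
	if (pid < 0)
	{
		munmap(p, (size_t) pagesz);
		return -1;
	}
	if (pid == 0)
	{
		/* Child: revoke access in this process only, then exit. */
		_exit(mprotect(p, (size_t) pagesz, PROT_NONE) == 0 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) != pid ||
		!WIFEXITED(status) || WEXITSTATUS(status) != 0)
	{
		munmap(p, (size_t) pagesz);
		return -1;
	}

	/* This store would fault if the child's mprotect applied here too. */
	p[0] = 42;
	result = (p[0] == 42);

	munmap(p, (size_t) pagesz);
	return result;
}
```

The protection change stays local to the child, so a PROT_NONE guard would have to be applied, and later lifted, separately in each process that has the segment mapped.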

[1] https://www.postgresql.org/message-id/CAKZiRmwxVqEbp7JgOed%3DBCT6cq8RNuHk3N0vuwro65Tsw9E8NA%40mail.g...

PFA patches.

-- 
Best Wishes,
Ashutosh Bapat


Attachments:

  [text/x-patch] 0002-Get-rid-of-global-shared-memory-pointer-mac-20260217.patch (15.0K, 2-0002-Get-rid-of-global-shared-memory-pointer-mac-20260217.patch)
From 395a95e9934286869b1fe8d45dc5a155ea9be030 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 10 Feb 2026 20:26:31 +0530
Subject: [PATCH 2/3] Get rid of global shared memory pointer macro
 declarations

---
 .../pg_stat_statements/pg_stat_statements.c   | 10 ++++---
 src/backend/access/transam/varsup.c           |  5 ++++
 src/backend/storage/ipc/dsm.c                 |  5 ++--
 src/backend/storage/ipc/dsm_registry.c        |  4 ++-
 src/backend/storage/ipc/pmsignal.c            | 18 +++++-------
 src/backend/storage/ipc/procarray.c           | 13 +++++----
 src/backend/storage/ipc/procsignal.c          |  4 ++-
 src/backend/storage/ipc/shmem.c               | 29 ++++++++++++-------
 src/backend/storage/ipc/sinvaladt.c           |  7 +++--
 src/backend/storage/lmgr/proc.c               | 25 ++++++++++------
 src/include/access/transam.h                  |  2 +-
 src/include/storage/shmem.h                   |  6 ++--
 12 files changed, 77 insertions(+), 51 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 71debc8b47f..73fdf561419 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,24 +258,26 @@ typedef struct pgssSharedState
 	pgssGlobalStats stats;		/* global statistics for pgss */
 } pgssSharedState;
 
+/* Links to shared memory state */
+pgssSharedState *pgss = NULL;
+HTAB *pgss_hash = NULL;
+
 static void pgss_shmem_init(void *arg);
 
 static ShmemStructDesc pgssSharedStateShmemDesc = {
 	.name = "pg_stat_statements",
 	.size = sizeof(pgssSharedState),
 	.init_fn = pgss_shmem_init,
+	.ptr = (void *) &pgss,
 };
 
 static ShmemHashDesc pgssSharedHashDesc = {
 	.name = "pg_stat_statements hash",
 	.init_size = 0,		/* set from 'pgss_max' */
 	.max_size = 0,		/* set from 'pgss_max' */
+	.ptr = &pgss_hash,
 };
 
-/* Links to shared memory state */
-#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
-#define pgss_hash (pgssSharedHashDesc.ptr)
-
 
 /*---- Local variables ----*/
 
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 11ad90e7372..3dfda875e80 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -32,10 +32,14 @@
 
 static void VarsupShmemInit(void *arg);
 
+/* pointer to variables struct in shared memory */
+TransamVariablesData *TransamVariables = NULL;
+
 ShmemStructDesc TransamVariablesShmemDesc = {
 	.name = "TransamVariables",
 	.size = sizeof(TransamVariablesData),
 	.init_fn = VarsupShmemInit,
+	.ptr = (void **) &TransamVariables,
 };
 
 /*
@@ -49,6 +53,7 @@ VarsupShmemRegister(void)
 
 static void
 VarsupShmemInit(void *arg)
+
 {
 	memset(TransamVariables, 0, sizeof(TransamVariablesData));
 }
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 55f46c7687e..73644ec3bbb 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -110,14 +110,15 @@ static bool dsm_init_done = false;
 /* Preallocated DSM space in the main shared memory region. */
 static void dsm_main_space_init(void *);
 
+static void *dsm_main_space_begin = NULL;
+
 static ShmemStructDesc dsm_main_space_shmem_desc = {
 	.name = "Preallocated DSM",
 	.size = 0, /* dynamic */
 	.init_fn = dsm_main_space_init,
+	.ptr = &dsm_main_space_begin,
 };
 
-#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
-
 /*
  * List of dynamic shared memory segments used by this backend.
  *
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 882af83b7b2..1659e1dd71d 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -56,13 +56,15 @@ typedef struct DSMRegistryCtxStruct
 
 static void DSMRegistryCtxShmemInit(void *arg);
 
+DSMRegistryCtxStruct *DSMRegistryCtx = NULL;
+
 static ShmemStructDesc DSMRegistryCtxShmemDesc = {
 	.name = "DSM Registry Data",
 	.size = sizeof(DSMRegistryCtxStruct),
 	.init_fn = DSMRegistryCtxShmemInit,
+	.ptr = (void **) &DSMRegistryCtx,
 };
 
-#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
 
 typedef struct NamedDSMState
 {
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 23752500d16..3aa0380eadd 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -82,21 +82,17 @@ struct PMSignalData
 
 static void PMSignalShmemInit(void *);
 
-static ShmemStructDesc PMSignalShmemDesc = {
-	.name = "PMSignalState",
-	.size = 0, /* dynamic */
-	.init_fn = PMSignalShmemInit,
-};
-
 /*
  * PMSignalState pointer is valid in both postmaster and child processes
- *
- * This is a stand-alone variable rather than just a #define over
- * PMSignalShmemDesc.ptr because it is needed early at backend startup and
- * passed as a backend parameter in EXEC_BACKEND mode
  */
 NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
 
+static ShmemStructDesc PMSignalShmemDesc = {
+	.name = "PMSignalState",
+	.size = 0, /* dynamic */
+	.init_fn = PMSignalShmemInit,
+	.ptr = (void **) &PMSignalState,
+};
 
 /*
  * Local copy of PMSignalState->num_child_flags, only valid in the
@@ -156,7 +152,7 @@ static void
 PMSignalShmemInit(void *arg)
 {
 	/* initialize all flags to zeroes */
-	PMSignalState = PMSignalShmemDesc.ptr;
+	Assert(PMSignalState);
 	MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
 	num_child_flags = MaxLivePostmasterChildren();
 	PMSignalState->num_child_flags = num_child_flags;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 08c63bcb2a7..736504d3a3e 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -104,15 +104,16 @@ typedef struct ProcArrayStruct
 static void ProcArrayShmemInit(void *arg);
 static void ProcArrayShmemAttach(void *arg);
 
+ProcArrayStruct *procArray = NULL;
+
 static ShmemStructDesc ProcArrayShmemDesc = {
 	.name = "Proc Array",
 	.size = 0, /* dynamic */
 	.init_fn = ProcArrayShmemInit,
 	.attach_fn = ProcArrayShmemAttach,
+	.ptr = (void **) &procArray,
 };
 
-#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
-
 /*
  * State for the GlobalVisTest* family of functions. Those functions can
  * e.g. be used to decide if a deleted row can be removed without violating
@@ -290,22 +291,24 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
  * Bookkeeping for tracking emulated transactions in recovery
  */
 
+TransactionId *KnownAssignedXids = NULL;
+
 static ShmemStructDesc KnownAssignedXidsShmemDesc = {
 	.name = "KnownAssignedXids",
 	.size = 0, /* dynamic */
 	.init_fn = NULL,
+	.ptr = (void **) &KnownAssignedXids,
 };
 
-#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+bool *KnownAssignedXidsValid = NULL;
 
 static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
 	.name = "KnownAssignedXidsValid",
 	.size = 0, /* dynamic */
 	.init_fn = NULL,
+	.ptr = (void **) &KnownAssignedXidsValid,
 };
 
-#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
-
 static TransactionId latestObservedXid = InvalidTransactionId;
 
 /*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 5743f088324..eec04eae3f4 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -104,13 +104,15 @@ struct ProcSignalHeader
 
 static void ProcSignalShmemInit(void *arg);
 
+ProcSignalHeader *ProcSignal = NULL;
+
 static ShmemStructDesc ProcSignalShmemDesc = {
 	.name = "ProcSignal",
 	.size = 0, /* dynamic */
 	.init_fn = ProcSignalShmemInit,
+	.ptr = (void **) &ProcSignal,
 };
 
-#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
 
 static ProcSignalSlot *MyProcSignalSlot = NULL;
 
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index faa0fcbd21e..e73ac489b2b 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -118,17 +118,18 @@ static void *ShmemEnd;			/* end+1 address of shared memory */
 
 static ShmemAllocatorData *ShmemAllocator;
 slock_t    *ShmemLock;			/* points to ShmemAllocator->shmem_lock */
+ /* primary index hashtable for shmem */
+HTAB *ShmemIndex = NULL;
+
 
 
 static ShmemHashDesc ShmemIndexHashDesc = {
 	.name = "ShmemIndex",
 	.init_size = SHMEM_INDEX_SIZE,
 	.max_size = SHMEM_INDEX_SIZE,
+	.ptr = &ShmemIndex
 };
 
- /* primary index hashtable for shmem */
-#define ShmemIndex (ShmemIndexHashDesc.ptr)
-
 
 /* To get reliable results for NUMA inquiry we need to "touch pages" once */
 static bool firstNumaTouch = true;
@@ -205,7 +206,7 @@ ShmemInitRegistered(void)
 		result->allocated_size = allocated_size;
 		result->location = structPtr;
 
-		registry[i]->ptr = structPtr;
+		*(registry[i]->ptr) = structPtr;
 		if (registry[i]->init_fn)
 			registry[i]->init_fn(registry[i]->init_fn_arg);
 	}
@@ -239,7 +240,7 @@ ShmemAttachRegistered(void)
 							registry[i]->name)));
 		}
 
-		registry[i]->ptr = result->location;
+		*registry[i]->ptr = result->location;
 
 		if (registry[i]->attach_fn)
 			registry[i]->attach_fn(registry[i]->attach_fn_arg);
@@ -425,10 +426,11 @@ InitShmemIndex(void)
 	info.keysize = SHMEM_INDEX_KEYSIZE;
 	info.entrysize = sizeof(ShmemIndexEnt);
 
-	ShmemIndex = ShmemInitHash("ShmemIndex",
+	*ShmemIndexHashDesc.ptr = ShmemInitHash("ShmemIndex",
 							   SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
 							   &info,
 							   HASH_ELEM | HASH_STRINGS);
+	Assert(ShmemIndex != NULL && *ShmemIndexHashDesc.ptr == ShmemIndex);
 }
 
 /*
@@ -482,6 +484,12 @@ ShmemRegisterHash(ShmemHashDesc *desc,		/* configuration */
 	desc->base_desc.init_fn_arg = desc;
 	desc->base_desc.attach_fn = shmem_hash_attach;
 	desc->base_desc.attach_fn_arg = desc;
+	/*
+	 * We need a stable pointer to hold the pointer to the shared memory. Use
+	 * the one passed in the descriptor now. It will be replaced with the hash
+	 * table header by init or attach function.
+	 */
+	desc->base_desc.ptr = (void **) desc->ptr;
 
 	desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
 
@@ -499,10 +507,9 @@ shmem_hash_init(void *arg)
 	int			hash_flags = desc->hash_flags;
 
 	/* Pass location of hashtable header to hash_create */
-	desc->ptr = desc->base_desc.ptr;
-	desc->infoP->hctl = (HASHHDR *) desc->ptr;
+	desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
 
-	desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+	*desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
 }
 
 static void
@@ -518,9 +525,9 @@ shmem_hash_attach(void *arg)
 	hash_flags |= HASH_ATTACH;
 
 	/* Pass location of hashtable header to hash_create */
-	desc->infoP->hctl = (HASHHDR *) desc->ptr;
+	desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
 
-	desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+	*desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
 }
 
 /*
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 0fe0f256971..8321bd9b52d 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -205,15 +205,16 @@ typedef struct SISeg
 
 static void SharedInvalShmemInit(void *arg);
 
+/* pointer to the shared inval buffer */
+SISeg *shmInvalBuffer = NULL;
+
 static ShmemStructDesc SharedInvalShmemDesc = {
 	.name = "shmInvalBuffer",
 	.size = 0,	/* dynamic */
 	.init_fn = SharedInvalShmemInit,
+	.ptr = (void **) &shmInvalBuffer,
 };
 
-/* pointer to the shared inval buffer */
-#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
-
 
 static LocalTransactionId nextLocalTransactionId;
 
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 85375b5195e..a3d6557aa9d 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -77,27 +77,33 @@ PGPROC	   *MyProc = NULL;
 
 static void ProcGlobalShmemInit(void *arg);
 
+/* Pointers to shared-memory structures */
+PROC_HDR   *ProcGlobal = NULL;
+void *tmpAllProcs = NULL;
+void *tmpFastPathLockArray = NULL;
+NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC	   *PreparedXactProcs = NULL;
+
+
 static ShmemStructDesc ProcGlobalShmemDesc = {
 	.name = "Proc Header",
 	.size = sizeof(PROC_HDR),
 	.init_fn = ProcGlobalShmemInit,
+	.ptr = (void **) &ProcGlobal,
 };
 
 static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
 	.name = "PGPROC structures",
 	.size = 0, /* dynamic */
+	.ptr = (void **) &tmpAllProcs,
 };
 
 static ShmemStructDesc FastPathLockArrayShmemDesc = {
 	.name = "Fast-Path Lock Array",
 	.size = 0, /* dynamic */
+	.ptr = (void **) &tmpFastPathLockArray,
 };
 
-/* Pointers to shared-memory structures */
-PROC_HDR   *ProcGlobal = NULL;
-NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
-PGPROC	   *PreparedXactProcs = NULL;
-
 static uint32 TotalProcs;
 
 static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
@@ -222,8 +228,7 @@ ProcGlobalShmemInit(void *arg)
 	Size		fpLockBitsSize,
 				fpRelIdSize;
 
-	ProcGlobal = ProcGlobalShmemDesc.ptr;
-
+	Assert(ProcGlobal);
 	ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
 	dlist_init(&ProcGlobal->freeProcs);
 	dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -236,7 +241,8 @@ ProcGlobalShmemInit(void *arg)
 	pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
 	SpinLockInit(ProcStructLock);
 
-	ptr = ProcGlobalAllProcsShmemDesc.ptr;
+	Assert(tmpAllProcs);
+	ptr = tmpAllProcs;
 	requestSize = ProcGlobalAllProcsShmemDesc.size;
 	memset(ptr, 0, requestSize);
 
@@ -274,7 +280,8 @@ ProcGlobalShmemInit(void *arg)
 	fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
 	fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
 
-	fpPtr = FastPathLockArrayShmemDesc.ptr;
+	Assert(tmpFastPathLockArray);
+	fpPtr = tmpFastPathLockArray;
 	requestSize = FastPathLockArrayShmemDesc.size;
 	memset(fpPtr, 0, requestSize);
 
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 49d476e9d5c..6e5a546f411 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -334,7 +334,7 @@ extern bool TransactionStartedDuringRecovery(void);
 /* in transam/varsup.c */
 #ifndef FRONTEND
 extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
-#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+extern PGDLLIMPORT TransamVariablesData *TransamVariables;
 #endif
 
 /*
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 40e2fc17056..cbd4ef8d03f 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,8 +50,8 @@ typedef struct ShmemStructDesc
 	 */
 	size_t		extra_size;
 
-	/* Pointer to the shared memory area, when it's allocated. */
-	void	   *ptr;
+	/* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
+	void	   **ptr;
 } ShmemStructDesc;
 
 /*
@@ -67,7 +67,7 @@ typedef struct ShmemHashDesc
 	size_t		max_size;		/* max number of entries */
 	HASHCTL	   *infoP;
 
-	HTAB	   *ptr;
+	HTAB	   **ptr;
 
 	ShmemStructDesc	base_desc;
 } ShmemHashDesc;
-- 
2.34.1
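
The core idea of the patch above, a descriptor that stores the address of an ordinary global pointer instead of holding the pointer itself, can be sketched in isolation. Everything here is a simplified, hypothetical stand-in: `DemoStructDesc`, `Counter` and `demo_init_registered` are not PostgreSQL names, and calloc() takes the place of the shared memory allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for ShmemStructDesc; not the real definition. */
typedef struct DemoStructDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	  **ptr;			/* address of the variable that receives
								 * the allocated area */
} DemoStructDesc;

typedef struct Counter
{
	int			value;
} Counter;

/* Ordinary global pointer, used directly by the rest of the module. */
static Counter *counter = NULL;

static void
counter_init(void *arg)
{
	(void) arg;
	counter->value = 7;			/* the pointer is already set at this point */
}

static DemoStructDesc counter_desc = {
	.name = "counter",
	.size = sizeof(Counter),
	.init_fn = counter_init,
	.ptr = (void **) &counter,	/* same cast as used in the patch */
};

/* Stand-in for the registry loop: allocate, publish, then initialize. */
static int
demo_init_registered(DemoStructDesc *desc)
{
	void	   *mem;

	assert(desc->ptr != NULL);

	mem = calloc(1, desc->size);
	if (mem == NULL)
		return -1;

	*desc->ptr = mem;			/* assign through the descriptor */
	if (desc->init_fn)
		desc->init_fn(NULL);
	return 0;
}
```

Because the allocator writes through `desc->ptr`, the module-level variable is valid before `init_fn` runs, which is why the init functions in the patch can replace their explicit assignments with an Assert().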



  [text/x-patch] 0001-wip-Introduce-a-new-way-of-registering-shar-20260217.patch (53.8K, 3-0001-wip-Introduce-a-new-way-of-registering-shar-20260217.patch)
From 49676c5ba088d13236f2c1c66800d7e7b1abbe5f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 9 Feb 2026 22:28:23 +0200
Subject: [PATCH 1/3] wip: Introduce a new way of registering shared memory
 structs

---
 .../pg_stat_statements/pg_stat_statements.c   | 112 ++++-----
 src/backend/access/transam/varsup.c           |  32 +--
 src/backend/bootstrap/bootstrap.c             |   2 +
 src/backend/postmaster/launch_backend.c       |  11 +-
 src/backend/postmaster/postmaster.c           |   2 +
 src/backend/storage/ipc/dsm.c                 |  46 ++--
 src/backend/storage/ipc/dsm_registry.c        |  34 ++-
 src/backend/storage/ipc/ipci.c                |  51 ++--
 src/backend/storage/ipc/pmsignal.c            |  53 ++--
 src/backend/storage/ipc/procarray.c           | 127 +++++-----
 src/backend/storage/ipc/procsignal.c          |  63 +++--
 src/backend/storage/ipc/shmem.c               | 233 +++++++++++++++++-
 src/backend/storage/ipc/sinvaladt.c           |  39 +--
 src/backend/storage/lmgr/proc.c               | 156 ++++++------
 src/backend/tcop/postgres.c                   |   2 +
 src/include/access/transam.h                  |  12 +-
 src/include/storage/dsm_registry.h            |   3 +-
 src/include/storage/ipc.h                     |   1 +
 src/include/storage/pmsignal.h                |   3 +-
 src/include/storage/proc.h                    |   5 +-
 src/include/storage/procarray.h               |   3 +-
 src/include/storage/procsignal.h              |   3 +-
 src/include/storage/shmem.h                   |  57 +++++
 src/include/storage/sinvaladt.h               |   3 +-
 24 files changed, 665 insertions(+), 388 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 4a427533bd8..71debc8b47f 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,6 +258,25 @@ typedef struct pgssSharedState
 	pgssGlobalStats stats;		/* global statistics for pgss */
 } pgssSharedState;
 
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc = {
+	.name = "pg_stat_statements",
+	.size = sizeof(pgssSharedState),
+	.init_fn = pgss_shmem_init,
+};
+
+static ShmemHashDesc pgssSharedHashDesc = {
+	.name = "pg_stat_statements hash",
+	.init_size = 0,		/* set from 'pgss_max' */
+	.max_size = 0,		/* set from 'pgss_max' */
+};
+
+/* Links to shared memory state */
+#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
+#define pgss_hash (pgssSharedHashDesc.ptr)
+
+
 /*---- Local variables ----*/
 
 /* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
@@ -274,10 +293,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
 static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
 static ProcessUtility_hook_type prev_ProcessUtility = NULL;
 
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
 /*---- GUC variables ----*/
 
 typedef enum
@@ -365,7 +380,6 @@ static void pgss_store(const char *query, int64 queryId,
 static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
 										pgssVersion api_version,
 										bool showtext);
-static Size pgss_memsize(void);
 static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
 							  int encoding, bool sticky);
 static void entry_dealloc(void);
@@ -500,11 +514,39 @@ _PG_init(void)
 static void
 pgss_shmem_request(void)
 {
+	HASHCTL		info;
+
 	if (prev_shmem_request_hook)
 		prev_shmem_request_hook();
 
-	RequestAddinShmemSpace(pgss_memsize());
 	RequestNamedLWLockTranche("pg_stat_statements", 1);
+
+	/*
+	 * Register our shared memory state, including hash table
+	 */
+	ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+	info.keysize = sizeof(pgssHashKey);
+	info.entrysize = sizeof(pgssEntry);
+	pgssSharedHashDesc.init_size = pgss_max;
+	pgssSharedHashDesc.max_size = pgss_max;
+	ShmemRegisterHash(&pgssSharedHashDesc,
+					  &info,
+					  HASH_ELEM | HASH_BLOBS);
+}
+
+static void
+pgss_shmem_init(void *arg)
+{
+	pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
+	pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+	pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+	SpinLockInit(&pgss->mutex);
+	pgss->extent = 0;
+	pgss->n_writers = 0;
+	pgss->gc_count = 0;
+	pgss->stats.dealloc = 0;
+	pgss->stats.stats_reset = GetCurrentTimestamp();
 }
 
 /*
@@ -516,8 +558,6 @@ pgss_shmem_request(void)
 static void
 pgss_shmem_startup(void)
 {
-	bool		found;
-	HASHCTL		info;
 	FILE	   *file = NULL;
 	FILE	   *qfile = NULL;
 	uint32		header;
@@ -530,42 +570,6 @@ pgss_shmem_startup(void)
 	if (prev_shmem_startup_hook)
 		prev_shmem_startup_hook();
 
-	/* reset in case this is a restart within the postmaster */
-	pgss = NULL;
-	pgss_hash = NULL;
-
-	/*
-	 * Create or attach to the shared memory state, including hash table
-	 */
-	LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
-	pgss = ShmemInitStruct("pg_stat_statements",
-						   sizeof(pgssSharedState),
-						   &found);
-
-	if (!found)
-	{
-		/* First time through ... */
-		pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
-		pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
-		pgss->mean_query_len = ASSUMED_LENGTH_INIT;
-		SpinLockInit(&pgss->mutex);
-		pgss->extent = 0;
-		pgss->n_writers = 0;
-		pgss->gc_count = 0;
-		pgss->stats.dealloc = 0;
-		pgss->stats.stats_reset = GetCurrentTimestamp();
-	}
-
-	info.keysize = sizeof(pgssHashKey);
-	info.entrysize = sizeof(pgssEntry);
-	pgss_hash = ShmemInitHash("pg_stat_statements hash",
-							  pgss_max, pgss_max,
-							  &info,
-							  HASH_ELEM | HASH_BLOBS);
-
-	LWLockRelease(AddinShmemInitLock);
-
 	/*
 	 * If we're in the postmaster (or a standalone backend...), set up a shmem
 	 * exit hook to dump the statistics to disk.
@@ -573,12 +577,6 @@ pgss_shmem_startup(void)
 	if (!IsUnderPostmaster)
 		on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
 
-	/*
-	 * Done if some other process already completed our initialization.
-	 */
-	if (found)
-		return;
-
 	/*
 	 * Note: we don't bother with locks here, because there should be no other
 	 * processes running when this code is reached.
@@ -2082,20 +2080,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
 
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
-	Size		size;
-
-	size = MAXALIGN(sizeof(pgssSharedState));
-	size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
-	return size;
-}
-
 /*
  * Allocate a new hashtable entry.
  * caller must hold an exclusive lock on pgss->lock
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..11ad90e7372 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,27 @@
 /* Number of OIDs to prefetch (preallocate) per XLOG write */
 #define VAR_OID_PREFETCH		8192
 
-/* pointer to variables struct in shared memory */
-TransamVariablesData *TransamVariables = NULL;
+static void VarsupShmemInit(void *arg);
 
+ShmemStructDesc TransamVariablesShmemDesc = {
+	.name = "TransamVariables",
+	.size = sizeof(TransamVariablesData),
+	.init_fn = VarsupShmemInit,
+};
 
 /*
  * Initialization of shared memory for TransamVariables.
  */
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
 {
-	return sizeof(TransamVariablesData);
+	ShmemRegisterStruct(&TransamVariablesShmemDesc);
 }
 
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemInit(void *arg)
 {
-	bool		found;
-
-	/* Initialize our shared state struct */
-	TransamVariables = ShmemInitStruct("TransamVariables",
-									   sizeof(TransamVariablesData),
-									   &found);
-	if (!IsUnderPostmaster)
-	{
-		Assert(!found);
-		memset(TransamVariables, 0, sizeof(TransamVariablesData));
-	}
-	else
-		Assert(found);
+	memset(TransamVariables, 0, sizeof(TransamVariablesData));
 }
 
 /*
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 7d32cd0e159..0ded7018e86 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -337,6 +337,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
 
 	InitializeFastPathLocks();
 
+	RegisterShmemStructs();
+
 	CreateSharedMemoryAndSemaphores();
 
 	/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 926fd6f2700..8f638118cdf 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
 #include "replication/walreceiver.h"
 #include "storage/dsm.h"
 #include "storage/io_worker.h"
+#include "storage/ipc.h"
 #include "storage/pg_shmem.h"
 #include "tcop/backend_startup.h"
 #include "utils/memutils.h"
@@ -104,12 +105,10 @@ typedef struct
 	char	  **LWLockTrancheNames;
 	int		   *LWLockCounter;
 	LWLockPadded *MainLWLockArray;
-	slock_t    *ProcStructLock;
 	PROC_HDR   *ProcGlobal;
 	PGPROC	   *AuxiliaryProcs;
 	PGPROC	   *PreparedXactProcs;
 	volatile PMSignalData *PMSignalState;
-	ProcSignalHeader *ProcSignal;
 	pid_t		PostmasterPid;
 	TimestampTz PgStartTime;
 	TimestampTz PgReloadTime;
@@ -678,8 +677,12 @@ SubPostmasterMain(int argc, char *argv[])
 
 	/* Restore basic shared memory pointers */
 	if (UsedShmemSegAddr != NULL)
+	{
 		InitShmemAllocator(UsedShmemSegAddr);
 
+		RegisterShmemStructs();
+	}
+
 	/*
 	 * Run the appropriate Main function
 	 */
@@ -735,12 +738,10 @@ save_backend_variables(BackendParameters *param,
 	param->LWLockTrancheNames = LWLockTrancheNames;
 	param->LWLockCounter = LWLockCounter;
 	param->MainLWLockArray = MainLWLockArray;
-	param->ProcStructLock = ProcStructLock;
 	param->ProcGlobal = ProcGlobal;
 	param->AuxiliaryProcs = AuxiliaryProcs;
 	param->PreparedXactProcs = PreparedXactProcs;
 	param->PMSignalState = PMSignalState;
-	param->ProcSignal = ProcSignal;
 
 	param->PostmasterPid = PostmasterPid;
 	param->PgStartTime = PgStartTime;
@@ -995,12 +996,10 @@ restore_backend_variables(BackendParameters *param)
 	LWLockTrancheNames = param->LWLockTrancheNames;
 	LWLockCounter = param->LWLockCounter;
 	MainLWLockArray = param->MainLWLockArray;
-	ProcStructLock = param->ProcStructLock;
 	ProcGlobal = param->ProcGlobal;
 	AuxiliaryProcs = param->AuxiliaryProcs;
 	PreparedXactProcs = param->PreparedXactProcs;
 	PMSignalState = param->PMSignalState;
-	ProcSignal = param->ProcSignal;
 
 	PostmasterPid = param->PostmasterPid;
 	PgStartTime = param->PgStartTime;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d6133bfebc6..f6d3369f917 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -968,6 +968,8 @@ PostmasterMain(int argc, char *argv[])
 	 * shared memory, determine the value of any runtime-computed GUCs that
 	 * depend on the amount of shared memory required.
 	 */
+	RegisterShmemStructs();
+
 	InitializeShmemGUCs();
 
 	/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..55f46c7687e 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -108,7 +108,15 @@ static inline bool is_main_region_dsm_handle(dsm_handle handle);
 static bool dsm_init_done = false;
 
 /* Preallocated DSM space in the main shared memory region. */
-static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_init(void *);
+
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+	.name = "Preallocated DSM",
+	.size = 0, /* dynamic */
+	.init_fn = dsm_main_space_init,
+};
+
+#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
 
 /*
  * List of dynamic shared memory segments used by this backend.
@@ -479,27 +487,29 @@ void
 dsm_shmem_init(void)
 {
 	size_t		size = dsm_estimate_size();
-	bool		found;
 
 	if (size == 0)
 		return;
 
-	dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
-	if (!found)
-	{
-		FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
-		size_t		first_page = 0;
-		size_t		pages;
-
-		/* Reserve space for the FreePageManager. */
-		while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
-			++first_page;
-
-		/* Initialize it and give it all the rest of the space. */
-		FreePageManagerInitialize(fpm, dsm_main_space_begin);
-		pages = (size / FPM_PAGE_SIZE) - first_page;
-		FreePageManagerPut(fpm, first_page, pages);
-	}
+	ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+	size_t		size = dsm_main_space_shmem_desc.size;
+	FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+	size_t		first_page = 0;
+	size_t		pages;
+
+	/* Reserve space for the FreePageManager. */
+	while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+		++first_page;
+
+	/* Initialize it and give it all the rest of the space. */
+	FreePageManagerInitialize(fpm, dsm_main_space_begin);
+	pages = (size / FPM_PAGE_SIZE) - first_page;
+	FreePageManagerPut(fpm, first_page, pages);
 }
 
 /*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 068c1577b12..882af83b7b2 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -54,7 +54,15 @@ typedef struct DSMRegistryCtxStruct
 	dshash_table_handle dshh;
 } DSMRegistryCtxStruct;
 
-static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryCtxShmemInit(void *arg);
+
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+	.name = "DSM Registry Data",
+	.size = sizeof(DSMRegistryCtxStruct),
+	.init_fn = DSMRegistryCtxShmemInit,
+};
+
+#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
 
 typedef struct NamedDSMState
 {
@@ -113,27 +121,17 @@ static const dshash_parameters dsh_params = {
 static dsa_area *dsm_registry_dsa;
 static dshash_table *dsm_registry_table;
 
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
 {
-	return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+	ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
 }
 
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
 {
-	bool		found;
-
-	DSMRegistryCtx = (DSMRegistryCtxStruct *)
-		ShmemInitStruct("DSM Registry Data",
-						DSMRegistryShmemSize(),
-						&found);
-
-	if (!found)
-	{
-		DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
-		DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
-	}
+	DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+	DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
 }
 
 /*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 1f7e933d500..952988645d0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,13 +101,14 @@ CalculateShmemSize(void)
 	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
 											 sizeof(ShmemIndexEnt)));
 	size = add_size(size, dsm_estimate_size());
-	size = add_size(size, DSMRegistryShmemSize());
+
+	size = add_size(size, ShmemRegisteredSize());
+
+	/* legacy subsystems */
 	size = add_size(size, BufferManagerShmemSize());
 	size = add_size(size, LockManagerShmemSize());
 	size = add_size(size, PredicateLockShmemSize());
-	size = add_size(size, ProcGlobalShmemSize());
 	size = add_size(size, XLogPrefetchShmemSize());
-	size = add_size(size, VarsupShmemSize());
 	size = add_size(size, XLOGShmemSize());
 	size = add_size(size, XLogRecoveryShmemSize());
 	size = add_size(size, CLOGShmemSize());
@@ -117,11 +118,7 @@ CalculateShmemSize(void)
 	size = add_size(size, BackgroundWorkerShmemSize());
 	size = add_size(size, MultiXactShmemSize());
 	size = add_size(size, LWLockShmemSize());
-	size = add_size(size, ProcArrayShmemSize());
 	size = add_size(size, BackendStatusShmemSize());
-	size = add_size(size, SharedInvalShmemSize());
-	size = add_size(size, PMSignalShmemSize());
-	size = add_size(size, ProcSignalShmemSize());
 	size = add_size(size, CheckpointerShmemSize());
 	size = add_size(size, AutoVacuumShmemSize());
 	size = add_size(size, ReplicationSlotsShmemSize());
@@ -217,6 +214,10 @@ CreateSharedMemoryAndSemaphores(void)
 	 */
 	InitShmemAllocator(seghdr);
 
+	/* Reserve space for semaphores. */
+	if (!IsUnderPostmaster)
+		PGReserveSemaphores(ProcGlobalSemas());
+
 	/* Initialize subsystems */
 	CreateOrAttachShmemStructs();
 
@@ -230,6 +231,19 @@ CreateSharedMemoryAndSemaphores(void)
 		shmem_startup_hook();
 }
 
+void
+RegisterShmemStructs(void)
+{
+	DSMRegistryShmemRegister();
+
+	ProcGlobalShmemRegister();
+	VarsupShmemRegister();
+	ProcArrayShmemRegister();
+	SharedInvalShmemRegister();
+	PMSignalShmemRegister();
+	ProcSignalShmemRegister();
+}
+
 /*
  * Initialize various subsystems, setting up their data structures in
  * shared memory.
@@ -259,14 +273,23 @@ CreateOrAttachShmemStructs(void)
 	 */
 	InitShmemIndex();
 
+#ifdef EXEC_BACKEND
+	if (IsUnderPostmaster)
+		ShmemAttachRegistered();
+	else
+#endif
+	{
+		ShmemInitRegistered();
+	}
+
 	dsm_shmem_init();
-	DSMRegistryShmemInit();
+	//DSMRegistryShmemInit();
 
 	/*
 	 * Set up xlog, clog, and buffers
 	 */
-	VarsupShmemInit();
 	XLOGShmemInit();
+
 	XLogPrefetchShmemInit();
 	XLogRecoveryShmemInit();
 	CLOGShmemInit();
@@ -288,23 +311,13 @@ CreateOrAttachShmemStructs(void)
 	/*
 	 * Set up process table
 	 */
-	if (!IsUnderPostmaster)
-		InitProcGlobal();
-	ProcArrayShmemInit();
 	BackendStatusShmemInit();
 	TwoPhaseShmemInit();
 	BackgroundWorkerShmemInit();
 
-	/*
-	 * Set up shared-inval messaging
-	 */
-	SharedInvalShmemInit();
-
 	/*
 	 * Set up interprocess signaling mechanisms
 	 */
-	PMSignalShmemInit();
-	ProcSignalShmemInit();
 	CheckpointerShmemInit();
 	AutoVacuumShmemInit();
 	ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..23752500d16 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -80,9 +80,24 @@ struct PMSignalData
 	sig_atomic_t PMChildFlags[FLEXIBLE_ARRAY_MEMBER];
 };
 
-/* PMSignalState pointer is valid in both postmaster and child processes */
+static void PMSignalShmemInit(void *);
+
+static ShmemStructDesc PMSignalShmemDesc = {
+	.name = "PMSignalState",
+	.size = 0, /* dynamic */
+	.init_fn = PMSignalShmemInit,
+};
+
+/*
+ * PMSignalState pointer is valid in both postmaster and child processes
+ *
+ * This is a stand-alone variable rather than just a #define over
+ * PMSignalShmemDesc.ptr because it is needed early at backend startup and
+ * passed as a backend parameter in EXEC_BACKEND mode
+ */
 NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
 
+
 /*
  * Local copy of PMSignalState->num_child_flags, only valid in the
  * postmaster.  Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +138,28 @@ postmaster_death_handler(SIGNAL_ARGS)
 static void MarkPostmasterChildInactive(int code, Datum arg);
 
 /*
- * PMSignalShmemSize
- *		Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register our shared memory
  */
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
 {
 	Size		size;
 
 	size = offsetof(PMSignalData, PMChildFlags);
 	size = add_size(size, mul_size(MaxLivePostmasterChildren(),
 								   sizeof(sig_atomic_t)));
-
-	return size;
+	PMSignalShmemDesc.size = size;
+	ShmemRegisterStruct(&PMSignalShmemDesc);
 }
 
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
 {
-	bool		found;
-
-	PMSignalState = (PMSignalData *)
-		ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
-	if (!found)
-	{
-		/* initialize all flags to zeroes */
-		MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
-		num_child_flags = MaxLivePostmasterChildren();
-		PMSignalState->num_child_flags = num_child_flags;
-	}
+	/* initialize all flags to zeroes */
+	PMSignalState = PMSignalShmemDesc.ptr;
+	MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+	num_child_flags = MaxLivePostmasterChildren();
+	PMSignalState->num_child_flags = num_child_flags;
 }
 
 /*
@@ -291,6 +295,7 @@ RegisterPostmasterChildActive(void)
 {
 	int			slot = MyPMChildSlot;
 
+	Assert(PMSignalState);
 	Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
 	slot--;
 	Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 301f54fb5a8..08c63bcb2a7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -101,6 +101,18 @@ typedef struct ProcArrayStruct
 	int			pgprocnos[FLEXIBLE_ARRAY_MEMBER];
 } ProcArrayStruct;
 
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+	.name = "Proc Array",
+	.size = 0, /* dynamic */
+	.init_fn = ProcArrayShmemInit,
+	.attach_fn = ProcArrayShmemAttach,
+};
+
+#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
+
 /*
  * State for the GlobalVisTest* family of functions. Those functions can
  * e.g. be used to decide if a deleted row can be removed without violating
@@ -267,9 +279,6 @@ typedef enum KAXCompressReason
 	KAX_STARTUP_PROCESS_IDLE,	/* startup process is about to sleep */
 } KAXCompressReason;
 
-
-static ProcArrayStruct *procArray;
-
 static PGPROC *allProcs;
 
 /*
@@ -280,8 +289,23 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
 /*
  * Bookkeeping for tracking emulated transactions in recovery
  */
-static TransactionId *KnownAssignedXids;
-static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+	.name = "KnownAssignedXids",
+	.size = 0, /* dynamic */
+	.init_fn = NULL,
+};
+
+#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+	.name = "KnownAssignedXidsValid",
+	.size = 0, /* dynamic */
+	.init_fn = NULL,
+};
+
+#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
+
 static TransactionId latestObservedXid = InvalidTransactionId;
 
 /*
@@ -372,18 +396,19 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
 static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
 
 /*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
  */
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
 {
-	Size		size;
-
-	/* Size of the ProcArray structure itself */
 #define PROCARRAY_MAXPROCS	(MaxBackends + max_prepared_xacts)
 
-	size = offsetof(ProcArrayStruct, pgprocnos);
-	size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+	/* Register the ProcArray shared structure */
+	ProcArrayShmemDesc.size =
+		add_size(offsetof(ProcArrayStruct, pgprocnos),
+				 mul_size(sizeof(int),
+						  PROCARRAY_MAXPROCS));
+	ShmemRegisterStruct(&ProcArrayShmemDesc);
 
 	/*
 	 * During Hot Standby processing we have a data structure called
@@ -403,64 +428,38 @@ ProcArrayShmemSize(void)
 
 	if (EnableHotStandby)
 	{
-		size = add_size(size,
-						mul_size(sizeof(TransactionId),
-								 TOTAL_MAX_CACHED_SUBXIDS));
-		size = add_size(size,
-						mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+		KnownAssignedXidsShmemDesc.size =
+			mul_size(sizeof(TransactionId),
+					 TOTAL_MAX_CACHED_SUBXIDS);
+		ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+		KnownAssignedXidsValidShmemDesc.size =
+			mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+		ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
 	}
-
-	return size;
 }
 
-/*
- * Initialize the shared PGPROC array during postmaster startup.
- */
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
 {
-	bool		found;
-
-	/* Create or attach to the ProcArray shared structure */
-	procArray = (ProcArrayStruct *)
-		ShmemInitStruct("Proc Array",
-						add_size(offsetof(ProcArrayStruct, pgprocnos),
-								 mul_size(sizeof(int),
-										  PROCARRAY_MAXPROCS)),
-						&found);
-
-	if (!found)
-	{
-		/*
-		 * We're the first - initialize.
-		 */
-		procArray->numProcs = 0;
-		procArray->maxProcs = PROCARRAY_MAXPROCS;
-		procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
-		procArray->numKnownAssignedXids = 0;
-		procArray->tailKnownAssignedXids = 0;
-		procArray->headKnownAssignedXids = 0;
-		procArray->lastOverflowedXid = InvalidTransactionId;
-		procArray->replication_slot_xmin = InvalidTransactionId;
-		procArray->replication_slot_catalog_xmin = InvalidTransactionId;
-		TransamVariables->xactCompletionCount = 1;
-	}
+	procArray->numProcs = 0;
+	procArray->maxProcs = PROCARRAY_MAXPROCS;
+	procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+	procArray->numKnownAssignedXids = 0;
+	procArray->tailKnownAssignedXids = 0;
+	procArray->headKnownAssignedXids = 0;
+	procArray->lastOverflowedXid = InvalidTransactionId;
+	procArray->replication_slot_xmin = InvalidTransactionId;
+	procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+	TransamVariables->xactCompletionCount = 1;
 
 	allProcs = ProcGlobal->allProcs;
+}
 
-	/* Create or attach to the KnownAssignedXids arrays too, if needed */
-	if (EnableHotStandby)
-	{
-		KnownAssignedXids = (TransactionId *)
-			ShmemInitStruct("KnownAssignedXids",
-							mul_size(sizeof(TransactionId),
-									 TOTAL_MAX_CACHED_SUBXIDS),
-							&found);
-		KnownAssignedXidsValid = (bool *)
-			ShmemInitStruct("KnownAssignedXidsValid",
-							mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
-							&found);
-	}
+static void
+ProcArrayShmemAttach(void *arg)
+{
+	allProcs = ProcGlobal->allProcs;
 }
 
 /*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 8e56922dcea..5743f088324 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -102,7 +102,16 @@ struct ProcSignalHeader
 #define BARRIER_CLEAR_BIT(flags, type) \
 	((flags) &= ~(((uint32) 1) << (uint32) (type)))
 
-NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+static void ProcSignalShmemInit(void *arg);
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+	.name = "ProcSignal",
+	.size = 0, /* dynamic */
+	.init_fn = ProcSignalShmemInit,
+};
+
+#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
+
 static ProcSignalSlot *MyProcSignalSlot = NULL;
 
 static bool CheckProcSignal(ProcSignalReason reason);
@@ -110,51 +119,37 @@ static void CleanupProcSignalState(int status, Datum arg);
 static void ResetProcSignalBarrierBits(uint32 flags);
 
 /*
- * ProcSignalShmemSize
- *		Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ *		Register ProcSignal's shared memory needs at postmaster startup
  */
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
 {
 	Size		size;
 
 	size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
 	size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
-	return size;
+
+	ProcSignalShmemDesc.size = size;
+	ShmemRegisterStruct(&ProcSignalShmemDesc);
 }
 
-/*
- * ProcSignalShmemInit
- *		Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
 {
-	Size		size = ProcSignalShmemSize();
-	bool		found;
+	pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
 
-	ProcSignal = (ProcSignalHeader *)
-		ShmemInitStruct("ProcSignal", size, &found);
-
-	/* If we're first, initialize. */
-	if (!found)
+	for (int i = 0; i < NumProcSignalSlots; ++i)
 	{
-		int			i;
-
-		pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+		ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
 
-		for (i = 0; i < NumProcSignalSlots; ++i)
-		{
-			ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
-			SpinLockInit(&slot->pss_mutex);
-			pg_atomic_init_u32(&slot->pss_pid, 0);
-			slot->pss_cancel_key_len = 0;
-			MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
-			pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
-			pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
-			ConditionVariableInit(&slot->pss_barrierCV);
-		}
+		SpinLockInit(&slot->pss_mutex);
+		pg_atomic_init_u32(&slot->pss_pid, 0);
+		slot->pss_cancel_key_len = 0;
+		MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+		pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+		pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+		ConditionVariableInit(&slot->pss_barrierCV);
 	}
 }
 
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 9f362ce8641..faa0fcbd21e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,6 +19,8 @@
  * methods).  The routines in this file are used for allocating and
  * binding to shared memory data structures.
  *
+ * FIXME: NOTES below are outdated
+ *
  * NOTES:
  *		(a) There are three kinds of shared memory data structures
  *	available to POSTGRES: fixed-size structures, queues and hash
@@ -76,6 +78,16 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 
+/* size constants for the shmem index table */
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE		 (48)
+ /* estimated size of the shmem index table (not a hard limit) */
+#define SHMEM_INDEX_SIZE		 (64)
+
+/* these are in postmaster private memory */
+static ShmemStructDesc *registry[SHMEM_INDEX_SIZE];
+static int num_registrations = 0;
+
 /*
  * This is the first data structure stored in the shared memory segment, at
  * the offset that PGShmemHeader->content_offset points to.  Allocations by
@@ -95,6 +107,9 @@ typedef struct ShmemAllocatorData
 
 static void *ShmemAllocRaw(Size size, Size *allocated_size);
 
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
 /* shared memory global variables */
 
 static PGShmemHeader *ShmemSegHdr;	/* shared mem segment header */
@@ -103,13 +118,137 @@ static void *ShmemEnd;			/* end+1 address of shared memory */
 
 static ShmemAllocatorData *ShmemAllocator;
 slock_t    *ShmemLock;			/* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+
+static ShmemHashDesc ShmemIndexHashDesc = {
+	.name = "ShmemIndex",
+	.init_size = SHMEM_INDEX_SIZE,
+	.max_size = SHMEM_INDEX_SIZE,
+};
+
+ /* primary index hashtable for shmem */
+#define ShmemIndex (ShmemIndexHashDesc.ptr)
+
 
 /* To get reliable results for NUMA inquiry we need to "touch pages" once */
 static bool firstNumaTouch = true;
 
 Datum		pg_numa_available(PG_FUNCTION_ARGS);
 
+
+void
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+	elog(DEBUG2, "REGISTER: %s with size %zu", desc->name, desc->size);
+
+	registry[num_registrations++] = desc;
+}
+
+size_t
+ShmemRegisteredSize(void)
+{
+	size_t		size;
+
+	size = 0;
+	for (int i = 0; i < num_registrations; i++)
+	{
+		size = add_size(size, registry[i]->size);
+		size = add_size(size, registry[i]->extra_size);
+	}
+
+	elog(DEBUG2, "SIZE: total %zu", size);
+
+	return size;
+}
+
+void
+ShmemInitRegistered(void)
+{
+	/* Should be called only by the postmaster or a standalone backend. */
+	Assert(!IsUnderPostmaster);
+
+	for (int i = 0; i < num_registrations; i++)
+	{
+		size_t		allocated_size;
+		void	   *structPtr;
+		bool		found;
+		ShmemIndexEnt *result;
+
+		elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+		/* look it up in the shmem index */
+		result = (ShmemIndexEnt *)
+			hash_search(ShmemIndex, registry[i]->name, HASH_ENTER_NULL, &found);
+		if (!result)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+							registry[i]->name)));
+		}
+		if (found)
+			elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
+
+		/* allocate and initialize it */
+		structPtr = ShmemAllocRaw(registry[i]->size, &allocated_size);
+		if (structPtr == NULL)
+		{
+			/* out of memory; remove the failed ShmemIndex entry */
+			hash_search(ShmemIndex, registry[i]->name, HASH_REMOVE, NULL);
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("not enough shared memory for data structure"
+							" \"%s\" (%zu bytes requested)",
+							registry[i]->name, registry[i]->size)));
+		}
+		result->size = registry[i]->size;
+		result->allocated_size = allocated_size;
+		result->location = structPtr;
+
+		registry[i]->ptr = structPtr;
+		if (registry[i]->init_fn)
+			registry[i]->init_fn(registry[i]->init_fn_arg);
+	}
+}
+
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+	/* Must be initializing a (non-standalone) backend */
+	Assert(IsUnderPostmaster);
+	Assert(ShmemAllocator->index != NULL);
+
+	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+	for (int i = 0; i < num_registrations; i++)
+	{
+		bool		found;
+		ShmemIndexEnt *result;
+
+		elog(LOG, "ATTACH [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+		/* look it up in the shmem index */
+		result = (ShmemIndexEnt *)
+			hash_search(ShmemIndex, registry[i]->name, HASH_FIND, &found);
+		if (!found)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+							registry[i]->name)));
+		}
+
+		registry[i]->ptr = result->location;
+
+		if (registry[i]->attach_fn)
+			registry[i]->attach_fn(registry[i]->attach_fn_arg);
+	}
+
+	LWLockRelease(ShmemIndexLock);
+}
+#endif
+
 /*
  *	InitShmemAllocator() --- set up basic pointers to shared memory.
  *
@@ -292,6 +431,98 @@ InitShmemIndex(void)
 							   HASH_ELEM | HASH_STRINGS);
 }
 
+/*
+ * ShmemRegisterHash -- Register a shared memory hash table, to be
+ *		created or attached to later.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once.  (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries.  This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate.  For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()).  Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases.  Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+void
+ShmemRegisterHash(ShmemHashDesc *desc,		/* configuration */
+				  HASHCTL *infoP,	/* info about key and bucket size */
+				  int hash_flags)	/* info about infoP */
+{
+	/*
+	 * Hash tables allocated in shared memory have a fixed directory; it can't
+	 * grow or other backends wouldn't be able to find it. So, make sure we
+	 * make it big enough to start with.
+	 *
+	 * The shared memory allocator must be specified too.
+	 */
+	infoP->dsize = infoP->max_dsize = hash_select_dirsize(desc->max_size);
+	infoP->alloc = ShmemAllocNoError;
+	hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+	/* fill in the base descriptor */
+	memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+	desc->base_desc.name = desc->name;
+	desc->base_desc.size = hash_get_shared_size(infoP, hash_flags);
+	desc->base_desc.init_fn = shmem_hash_init;
+	desc->base_desc.init_fn_arg = desc;
+	desc->base_desc.attach_fn = shmem_hash_attach;
+	desc->base_desc.attach_fn_arg = desc;
+
+	desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
+
+	desc->hash_flags = hash_flags;
+	desc->infoP = MemoryContextAlloc(TopMemoryContext, sizeof(HASHCTL));
+	memcpy(desc->infoP, infoP, sizeof(HASHCTL));
+
+	ShmemRegisterStruct(&desc->base_desc);
+}
+
+static void
+shmem_hash_init(void *arg)
+{
+	ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+	int			hash_flags = desc->hash_flags;
+
+	/* Pass location of hashtable header to hash_create */
+	desc->ptr = desc->base_desc.ptr;
+	desc->infoP->hctl = (HASHHDR *) desc->ptr;
+
+	desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
+static void
+shmem_hash_attach(void *arg)
+{
+	ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+	int			hash_flags = desc->hash_flags;
+
+	/*
+	 * if it already exists, attach to it rather than allocate and initialize
+	 * new space
+	 */
+	hash_flags |= HASH_ATTACH;
+
+	/* Pass location of hashtable header to hash_create */
+	desc->infoP->hctl = (HASHHDR *) desc->base_desc.ptr;
+
+	desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
 /*
  * ShmemInitHash -- Create and initialize, or attach to, a
  *		shared memory hash table.
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..0fe0f256971 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,7 +203,16 @@ typedef struct SISeg
  */
 #define NumProcStateSlots	(MaxBackends + NUM_AUXILIARY_PROCS)
 
-static SISeg *shmInvalBuffer;	/* pointer to the shared inval buffer */
+static void SharedInvalShmemInit(void *arg);
+
+static ShmemStructDesc SharedInvalShmemDesc = {
+	.name = "shmInvalBuffer",
+	.size = 0,	/* dynamic */
+	.init_fn = SharedInvalShmemInit,
+};
+
+/* pointer to the shared inval buffer */
+#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
 
 
 static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
 
 
 /*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ *		Register shared memory needs for the SI message buffer
  */
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
 {
 	Size		size;
 
@@ -223,26 +233,17 @@ SharedInvalShmemSize(void)
 	size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots));	/* procState */
 	size = add_size(size, mul_size(sizeof(int), NumProcStateSlots));	/* pgprocnos */
 
-	return size;
+	/* Register the space needed in shared memory */
+	SharedInvalShmemDesc.size = size;
+	ShmemRegisterStruct(&SharedInvalShmemDesc);
 }
 
-/*
- * SharedInvalShmemInit
- *		Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
 {
 	int			i;
-	bool		found;
-
-	/* Allocate space in shared memory */
-	shmInvalBuffer = (SISeg *)
-		ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
-	if (found)
-		return;
 
-	/* Clear message counters, save size of procState array, init spinlock */
+	/* Clear message counters, save size of procState array FIXME, init spinlock */
 	shmInvalBuffer->minMsgNum = 0;
 	shmInvalBuffer->maxMsgNum = 0;
 	shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index c7a001b3b79..85375b5195e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -73,13 +73,33 @@ PGPROC	   *MyProc = NULL;
  * relatively infrequently (only at backend startup or shutdown) and not for
  * very long, so a spinlock is okay.
  */
-NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
+#define ProcStructLock (&ProcGlobal->freeProcsLock)
+
+static void ProcGlobalShmemInit(void *arg);
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+	.name = "Proc Header",
+	.size = sizeof(PROC_HDR),
+	.init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+	.name = "PGPROC structures",
+	.size = 0, /* dynamic */
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+	.name = "Fast-Path Lock Array",
+	.size = 0, /* dynamic */
+};
 
 /* Pointers to shared-memory structures */
 PROC_HDR   *ProcGlobal = NULL;
 NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
 PGPROC	   *PreparedXactProcs = NULL;
 
+static uint32 TotalProcs;
+
 static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
 
 /* Is a deadlock check pending? */
@@ -91,24 +111,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
 static void CheckDeadLock(void);
 
 
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
-	Size		size = 0;
-	Size		TotalProcs =
-		add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
-	size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
-	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
-	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
-	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
-	return size;
-}
-
 /*
  * Report shared-memory space needed by Fast-Path locks.
  */
@@ -116,8 +118,6 @@ static Size
 FastPathLockShmemSize(void)
 {
 	Size		size = 0;
-	Size		TotalProcs =
-		add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
 	Size		fpLockBitsSize,
 				fpRelIdSize;
 
@@ -133,25 +133,6 @@ FastPathLockShmemSize(void)
 	return size;
 }
 
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
-	Size		size = 0;
-
-	/* ProcGlobal */
-	size = add_size(size, sizeof(PROC_HDR));
-	size = add_size(size, sizeof(slock_t));
-
-	size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
-	size = add_size(size, PGProcShmemSize());
-	size = add_size(size, FastPathLockShmemSize());
-
-	return size;
-}
-
 /*
  * Report number of semaphores needed by InitProcGlobal.
  */
@@ -186,35 +167,63 @@ ProcGlobalSemas(void)
  *	  implementation typically requires us to create semaphores in the
  *	  postmaster, not in backends.
  *
- * Note: this is NOT called by individual backends under a postmaster,
+ * Note: this is NOT called by individual backends under a postmaster, XXX
  * not even in the EXEC_BACKEND case.  The ProcGlobal and AuxiliaryProcs
  * pointers must be propagated specially for EXEC_BACKEND operation.
  */
 void
-InitProcGlobal(void)
+ProcGlobalShmemRegister(void)
+{
+	Size		size = 0;
+
+	/*
+	 * Reserve all the PGPROC structures we'll need.  There are
+	 * six separate consumers: (1) normal backends, (2) autovacuum workers and
+	 * special workers, (3) background workers, (4) walsenders, (5) auxiliary
+	 * processes, and (6) prepared transactions.  (For largely-historical
+	 * reasons, we combine autovacuum and special workers into one category
+	 * with a single freelist.)  Each PGPROC structure is dedicated to exactly
+	 * one of these purposes, and they do not move between groups.
+	 */
+	TotalProcs =
+		add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+	size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+
+	/* FIXME: the sizeofs look dangerous because ProcGlobal is not initialized yet */
+	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+	size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+	ProcGlobalAllProcsShmemDesc.size = size;
+	ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+	FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+	ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+	/*
+	 * Create the ProcGlobal shared structure last. Its init callback
+	 * initializes the others too.
+	 */
+	ShmemRegisterStruct(&ProcGlobalShmemDesc);
+}
+
+static void
+ProcGlobalShmemInit(void *arg)
 {
+	char	   *ptr;
+	size_t		requestSize;
 	PGPROC	   *procs;
 	int			i,
 				j;
-	bool		found;
-	uint32		TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
-
 	/* Used for setup of per-backend fast-path slots. */
 	char	   *fpPtr,
 			   *fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
 	Size		fpLockBitsSize,
 				fpRelIdSize;
-	Size		requestSize;
-	char	   *ptr;
 
-	/* Create the ProcGlobal shared structure */
-	ProcGlobal = (PROC_HDR *)
-		ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
-	Assert(!found);
+	ProcGlobal = ProcGlobalShmemDesc.ptr;
 
-	/*
-	 * Initialize the data structures.
-	 */
 	ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
 	dlist_init(&ProcGlobal->freeProcs);
 	dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -225,23 +234,11 @@ InitProcGlobal(void)
 	ProcGlobal->checkpointerProc = INVALID_PROC_NUMBER;
 	pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
 	pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
+	SpinLockInit(ProcStructLock);
 
-	/*
-	 * Create and initialize all the PGPROC structures we'll need.  There are
-	 * six separate consumers: (1) normal backends, (2) autovacuum workers and
-	 * special workers, (3) background workers, (4) walsenders, (5) auxiliary
-	 * processes, and (6) prepared transactions.  (For largely-historical
-	 * reasons, we combine autovacuum and special workers into one category
-	 * with a single freelist.)  Each PGPROC structure is dedicated to exactly
-	 * one of these purposes, and they do not move between groups.
-	 */
-	requestSize = PGProcShmemSize();
-
-	ptr = ShmemInitStruct("PGPROC structures",
-						  requestSize,
-						  &found);
-
-	MemSet(ptr, 0, requestSize);
+	ptr = ProcGlobalAllProcsShmemDesc.ptr;
+	requestSize = ProcGlobalAllProcsShmemDesc.size;
+	memset(ptr, 0, requestSize);
 
 	procs = (PGPROC *) ptr;
 	ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -277,20 +274,13 @@ InitProcGlobal(void)
 	fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
 	fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
 
-	requestSize = FastPathLockShmemSize();
-
-	fpPtr = ShmemInitStruct("Fast-Path Lock Array",
-							requestSize,
-							&found);
-
-	MemSet(fpPtr, 0, requestSize);
+	fpPtr = FastPathLockArrayShmemDesc.ptr;
+	requestSize = FastPathLockArrayShmemDesc.size;
+	memset(fpPtr, 0, requestSize);
 
 	/* For asserts checking we did not overflow. */
 	fpEndPtr = fpPtr + requestSize;
 
-	/* Reserve space for semaphores. */
-	PGReserveSemaphores(ProcGlobalSemas());
-
 	for (i = 0; i < TotalProcs; i++)
 	{
 		PGPROC	   *proc = &procs[i];
@@ -380,12 +370,6 @@ InitProcGlobal(void)
 	 */
 	AuxiliaryProcs = &procs[MaxBackends];
 	PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
-
-	/* Create ProcStructLock spinlock, too */
-	ProcStructLock = (slock_t *) ShmemInitStruct("ProcStructLock spinlock",
-												 sizeof(slock_t),
-												 &found);
-	SpinLockInit(ProcStructLock);
 }
 
 /*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 02e9aaa6bca..eed188416ee 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4117,6 +4117,8 @@ PostgresSingleUserMain(int argc, char *argv[],
 	 * shared memory, determine the value of any runtime-computed GUCs that
 	 * depend on the amount of shared memory required.
 	 */
+	RegisterShmemStructs();
+
 	InitializeShmemGUCs();
 
 	/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..49d476e9d5c 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,7 +15,9 @@
 #define TRANSAM_H
 
 #include "access/xlogdefs.h"
-
+#ifndef FRONTEND
+#include "storage/shmem.h"
+#endif
 
 /* ----------------
  *		Special transaction ID values
@@ -330,7 +332,10 @@ TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2)
 extern bool TransactionStartedDuringRecovery(void);
 
 /* in transam/varsup.c */
-extern PGDLLIMPORT TransamVariablesData *TransamVariables;
+#ifndef FRONTEND
+extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
+#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+#endif
 
 /*
  * prototypes for functions in transam/transam.c
@@ -345,8 +350,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
 extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
 
 /* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
 extern FullTransactionId GetNewTransactionId(bool isSubXact);
 extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
 extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
 extern dshash_table *GetNamedDSHash(const char *name,
 									const dshash_parameters *params,
 									bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
 
 #endif							/* DSM_REGISTRY_H */
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
 /* ipci.c */
 extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
 
+extern void RegisterShmemStructs(void);
 extern Size CalculateShmemSize(void);
 extern void CreateSharedMemoryAndSemaphores(void);
 #ifdef EXEC_BACKEND
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
 /*
  * prototypes for functions in pmsignal.c
  */
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
 extern void SendPostmasterSignal(PMSignalReason reason);
 extern bool CheckPostmasterSignal(PMSignalReason reason);
 extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 679f0624f92..37023e1a93f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -418,6 +418,9 @@ typedef struct PROC_HDR
 	dlist_head	bgworkerFreeProcs;
 	/* Head of list of walsender free PGPROC structures */
 	dlist_head	walsenderFreeProcs;
+
+	slock_t		freeProcsLock;
+
 	/* First pgproc waiting for group XID clear */
 	pg_atomic_uint32 procArrayGroupFirst;
 	/* First pgproc waiting for group transaction status update */
@@ -488,7 +491,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
  * Function Prototypes
  */
 extern int	ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
 extern void InitProcGlobal(void);
 extern void InitProcess(void);
 extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 3a8593f87ba..41753c3a630 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
 #include "utils/snapshot.h"
 
 
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
 extern void ProcArrayAdd(PGPROC *proc);
 extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
 
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index e52b8eb7697..f2df1f30c5f 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -71,8 +71,7 @@ typedef enum
 /*
  * prototypes for functions in procsignal.c
  */
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
 
 extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
 extern int	SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..40e2fc17056 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,6 +24,53 @@
 #include "storage/spin.h"
 #include "utils/hsearch.h"
 
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Descriptor for a named area or struct in shared memory
+ */
+typedef struct ShmemStructDesc
+{
+	/* Name of the shared memory area. Must be unique across the system */
+	const char *name;
+
+	size_t		size;
+
+	size_t		alignment;
+	ShmemInitCallback init_fn;
+	ShmemAttachCallback attach_fn;
+	void	   *init_fn_arg;
+	void	   *attach_fn_arg;
+
+	/*
+	 * Extra space to allocate in the shared memory segment, but it's not
+	 * part of the struct itself. This is used for shared memory hash tables
+	 * that can grow beyond the initial size when more buckets are allocated.
+	 */
+	size_t		extra_size;
+
+	/* Pointer to the shared memory area, when it's allocated. */
+	void	   *ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for shared memory hash table
+ */
+typedef struct ShmemHashDesc
+{
+	const char *name;
+
+	int			hash_flags;
+
+	size_t		init_size;		/* initial number of entries */
+	size_t		max_size;		/* max number of entries */
+	HASHCTL	   *infoP;
+
+	HTAB	   *ptr;
+
+	ShmemStructDesc	base_desc;
+} ShmemHashDesc;
 
 /* shmem.c */
 extern PGDLLIMPORT slock_t *ShmemLock;
@@ -34,9 +81,19 @@ extern void *ShmemAlloc(Size size);
 extern void *ShmemAllocNoError(Size size);
 extern bool ShmemAddrIsValid(const void *addr);
 extern void InitShmemIndex(void);
+
+extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
+extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+
+/* Legacy functions */
 extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
 						   HASHCTL *infoP, int hash_flags);
 extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+extern void ShmemAttachRegistered(void);
+
 extern Size add_size(Size s1, Size s2);
 extern Size mul_size(Size s1, Size s2);
 
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
 /*
  * prototypes for functions in sinvaladt.c
  */
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
 extern void SharedInvalBackendInit(bool sendOnly);
 
 extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);

base-commit: c67bef3f3252a3a38bf347f9f119944176a796ce
-- 
2.34.1
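
The register-then-initialize flow used throughout the patch above can be sketched in isolation. This is only an illustration under stated assumptions, not the patch's implementation: every Demo* and demo_* name is hypothetical, calloc() stands in for the shared memory segment, and a fixed 16-byte alignment replaces the real allocator's alignment rules. It shows descriptors being collected first, the total size being derived from them, and each area being carved out and initialized in registration order.

```c
/* Illustrative sketch only; all Demo* and demo_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*DemoInitCallback) (void *arg);

typedef struct DemoStructDesc
{
	const char *name;			/* unique name of the area */
	size_t		size;			/* requested size in bytes */
	DemoInitCallback init_fn;	/* optional, run once after allocation */
	void	   *init_fn_arg;
	void	   *ptr;			/* set once the area has been carved out */
} DemoStructDesc;

#define DEMO_MAX_REGISTRATIONS 16
#define DEMO_ALIGN 16			/* assumed alignment for this sketch */

static DemoStructDesc *demo_registry[DEMO_MAX_REGISTRATIONS];
static int	demo_num_registrations = 0;

/* Round size up to the next multiple of DEMO_ALIGN. */
static size_t
demo_align(size_t size)
{
	return (size + DEMO_ALIGN - 1) / DEMO_ALIGN * DEMO_ALIGN;
}

/* Phase 1: remember the descriptor; nothing is allocated yet. */
static void
DemoRegisterStruct(DemoStructDesc *desc)
{
	assert(demo_num_registrations < DEMO_MAX_REGISTRATIONS);
	demo_registry[demo_num_registrations++] = desc;
}

/* Phase 2: total space needed by everything registered so far. */
static size_t
DemoRegisteredSize(void)
{
	size_t		size = 0;

	for (int i = 0; i < demo_num_registrations; i++)
		size += demo_align(demo_registry[i]->size);
	return size;
}

/*
 * Phase 3: allocate one zeroed block, hand out areas in registration order,
 * and run each init callback right after its pointer is set.  Returns the
 * base of the block (to be freed by the caller), or NULL on failure.
 */
static void *
DemoInitRegistered(void)
{
	size_t		total = DemoRegisteredSize();
	char	   *base = calloc(1, total > 0 ? total : 1);
	size_t		offset = 0;

	if (base == NULL)
		return NULL;

	for (int i = 0; i < demo_num_registrations; i++)
	{
		DemoStructDesc *desc = demo_registry[i];

		desc->ptr = base + offset;
		offset += demo_align(desc->size);
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
	return base;
}
```

Because callbacks run in registration order in this sketch, a descriptor whose init_fn touches other areas has to be registered after them, which mirrors why the patch registers "Proc Header" last.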



  [text/x-patch] 0003-WIP-resizable-shared-memory-structures-20260217.patch (40.3K, 4-0003-WIP-resizable-shared-memory-structures-20260217.patch)
From 68dfc9c94e32a639102bfc7f0694636fb82134c3 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH 3/3] WIP: resizable shared memory structures

---
 doc/src/sgml/system-views.sgml                |  10 +
 src/backend/port/sysv_shmem.c                 |  44 +++
 src/backend/port/win32_shmem.c                |  36 +++
 src/backend/storage/ipc/ipci.c                |   3 +
 src/backend/storage/ipc/shmem.c               | 208 ++++++++++++--
 src/include/catalog/pg_proc.dat               |   4 +-
 src/include/storage/pg_shmem.h                |   2 +
 src/include/storage/shmem.h                   |  13 +
 src/test/modules/Makefile                     |   1 +
 src/test/modules/meson.build                  |   1 +
 src/test/modules/resizable_shmem/Makefile     |  23 ++
 .../expected/resizable_shmem.out              | 149 ++++++++++
 src/test/modules/resizable_shmem/meson.build  |  37 +++
 .../resizable_shmem/resizable_shmem--1.0.sql  |  39 +++
 .../modules/resizable_shmem/resizable_shmem.c | 270 ++++++++++++++++++
 .../resizable_shmem/resizable_shmem.conf      |   4 +
 .../resizable_shmem/resizable_shmem.control   |   5 +
 .../specs/resizable_shmem.spec                |  42 +++
 src/test/regress/expected/rules.out           |   5 +-
 19 files changed, 874 insertions(+), 22 deletions(-)
 create mode 100644 src/test/modules/resizable_shmem/Makefile
 create mode 100644 src/test/modules/resizable_shmem/expected/resizable_shmem.out
 create mode 100644 src/test/modules/resizable_shmem/meson.build
 create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
 create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
 create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.conf
 create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
 create mode 100644 src/test/modules/resizable_shmem/specs/resizable_shmem.spec

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8b4abef8c68..881c0ffb360 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4247,6 +4247,16 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        the columns will be equal in that case also.
       </para></entry>
      </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>reserved_size</structfield> <type>int8</type>
+      </para>
+      <para>
+       Maximum size in bytes, including padding, that the allocation can grow to
+       in the case of resizable allocations.
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..32a4d5ed428 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
 	return true;
 }
 
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+	Size		os_page_size;
+
+	Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+	os_page_size = sysconf(_SC_PAGESIZE);
+
+	/* If huge pages are actually in use, use huge page size */
+	if (huge_pages_status == HUGE_PAGES_ON)
+		GetHugePageSize(&os_page_size, NULL);
+
+	return os_page_size;
+}
+
 /*
  * Creates an anonymous mmap()ed shared memory segment.
  *
@@ -991,3 +1012,26 @@ PGSharedMemoryDetach(void)
 		AnonymousShmem = NULL;
 	}
 }
+
+/*
+ * Release part of the shared memory of given size starting at given address.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void PGSharedMemoryRelease(void *addr, Size size)
+{
+	if (!AnonymousShmem)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("releasing shared memory is not supported with the current \"shared_memory_type\" setting")));
+
+	Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+	Assert(size == TYPEALIGN(GetOSPageSize(), size));
+	Assert(size > 0);
+
+	if (madvise(addr, size, MADV_REMOVE) == -1)
+		ereport(ERROR,
+					(errmsg("could not release shared memory: %m")));
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..394bb8e6282 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -621,6 +621,19 @@ pgwin32_ReserveSharedMemoryRegion(HANDLE hChild)
 	return true;
 }
 
+/*
+ * Release part of the shared memory of given size from given address.
+ *
+ * Not supported on Windows currently.
+ */
+void
+PGSharedMemoryRelease(void *addr, Size size)
+{
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg("releasing part of shared memory is not supported on Windows")));
+}
+
 /*
  * This function is provided for consistency with sysv_shmem.c and does not
  * provide any useful information for Windows.  To obtain the large page size,
@@ -648,3 +661,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
 	}
 	return true;
 }
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+	SYSTEM_INFO sysinfo;
+	Size		os_page_size;
+
+	Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+	GetSystemInfo(&sysinfo);
+	os_page_size = sysinfo.dwPageSize;
+
+	/* If huge pages are actually in use, use huge page size */
+	if (huge_pages_status == HUGE_PAGES_ON)
+		GetHugePageSize(&os_page_size, NULL);
+
+	return os_page_size;
+}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 952988645d0..410bbfaf678 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -360,6 +360,9 @@ InitializeShmemGUCs(void)
 	/*
 	 * Calculate the shared memory size and round up to the nearest megabyte.
 	 */
+	/*
+	 * TODO: we need to do a better job of separating out maximum shared memory
+	 * and minimum shared memory because of on-demand shared memory segments. */
 	size_b = CalculateShmemSize();
 	size_mb = add_size(size_b, (1024 * 1024) - 1) / (1024 * 1024);
 	sprintf(buf, "%zu", size_mb);
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index e73ac489b2b..31bd28c84fe 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -143,6 +143,9 @@ ShmemRegisterStruct(ShmemStructDesc *desc)
 	elog(DEBUG2, "REGISTER: %s with size %zd", desc->name, desc->size);
 
 	registry[num_registrations++] = desc;
+
+	if (desc->max_size > 0)
+		elog(DEBUG2, "RESIZABLE structure: %s has max_size %zu", desc->name, desc->max_size);
 }
 
 size_t
@@ -153,7 +156,7 @@ ShmemRegisteredSize(void)
 	size = 0;
 	for (int i = 0; i < num_registrations; i++)
 	{
-		size = add_size(size, registry[i]->size);
+		size = add_size(size, registry[i]->max_size > 0 ? registry[i]->max_size : registry[i]->size);
 		size = add_size(size, registry[i]->extra_size);
 	}
 
@@ -162,9 +165,27 @@ ShmemRegisteredSize(void)
 	return size;
 }
 
+/*
+ * Allocate memory for the registered shared structures and initialize them.
+ *
+ * A shared structure may be fixed-size, with max_size = 0, or resizable, with
+ * max_size > 0. A resizable structure which does not span multiple pages is
+ * treated as a fixed-size structure since the memory gets allocated or
+ * deallocated only in pages. The function allocates all the fixed-size
+ * structures first to pack them tightly. To easily deallocate memory in pages,
+ * the resizable structures are allocated on page boundaries. Unregistered
+ * objects are allocated after the registered structures and are treated as fixed
+ * structures.
+ *
+ * Because registered structures are allocated in pages instead of bytes, the
+ * actual allocated size may be more than the requested size. This may leave less
+ * space for unregistered structures than expected.
+ */
 void
 ShmemInitRegistered(void)
 {
+	Size		page_size = GetOSPageSize();
+
 	/* Should be called only by the postmaster or a standalone backend. */
 	Assert(!IsUnderPostmaster);
 
@@ -177,6 +198,27 @@ ShmemInitRegistered(void)
 
 		elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
 
+		/*
+		 * Pack fixed sized structures and resizable structures that do not
+		 * span multiple pages tightly together.
+		 */
+		if (registry[i]->max_size > 0)
+		{
+			Size pages_for_max_size;
+			
+			pages_for_max_size = (registry[i]->max_size + page_size - 1) / page_size;
+			
+			if (pages_for_max_size <= 1)
+			{
+				elog(DEBUG2, "RESIZABLE structure: %s does not span multiple pages, treating it as a fixed-size structure with size %zu",
+					 registry[i]->name, registry[i]->max_size);
+				registry[i]->size = registry[i]->max_size;
+				registry[i]->max_size = 0;
+			}
+			else
+				continue;
+		}
+
 		/* look it up in the shmem index */
 		result = (ShmemIndexEnt *)
 			hash_search(ShmemIndex, registry[i]->name, HASH_ENTER_NULL, &found);
@@ -204,14 +246,132 @@ ShmemInitRegistered(void)
 		}
 		result->size = registry[i]->size;
 		result->allocated_size = allocated_size;
+		result->reservedsize = allocated_size;
 		result->location = structPtr;
 
 		*(registry[i]->ptr) = structPtr;
 		if (registry[i]->init_fn)
 			registry[i]->init_fn(registry[i]->init_fn_arg);
 	}
+
+	/* Allocate resizable structures on page boundaries. */
+	ShmemAllocator->free_offset = align_size(ShmemAllocator->free_offset, page_size);
+	for (int i = 0; i < num_registrations; i++)
+	{
+		size_t		allocated_size;
+		void	   *structPtr;
+		bool		found;
+		ShmemIndexEnt *result;
+
+		/*
+		 * Handle truly resizable structures.
+		 */
+		if (registry[i]->max_size > 0)
+		{
+#ifdef USE_ASSERT_CHECKING
+			Size pages_for_max_size;
+			
+			pages_for_max_size = (registry[i]->max_size + page_size - 1) / page_size;
+			
+			Assert (pages_for_max_size > 1);
+#endif
+			registry[i]->max_size = align_size(registry[i]->max_size, page_size);
+			elog(DEBUG2, "RESIZABLE structure: %s with maximum size %zu",
+					 registry[i]->name, registry[i]->max_size);
+		}
+		else
+			continue;
+
+		/* look it up in the shmem index */
+		result = (ShmemIndexEnt *)
+			hash_search(ShmemIndex, registry[i]->name, HASH_ENTER_NULL, &found);
+		if (!result)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+							registry[i]->name)));
+		}
+		if (found)
+			elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
+
+		/*
+		 * Allocate maximum size for a resizable structure. This should not
+		 * allocate memory but just the address space. Memory will be allocated
+		 * as the address space is written to. It is expected that the
+		 * registrants do not use memory beyond the initial size until they have
+		 * resized the structure. If they do so, this will result in allocating
+		 * more memory than expected.
+		 */
+		structPtr = ShmemAllocRaw(registry[i]->max_size, &allocated_size);
+		if (structPtr == NULL)
+		{
+			/* out of memory; remove the failed ShmemIndex entry */
+			hash_search(ShmemIndex, registry[i]->name, HASH_REMOVE, NULL);
+			ereport(ERROR,
+					(errcode(ERRCODE_OUT_OF_MEMORY),
+					 errmsg("not enough shared memory for data structure"
+							" \"%s\" (%zu bytes requested)",
+							registry[i]->name, registry[i]->max_size)));
+		}
+		/* Since max_size is page aligned, it should also be cache-line aligned. */
+		Assert(allocated_size == registry[i]->max_size);
+		result->size = registry[i]->size;
+		/* Resizable structure memory will get allocated in pages. */
+		result->allocated_size = align_size(registry[i]->size, page_size);
+		result->reservedsize = allocated_size;
+		result->location = structPtr;
+
+		*(registry[i]->ptr) = structPtr;
+		if (registry[i]->init_fn)
+			registry[i]->init_fn(registry[i]->init_fn_arg);
+	}
+
+	/* All allocations were in page sizes, so free space should be page aligned. */
+	Assert(ShmemAllocator->free_offset == align_size(ShmemAllocator->free_offset, page_size));
 }
 
+#ifndef EXEC_BACKEND
+void
+ShmemResizeRegistered(const char *name, Size new_size)
+{
+	ShmemIndexEnt *result;
+	bool found;
+	Size allocated_size;
+
+	/* look it up in the shmem index */
+	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+	result = (ShmemIndexEnt *)
+		hash_search(ShmemIndex, name, HASH_FIND, &found);
+	if (!found)
+		elog(ERROR, "shmem struct \"%s\" is not initialized", name);
+
+	Assert(result);
+
+	/* Resizable structure memory will get allocated in pages. */
+	allocated_size = align_size(new_size, GetOSPageSize()); 
+	if (result->reservedsize < allocated_size)
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+				 errmsg("not enough address space is reserved for resizing structure \"%s\"", name)));
+
+
+	/*
+	 * If shrinking, release memory that is no longer required. If expanding, we
+	 * don't have to do anything as the memory is allocated when it's written to.
+	 */
+	if (allocated_size < result->allocated_size)
+		PGSharedMemoryRelease((char *) result->location + allocated_size,
+								result->reservedsize - allocated_size);
+
+	/* Update shmem index entry. */
+	result->size = new_size;
+	result->allocated_size = allocated_size;
+
+	LWLockRelease(ShmemIndexLock);
+}
+#endif
+
 #ifdef EXEC_BACKEND
 void
 ShmemAttachRegistered(void)
@@ -700,6 +860,7 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
 		}
 		result->size = size;
 		result->allocated_size = allocated_size;
+		result->reservedsize = allocated_size;
 		result->location = structPtr;
 	}
 
@@ -743,11 +904,26 @@ mul_size(Size s1, Size s2)
 	return result;
 }
 
+/*
+ * Round up the given size to the next multiple of the given alignment, checking
+ * for overflow.
+ */
+Size
+align_size(Size size, Size alignment)
+{
+
+	Assert(alignment != 0);
+
+	if (size % alignment == 0)
+		return size;
+	return add_size(size, alignment - (size % alignment));
+}
+
 /* SQL SRF showing allocated shared memory */
 Datum
 pg_get_shmem_allocations(PG_FUNCTION_ARGS)
 {
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 5
 	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
 	HASH_SEQ_STATUS hstat;
 	ShmemIndexEnt *ent;
@@ -769,7 +945,17 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
 		values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
 		values[2] = Int64GetDatum(ent->size);
 		values[3] = Int64GetDatum(ent->allocated_size);
-		named_allocated += ent->allocated_size;
+		values[4] = Int64GetDatum(ent->reservedsize);
+
+		/*
+		 * The shared memory segment metadata does not know about the internal
+		 * reservation. From its point of view, the maximum size of a resizable
+		 * shared structure is considered to be allocated. In order to compute
+		 * the amount of memory allocated to the unnamed structures, add the
+		 * reserved size, which is the same as the allocated size for a
+		 * fixed-size structure.
+		 */
+		named_allocated += ent->reservedsize;
 
 		tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
 							 values, nulls);
@@ -975,23 +1161,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
 Size
 pg_get_shmem_pagesize(void)
 {
-	Size		os_page_size;
-#ifdef WIN32
-	SYSTEM_INFO sysinfo;
-
-	GetSystemInfo(&sysinfo);
-	os_page_size = sysinfo.dwPageSize;
-#else
-	os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
 	Assert(IsUnderPostmaster);
-	Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
-	if (huge_pages_status == HUGE_PAGES_ON)
-		GetHugePageSize(&os_page_size, NULL);
 
-	return os_page_size;
+	return GetOSPageSize();
 }
 
 Datum
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 83f6501df38..dfd3389ced3 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8592,8 +8592,8 @@
 { oid => '5052', descr => 'allocations from the main shared memory segment',
   proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
   provolatile => 'v', prorettype => 'record', proargtypes => '',
-  proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
-  proargnames => '{name,off,size,allocated_size}',
+  proallargtypes => '{text,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o}',
+  proargnames => '{name,off,size,allocated_size,reserved_size}',
   prosrc => 'pg_get_shmem_allocations' },
 
 { oid => '4099', descr => 'Is NUMA support available?',
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..07e7f1a9661 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,8 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
 										   PGShmemHeader **shim);
 extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
 extern void PGSharedMemoryDetach(void);
+extern void PGSharedMemoryRelease(void *addr, Size size);
 extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
 
 #endif							/* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index cbd4ef8d03f..a1fe945e02b 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,6 +50,14 @@ typedef struct ShmemStructDesc
 	 */
 	size_t		extra_size;
 
+	/*
+	 * Maximum size this structure can grow up to in the future. The memory is not
+	 * allocated right away but the corresponding address space is allocated so
+	 * that memory can be mapped to it when the structure grows. Typically
+	 * should be used for structures which need contiguous memory.
+	 */
+	size_t		max_size;
+
 	/* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
 	void	   **ptr;
 } ShmemStructDesc;
@@ -84,6 +92,9 @@ extern void InitShmemIndex(void);
 
 extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
 extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+#ifndef EXEC_BACKEND
+extern void ShmemResizeRegistered(const char *name, Size new_size);
+#endif
 
 /* Legacy functions */
 extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
@@ -96,6 +107,7 @@ extern void ShmemAttachRegistered(void);
 
 extern Size add_size(Size s1, Size s2);
 extern Size mul_size(Size s1, Size s2);
+extern Size align_size(Size size, Size align);
 
 extern PGDLLIMPORT Size pg_get_shmem_pagesize(void);
 
@@ -115,6 +127,7 @@ typedef struct
 	void	   *location;		/* location in shared mem */
 	Size		size;			/* # bytes requested for the structure */
 	Size		allocated_size; /* # bytes actually allocated */
+	Size		reservedsize;   /* # bytes reserved for this resizable structure. */
 } ShmemIndexEnt;
 
 #endif							/* SHMEM_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..a5df6edae18 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
 		  libpq_pipeline \
 		  oauth_validator \
 		  plsample \
+		  resizable_shmem \
 		  spgist_name_ops \
 		  test_aio \
 		  test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..961bb62759d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
 subdir('nbtree')
 subdir('oauth_validator')
 subdir('plsample')
+subdir('resizable_shmem')
 subdir('spgist_name_ops')
 subdir('ssl_passphrase_callback')
 subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..ad2f040b2f0
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/resizable_shmem/Makefile
+
+MODULES = resizable_shmem
+ISOLATION = resizable_shmem
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+# This test requires the library to be loaded at server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/src/makefiles/pgxs.mk
+endif
diff --git a/src/test/modules/resizable_shmem/expected/resizable_shmem.out b/src/test/modules/resizable_shmem/expected/resizable_shmem.out
new file mode 100644
index 00000000000..afb18dce0a8
--- /dev/null
+++ b/src/test/modules/resizable_shmem/expected/resizable_shmem.out
@@ -0,0 +1,149 @@
+Parsed test spec with 3 sessions
+
+starting permutation: s1u s2u zu zia s1w_initial s1u s2r_initial s2u zs_max zu zia s2r_initial s2w_max s2u s1r_max s1u zs_shrink zu zia s1r_shrink s1w_shrink s1u s2r_shrink s2u zs_fail
+step s1u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session1_shmem_usage";
+rss_shmem
+---------
+5504 kB  
+(1 row)
+
+step s2u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session2_shmem_usage";
+rss_shmem
+---------
+5504 kB  
+(1 row)
+
+step zu: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "resizer_shmem_usage";
+rss_shmem
+---------
+5504 kB  
+(1 row)
+
+step zia: SELECT name, pg_size_pretty(size) AS size, pg_size_pretty(allocated_size) AS allocated_size, pg_size_pretty(reserved_size) AS reserved_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem';
+name           |size  |allocated_size|reserved_size
+---------------+------+--------------+-------------
+resizable_shmem|100 MB|100 MB        |400 MB       
+(1 row)
+
+step s1w_initial: SELECT resizable_shmem_write(100);
+resizable_shmem_write
+---------------------
+                     
+(1 row)
+
+step s1u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session1_shmem_usage";
+rss_shmem
+---------
+105 MB   
+(1 row)
+
+step s2r_initial: SELECT resizable_shmem_read(25 * 1024 * 1024, 100);
+resizable_shmem_read
+--------------------
+t                   
+(1 row)
+
+step s2u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session2_shmem_usage";
+rss_shmem
+---------
+105 MB   
+(1 row)
+
+step zs_max: SELECT resizable_shmem_resize(100 * 1024 * 1024);
+resizable_shmem_resize
+----------------------
+                      
+(1 row)
+
+step zu: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "resizer_shmem_usage";
+rss_shmem
+---------
+6272 kB  
+(1 row)
+
+step zia: SELECT name, pg_size_pretty(size) AS size, pg_size_pretty(allocated_size) AS allocated_size, pg_size_pretty(reserved_size) AS reserved_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem';
+name           |size  |allocated_size|reserved_size
+---------------+------+--------------+-------------
+resizable_shmem|400 MB|400 MB        |400 MB       
+(1 row)
+
+step s2r_initial: SELECT resizable_shmem_read(25 * 1024 * 1024, 100);
+resizable_shmem_read
+--------------------
+t                   
+(1 row)
+
+step s2w_max: SELECT resizable_shmem_write(500);
+resizable_shmem_write
+---------------------
+                     
+(1 row)
+
+step s2u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session2_shmem_usage";
+rss_shmem
+---------
+405 MB   
+(1 row)
+
+step s1r_max: SELECT resizable_shmem_read(100 * 1024 * 1024, 500);
+resizable_shmem_read
+--------------------
+t                   
+(1 row)
+
+step s1u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session1_shmem_usage";
+rss_shmem
+---------
+405 MB   
+(1 row)
+
+step zs_shrink: SELECT resizable_shmem_resize(75 * 1024 * 1024);
+resizable_shmem_resize
+----------------------
+                      
+(1 row)
+
+step zu: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "resizer_shmem_usage";
+rss_shmem
+---------
+6272 kB  
+(1 row)
+
+step zia: SELECT name, pg_size_pretty(size) AS size, pg_size_pretty(allocated_size) AS allocated_size, pg_size_pretty(reserved_size) AS reserved_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem';
+name           |size  |allocated_size|reserved_size
+---------------+------+--------------+-------------
+resizable_shmem|300 MB|300 MB        |400 MB       
+(1 row)
+
+step s1r_shrink: SELECT resizable_shmem_read(75 * 1024 * 1024, 500);
+resizable_shmem_read
+--------------------
+t                   
+(1 row)
+
+step s1w_shrink: SELECT resizable_shmem_write(999);
+resizable_shmem_write
+---------------------
+                     
+(1 row)
+
+step s1u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session1_shmem_usage";
+rss_shmem
+---------
+305 MB   
+(1 row)
+
+step s2r_shrink: SELECT resizable_shmem_read(75 * 1024 * 1024, 999);
+resizable_shmem_read
+--------------------
+t                   
+(1 row)
+
+step s2u: SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session2_shmem_usage";
+rss_shmem
+---------
+305 MB   
+(1 row)
+
+step zs_fail: SELECT resizable_shmem_resize(128 * 1024 * 1024);
+ERROR:  not enough address space is reserved for resizing structure "resizable_shmem"
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..6e5f6d5caaf
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,37 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+  'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+  resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'resizable_shmem',
+    '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+  resizable_shmem_sources,
+  kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+  'resizable_shmem.control',
+  'resizable_shmem--1.0.sql',
+)
+
+tests += {
+  'name': 'resizable_shmem',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'isolation': {
+    'specs': [
+      'resizable_shmem',
+    ],
+    'regress_args': ['--temp-config', files('resizable_shmem.conf')],
+    # This test requires the library to be loaded at server start, so disable
+    # installcheck
+    'runningcheck': false,
+  },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..2a5fa957167
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,39 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that the specified number of initial entries have the expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+--
+-- TODO: This function is useful to determine that the memory is freed after
+-- shrinking and that we don't allocate all the memory upfront instead of just
+-- reserving the address space. The C implementation of this function relies
+-- heavily on the Linux-specific /proc/self/status, and the sizes it returns
+-- may not be stable across different machines. Hence we should consider
+-- removing this function and the related tests after we have verified the
+-- resizing works as expected.
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..55c86b8e1ab
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,270 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ *		Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/*
+ * The default amount of shared buffers, and hence the amount of shared memory
+ * allocated by default, is in the hundreds of MBs. The memory allocated to the
+ * test structure will be noticeable only when it is of the same order.
+ */
+#define TEST_INITIAL_ENTRIES	(25 * 1024 * 1024)		/* Initial number of entries (100MB) */
+#define TEST_MAX_ENTRIES		(100 * 1024 * 1024)		/* Maximum number of entries (400MB, 4x initial) */
+#define TEST_ENTRY_SIZE			sizeof(int32)		/* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * We do not use any locks. To keep the code and the test simple, the test
+ * never performs resizes, reads and writes concurrently.
+ */
+typedef struct TestResizableShmemStruct
+{
+	/* Metadata */
+	int32		num_entries;		/* Number of entries that can fit */
+
+	/* Data area - variable size */
+	int32		data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+/* Global pointer to our shared memory structure */
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemStructDesc testShmemDesc = {
+	.name = "resizable_shmem",
+	.size = offsetof(TestResizableShmemStruct, data) + (TEST_INITIAL_ENTRIES * TEST_ENTRY_SIZE),
+	.max_size = offsetof(TestResizableShmemStruct, data) + (TEST_MAX_ENTRIES * TEST_ENTRY_SIZE),
+	.alignment = MAXIMUM_ALIGNOF,
+	.init_fn = resizable_shmem_shmem_init,
+	.ptr = (void **) &resizable_shmem,
+};
+
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+
+static void resizable_shmem_request(void);
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+	/*
+	 * The module needs to be loaded via shared_preload_libraries to register
+	 * the shared memory structure. If that's not the case, don't throw an
+	 * error; the SQL functions check for the existence of the structure.
+	 */
+	if (!process_shared_preload_libraries_in_progress)
+		return;
+
+#ifdef EXEC_BACKEND
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg("resizable_shmem is not supported in EXEC_BACKEND builds")));
+#endif
+
+	/* Install hook to register shared memory structure. */
+	prev_shmem_request_hook = shmem_request_hook;
+	shmem_request_hook = resizable_shmem_request;
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void)
+{
+	if (prev_shmem_request_hook)
+		prev_shmem_request_hook();
+
+	/* Register our resizable shared memory structure */
+	ShmemRegisterStruct(&testShmemDesc);
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+	/*
+	 * Shared memory structure should have been allocated with the requested
+	 * size. Initialize the metadata.
+	 */
+	Assert(resizable_shmem != NULL);
+	resizable_shmem->num_entries = TEST_INITIAL_ENTRIES;
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifndef EXEC_BACKEND
+	int32		new_entries = PG_GETARG_INT32(0);
+	Size		new_size;
+
+	if (!resizable_shmem)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("resizable_shmem is not initialized")));
+
+	new_size = offsetof(TestResizableShmemStruct, data) + (new_entries * TEST_ENTRY_SIZE);
+	ShmemResizeRegistered(testShmemDesc.name, new_size);
+	resizable_shmem->num_entries = new_entries;
+
+	PG_RETURN_VOID();
+#else
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg("resizing shared memory is not supported in EXEC_BACKEND builds")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+	int32		entry_value = PG_GETARG_INT32(0);
+	int32		i;
+
+	if (!resizable_shmem)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("resizable_shmem is not initialized")));
+
+	/* Write the value to all current entries */
+	for (i = 0; i < resizable_shmem->num_entries; i++)
+		resizable_shmem->data[i] = entry_value;
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+	int32		entry_count = PG_GETARG_INT32(0);
+	int32		entry_value = PG_GETARG_INT32(1);
+	int32		i;
+
+	if (resizable_shmem == NULL)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("resizable_shmem is not initialized")));
+
+	/* Validate entry_count */
+	if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+	/* Check if first entry_count entries have the expected value */
+	for (i = 0; i < entry_count; i++)
+	{
+		if (resizable_shmem->data[i] != entry_value)
+			PG_RETURN_BOOL(false);
+	}
+
+	PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * TODO: See TODO note in SQL definition of this function.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+	FILE	   *f;
+	char		line[256];
+	int64		rss_anon_kb = -1;
+	int64		rss_file_kb = -1;
+	int64		rss_shmem_kb = -1;
+	int64		vm_size_kb = -1;
+	int			found = 0;
+	TupleDesc	tupdesc;
+	Datum		values[4];
+	bool		nulls[4];
+	HeapTuple	tuple;
+
+	/* Open /proc/self/status to read memory information */
+	f = fopen("/proc/self/status", "r");
+	if (f == NULL)
+		ereport(ERROR,
+				(errcode_for_file_access(),
+				 errmsg("could not open /proc/self/status: %m")));
+
+	/* Look for the memory usage lines */
+	while (fgets(line, sizeof(line), f) != NULL && found < 4)
+	{
+		if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %" SCNd64 " kB", &rss_anon_kb) == 1)
+			found++;
+		else if (rss_file_kb == -1 && sscanf(line, "RssFile: %" SCNd64 " kB", &rss_file_kb) == 1)
+			found++;
+		else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %" SCNd64 " kB", &rss_shmem_kb) == 1)
+			found++;
+		else if (vm_size_kb == -1 && sscanf(line, "VmSize: %" SCNd64 " kB", &vm_size_kb) == 1)
+			found++;
+	}
+
+	fclose(f);
+
+	/* Build tuple descriptor for our result type */
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("function returning record called in context "
+						"that cannot accept a record")));
+
+	/* Build the result tuple */
+	values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+	values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+	values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+	values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+	nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+	tuple = heap_form_tuple(tupdesc, values, nulls);
+	PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.conf b/src/test/modules/resizable_shmem/resizable_shmem.conf
new file mode 100644
index 00000000000..764b6357cbb
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.conf
@@ -0,0 +1,4 @@
+shared_preload_libraries = 'resizable_shmem'
+# Turn huge pages off so that we can test shared memory size changes smaller
+# than the huge page size, which may be in GBs.
+huge_pages = off
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..1ce2c5ea21a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,5 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
+relocatable = true
diff --git a/src/test/modules/resizable_shmem/specs/resizable_shmem.spec b/src/test/modules/resizable_shmem/specs/resizable_shmem.spec
new file mode 100644
index 00000000000..8b8cf153df8
--- /dev/null
+++ b/src/test/modules/resizable_shmem/specs/resizable_shmem.spec
@@ -0,0 +1,42 @@
+# Test resizable shared memory structure
+#
+# It tests that a resizable shared memory structure can be resized from any
+# backend and that the new sizes are visible to all the backends. It uses
+# isolation test infrastructure so that resizing, reading and writing can be
+# interleaved.
+
+setup
+{
+  CREATE EXTENSION resizable_shmem;
+}
+
+teardown
+{
+  DROP EXTENSION resizable_shmem;
+}
+
+session "session1"
+step s1w_initial { SELECT resizable_shmem_write(100); }
+step s1r_max { SELECT resizable_shmem_read(100 * 1024 * 1024, 500); }
+step s1r_shrink { SELECT resizable_shmem_read(75 * 1024 * 1024, 500); }
+step s1w_shrink { SELECT resizable_shmem_write(999); }
+step s1u { SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session1_shmem_usage"; }
+
+session "session2"
+step s2r_initial { SELECT resizable_shmem_read(25 * 1024 * 1024, 100); }
+step s2w_max { SELECT resizable_shmem_write(500); }
+step s2r_shrink { SELECT resizable_shmem_read(75 * 1024 * 1024, 999); }
+step s2u { SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "session2_shmem_usage"; }
+
+session "resizer"
+step zs_max { SELECT resizable_shmem_resize(100 * 1024 * 1024); }
+step zs_shrink { SELECT resizable_shmem_resize(75 * 1024 * 1024); }
+step zs_fail { SELECT resizable_shmem_resize(128 * 1024 * 1024); } # this fails (512MB > 400MB max)
+step zia { SELECT name, pg_size_pretty(size) AS size, pg_size_pretty(allocated_size) AS allocated_size, pg_size_pretty(reserved_size) AS reserved_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem'; }
+step zu { SELECT pg_size_pretty(rss_shmem) AS rss_shmem FROM resizable_shmem_usage() AS "resizer_shmem_usage"; }
+
+# Test shrinking and expanding the shared memory, while ensuring that:
+# - All entries can be written to and read from after each resize
+# - Previously written data remains unchanged in entries that survive resizing
+# Sizes used: 100MB initial (25M entries), 400MB max (100M entries), 300MB after shrinking (75M entries)
+permutation s1u s2u zu zia s1w_initial s1u s2r_initial s2u zs_max zu zia s2r_initial s2w_max s2u s1r_max s1u zs_shrink zu zia s1r_shrink s1w_shrink s1u s2r_shrink s2u zs_fail
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f4ee2bd7459..a6eaeb49dbc 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,9 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
 pg_shmem_allocations| SELECT name,
     off,
     size,
-    allocated_size
-   FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+    allocated_size,
+    reserved_size
+   FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, reserved_size);
 pg_shmem_allocations_numa| SELECT name,
     numa_node,
     size
-- 
2.34.1
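
For anyone checking the numbers in the spec above: the entry counts map to
sizes as 25M entries * 4 bytes = 100MB, 100M entries = 400MB (the reserved
maximum), and 128M entries = 512MB, which is why zs_fail is expected to error
out. Below is a minimal standalone sketch of that arithmetic. It is only an
illustration, not code from the patch: DemoStruct, demo_struct_size() and
demo_resize_allowed() are hypothetical names, and the real limit check
presumably lives in ShmemResizeRegistered().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TestResizableShmemStruct in the test module. */
typedef struct DemoStruct
{
	int32_t		num_entries;	/* number of entries that can fit */
	int32_t		data[];			/* flexible array member */
} DemoStruct;

/*
 * Compute the number of bytes needed for a structure holding 'entries'
 * array elements. Returns false for a negative count or when the
 * arithmetic would overflow size_t; '*result' is set only on success.
 */
bool
demo_struct_size(int64_t entries, size_t *result)
{
	const size_t header = offsetof(DemoStruct, data);

	assert(result != NULL);

	if (entries < 0)
		return false;
	if ((uint64_t) entries > (SIZE_MAX - header) / sizeof(int32_t))
		return false;

	*result = header + (size_t) entries * sizeof(int32_t);
	return true;
}

/*
 * Return true when a structure with 'entries' elements fits within
 * 'reserved' bytes of reserved address space.
 */
bool
demo_resize_allowed(int64_t entries, size_t reserved)
{
	size_t		needed;

	return demo_struct_size(entries, &needed) && needed <= reserved;
}
```

With the reserved size computed for 100 * 1024 * 1024 entries,
demo_resize_allowed() accepts 75 * 1024 * 1024 entries and rejects
128 * 1024 * 1024. Unlike resizable_shmem_resize() in the test module, the
sketch also rejects a negative entry count, which would otherwise wrap around
when converted to Size.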


