From: Ashutosh Bapat <[email protected]>
To: Andres Freund <[email protected]>
To: Heikki Linnakangas <[email protected]>
Cc: pgsql-hackers <[email protected]>
Cc: [email protected]
Subject: Re: Better shared data structure management and resizable shared data structures
Date: Mon, 23 Feb 2026 19:44:23 +0530
Message-ID: <CAExHW5so6VSxBC-1V=35229Z1+dw5vhw8HxHg9ry7UzceKcXzA@mail.gmail.com>
In-Reply-To: <CAExHW5vz+PUHHUuzGRwtyx-mPLQk3nCZXxrFqnruRadEFrO5Xg@mail.gmail.com>
References: <CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com>
<[email protected]>
<CAExHW5s9Vp+-vJi020UJ+otyccBBo7eT1g6bttdRKL6HAvscyQ@mail.gmail.com>
<mlsruptoxgm2nqtdfyfsowjklzxl5zltsjb3y5bmywtigm474l@5tsonk4t3kia>
<CAExHW5uEK+eeG7e2g6uWh7POrFpfp+dqfaa=_3miMN17zgeaJw@mail.gmail.com>
<CAExHW5vz+PUHHUuzGRwtyx-mPLQk3nCZXxrFqnruRadEFrO5Xg@mail.gmail.com>
On Wed, Feb 18, 2026 at 9:17 PM Ashutosh Bapat
<[email protected]> wrote:
> > > 4. the address and length passed to madvise needs to be page aligned,
> > > but that passed to fallocate() needn't be. `man fallocate` says
> > > "Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux
> > > 2.6.38) in mode deallocates space (i.e., creates a hole) in the byte
> > > range starting at offset and continuing for len bytes. Within the
> > > specified range, partial filesystem blocks are zeroed, and whole
> > > filesystem blocks are removed from the file.". It seems to be
> > > automatically taking care of the page size. So using fallocate()
> > > simplifies logic. Further `man madvise` says "but since Linux 3.5, any
> > > filesystem which supports the fallocate(2) FALLOC_FL_PUNCH_HOLE mode
> > > also supports MADV_REMOVE." fallocate with FALLOC_FL_PUNCH_HOLE is
> > > guaranteed to be available on a system which supports MADV_REMOVE.
> >
> > I think it makes no sense to support resizing below page size
> > granularity. What's the point of doing that?
> >
>
> No point really. But we can not control the extensions which want to
> specify a maximum size smaller than a page size. They wouldn't know
> what page size the underlying machine will have, especially with huge
> pages which have a wide range of sizes. Even in the case of shared
> buffers, a value of max_shared_buffers may cause buffer blocks to span
> pages but other structures may fit a page.
>
> In the attached patches, if a resizable structure is such that its
> max_size is smaller than a page size, it is treated as a fixed
> structure with size = max_size. Any request to resize such structures
> will simply update the metadata without actual madvise operation. Only
> the structures whose max_size > page_size would be treated as truly
> resizable and will use madvise. You bring another interesting point.
> If a resizable structure has a maximum size higher than the page size,
> but it is allocated such that the initial part of it is on a partially
> allocated page and the last part of it is on another partially
> allocated page, those pages are never freed because of adjoining
> structures. Per the logic in the attached patches, all the fixed (or
> pseudo-resizable structures) are packed together. The resizable
> structures start on a page boundary and their max_sizes are adjusted
> to be page aligned. That way we can release pages when the structure
> shrinks more than a page.

It was a mistake on my part to assume that more memory would be freed
if we page-aligned the start and end of a resizable structure. I didn't
account for the memory wasted by the alignment itself. That amount comes
out to be the same as the amount of memory wasted if we don't page-align
the structure. But the code is simpler if we don't page-align the
structure, as seen in the attached patches.
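
To make the arithmetic concrete, here is a minimal sketch of which whole
pages could be released when an unaligned structure shrinks. The names
PageRange and freeable_pages are hypothetical and do not come from the
attached patches; offsets are byte offsets within the segment and the
page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: half-open byte range [start, end) of whole pages. */
typedef struct PageRange
{
	size_t		start;
	size_t		end;
} PageRange;

/*
 * A structure placed at byte offset 'offset' shrinks from old_size to
 * new_size. Return the range of whole pages that no longer hold any live
 * byte of this structure. Partial pages at either end are kept, because
 * they may be shared with live data or with a neighbouring structure.
 */
PageRange
freeable_pages(size_t offset, size_t old_size, size_t new_size,
			   size_t page_size)
{
	PageRange	r;

	assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
	assert(new_size <= old_size);

	/* first page boundary at or after the new end of the structure */
	r.start = (offset + new_size + page_size - 1) & ~(page_size - 1);
	/* last page boundary at or before the old end of the structure */
	r.end = (offset + old_size) & ~(page_size - 1);

	/* nothing to free if the shrink does not cover a whole page */
	if (r.end < r.start)
		r.end = r.start;
	return r;
}
```

With 4096-byte pages, a structure at offset 1000 that shrinks from 20000
to 5000 bytes gives the range [8192, 20480), i.e. three whole pages.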
> > >
> > > > Using fallocate() (or madvise()) to free memory, we don't need
> > > > multiple segments. So much less code churn compared to the multiple
> > > > mappings approach. However, there is one drawback. In the multiple
> > > > mapping approach access beyond the current size of the structure would
> > > > result in segfault or bus error. But in the fallocate/madvise approach
> > > > such an access does not cause a crash. A write beyond the pages that
> > > > fit the current size of the structure causes more memory to be
> > > > allocated silently. A read returns 0s. So, there's a possibility that
> > > > bugs in size calculations might go unnoticed. I think that's how it
> > > > works even today, access in the yet un-allocated part of the shared
> > > > memory will simply go unnoticed.
> > >
> > > If that's something you care about, you can mprotect(PROT_NONE) the relevant
> > > regions.
> >
> > I am fine, if we let go of this protection while getting rid of
> > multiple segments, if we all agree to do so.
> >
> > I could be wrong, but mprotect needs to be executed in every backend
> > where the memory is mapped and then a new backend needs to inherit it
> > from the postmaster. Makes resizing complex since it has to touch
> > every backend. So avoiding mprotect is better.

I discussed this point with Andres off-list. Here's a summary of that
discussion. Any serious user of resizable shared memory structures
would need to send proc signal barriers to synchronize the resizing
across the backends. This barrier can be used to perform mprotect() in
the backends, with a separate signal to the postmaster if mprotect() is
needed there. But whether mprotect() is needed depends upon the
use case. It should be the responsibility of the user of the resizable
structure and not of ShmemResizeRegistered().
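
As an illustration only, the work a backend might do when it absorbs such
a barrier could look like the sketch below. protect_unused_tail is a
hypothetical helper, not part of the attached patches, and the barrier
plumbing itself is omitted. It assumes that only pages lying wholly
inside the structure's reservation may be changed.

```c
#define _DEFAULT_SOURCE			/* for MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: for a structure reserved at [base, base + max_size)
 * whose current size is cur_size, make the pages holding live data
 * readable and writable, and make the whole pages beyond the current end
 * inaccessible. Returns 0 on success, -1 if mprotect() fails.
 */
int
protect_unused_tail(void *base, size_t cur_size, size_t max_size)
{
	long		ps = sysconf(_SC_PAGESIZE);
	uintptr_t	start = (uintptr_t) base;
	uintptr_t	mask;
	uintptr_t	live_start;
	uintptr_t	live_end;
	uintptr_t	tail_end;

	assert(ps > 0);
	assert(cur_size <= max_size);

	mask = ~((uintptr_t) ps - 1);
	live_start = start & mask;						/* page holding the start */
	live_end = (start + cur_size + ps - 1) & mask;	/* end of the live pages */
	tail_end = (start + max_size) & mask;			/* last whole page boundary */

	if (mprotect((void *) live_start, live_end - live_start,
				 PROT_READ | PROT_WRITE) != 0)
		return -1;
	if (tail_end > live_end &&
		mprotect((void *) live_end, tail_end - live_end, PROT_NONE) != 0)
		return -1;
	return 0;
}
```

Since page protections are per-process, every process that maps the
segment would have to make the same call, which is why the barrier is
needed.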

The following points need a bit of discussion.

1. Calculation of allocated_size

For fixed-size shared memory structures, allocated_size is the size
of the structure after cache-aligning it. Assuming that the shared
memory is allocated in pages, this is also the memory actually
allocated to the structure when the whole structure is written to. For
a resizable structure, it's a bit more complicated. We allocate and
reserve the maximum space required by the structure. At a given point
in time, the memory page where the next structure begins and the page
which contains the end of the structure at that point in time are
allocated. The pages in between are not allocated. Thus
allocated_size should be the length from the start of the structure to
the end of the page containing the current end of the structure, plus
the part of the page where the next structure starts, up to the start
of the next structure. That is what is implemented in the attached
patches.
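
The rule described above can be sketched as follows. This is a
simplified model, not the code in the patches: resizable_allocated_size
is a hypothetical name, cache alignment is ignored, offsets are byte
offsets within the segment, and the next structure is assumed to start
at start + max_size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes of a resizable structure's reservation that
 * are backed by allocated pages, given its start offset, current size,
 * maximum (reserved) size and the page size.
 */
size_t
resizable_allocated_size(size_t start, size_t cur_size, size_t max_size,
						 size_t page_size)
{
	size_t		head_end;
	size_t		next_start;
	size_t		tail_start;

	assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
	assert(cur_size <= max_size);

	/* end of the page containing the current end of the structure */
	head_end = (start + cur_size + page_size - 1) & ~(page_size - 1);
	/* where the next structure begins, and the page it begins on */
	next_start = start + max_size;
	tail_start = next_start & ~(page_size - 1);

	/* no unallocated pages in between: the whole reservation is backed */
	if (head_end >= tail_start)
		return max_size;

	return (head_end - start) + (next_start - tail_start);
}
```

For example, with 4096-byte pages, start = 1000, cur_size = 5000 and
max_size = 20000, the head contributes 8192 - 1000 = 7192 bytes and the
tail contributes 21000 - 20480 = 520 bytes, giving 7712.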

2. GUCs shared_memory_size, shared_memory_size_in_huge_pages

These GUCs indicate the size of the shared memory in bytes and in huge
pages. Without resizable shared memory structures, calculating these is
straightforward: we sum the sizes of all the requested
structures. With resizable shared memory structures, these GUCs do not
make much sense. Since the memory allocated to the resizable
structures can be anywhere between 0 and the maximum, neither the sum
of their initial sizes nor the sum of their maximum sizes can be
reported as shared_memory_size. Similarly for
shared_memory_size_in_huge_pages. We need two GUCs to replace each of
the existing GUCs: max_shared_memory_size and
initial_shared_memory_size, and their huge page peers.
max_shared_memory_size is the sum of the maximum sizes of the
resizable structures + the requested sizes of the fixed structures.
initial_shared_memory_size is the sum of the initial sizes requested
for all the structures.
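
A minimal sketch of the two proposed totals, under the definitions given
above. ShmemRequest and compute_shmem_totals are hypothetical names used
only for illustration; alignment padding and overflow checking (which
the real code would do with add_size()) are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one requested shared memory structure. */
typedef struct ShmemRequest
{
	size_t		initial_size;	/* requested size at startup */
	size_t		max_size;		/* only meaningful if resizable */
	bool		resizable;
} ShmemRequest;

/*
 * Compute the values the two proposed GUCs would report:
 *   initial_total = sum of initial sizes of all structures
 *   max_total     = sum of max sizes of resizable structures
 *                   + requested sizes of fixed structures
 */
void
compute_shmem_totals(const ShmemRequest *reqs, size_t nreqs,
					 size_t *initial_total, size_t *max_total)
{
	*initial_total = 0;
	*max_total = 0;

	for (size_t i = 0; i < nreqs; i++)
	{
		*initial_total += reqs[i].initial_size;
		*max_total += reqs[i].resizable ? reqs[i].max_size
			: reqs[i].initial_size;
	}
}
```

The huge page variants would follow from dividing each total by the huge
page size and rounding up.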

3. Testing the memory allocation

I couldn't find a way to reliably know the shared memory allocated at
a given address in the process. RSS Shmem gives the amount of shared
memory accessed by the process, which includes memory allocated to the
fixed structures accessed by the process. This value isn't stable
across runs of the test in the patch. The test records the RSS shmem
reported against the variations in the resizable shared memory
structure, which can be visually inspected to be within limits. But
those limits are hard to test in the test code. Looking for some
suggestions here.
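
For reference, on Linux the RSS Shmem figure is the RssShmem line of
/proc/<pid>/status. A minimal sketch of extracting it from the text of
that file follows; parse_rss_shmem_kb is a hypothetical helper and not
how the test in the patch necessarily obtains the value. It does not
address the stability problem described above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given the contents of a /proc/<pid>/status file,
 * return the RssShmem value in kB, or -1 if the field is not present.
 */
long
parse_rss_shmem_kb(const char *status_text)
{
	const char *line = status_text;

	while (line != NULL && *line != '\0')
	{
		long		kb;

		/* whitespace in the format matches the tabs and spaces used */
		if (sscanf(line, "RssShmem: %ld", &kb) == 1)
			return kb;

		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}
```

A caller would read /proc/self/status (or the status file of another
backend) into a buffer and pass it to this function before and after a
resize.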

Disabling resizable structures in builds which do not support them is
still a TODO.

--
Best Wishes,
Ashutosh Bapat
Attachments:
[text/x-patch] 0001-wip-Introduce-a-new-way-of-registering-shar-20260223.patch (53.8K, 2-0001-wip-Introduce-a-new-way-of-registering-shar-20260223.patch)
From 49676c5ba088d13236f2c1c66800d7e7b1abbe5f Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 9 Feb 2026 22:28:23 +0200
Subject: [PATCH 1/3] wip: Introduce a new way of registering shared memory
structs
---
.../pg_stat_statements/pg_stat_statements.c | 112 ++++-----
src/backend/access/transam/varsup.c | 32 +--
src/backend/bootstrap/bootstrap.c | 2 +
src/backend/postmaster/launch_backend.c | 11 +-
src/backend/postmaster/postmaster.c | 2 +
src/backend/storage/ipc/dsm.c | 46 ++--
src/backend/storage/ipc/dsm_registry.c | 34 ++-
src/backend/storage/ipc/ipci.c | 51 ++--
src/backend/storage/ipc/pmsignal.c | 53 ++--
src/backend/storage/ipc/procarray.c | 127 +++++-----
src/backend/storage/ipc/procsignal.c | 63 +++--
src/backend/storage/ipc/shmem.c | 233 +++++++++++++++++-
src/backend/storage/ipc/sinvaladt.c | 39 +--
src/backend/storage/lmgr/proc.c | 156 ++++++------
src/backend/tcop/postgres.c | 2 +
src/include/access/transam.h | 12 +-
src/include/storage/dsm_registry.h | 3 +-
src/include/storage/ipc.h | 1 +
src/include/storage/pmsignal.h | 3 +-
src/include/storage/proc.h | 5 +-
src/include/storage/procarray.h | 3 +-
src/include/storage/procsignal.h | 3 +-
src/include/storage/shmem.h | 57 +++++
src/include/storage/sinvaladt.h | 3 +-
24 files changed, 665 insertions(+), 388 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 4a427533bd8..71debc8b47f 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,6 +258,25 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+static void pgss_shmem_init(void *arg);
+
+static ShmemStructDesc pgssSharedStateShmemDesc = {
+ .name = "pg_stat_statements",
+ .size = sizeof(pgssSharedState),
+ .init_fn = pgss_shmem_init,
+};
+
+static ShmemHashDesc pgssSharedHashDesc = {
+ .name = "pg_stat_statements hash",
+ .init_size = 0, /* set from 'pgss_max' */
+ .max_size = 0, /* set from 'pgss_max' */
+};
+
+/* Links to shared memory state */
+#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
+#define pgss_hash (pgssSharedHashDesc.ptr)
+
+
/*---- Local variables ----*/
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
@@ -274,10 +293,6 @@ static ExecutorFinish_hook_type prev_ExecutorFinish = NULL;
static ExecutorEnd_hook_type prev_ExecutorEnd = NULL;
static ProcessUtility_hook_type prev_ProcessUtility = NULL;
-/* Links to shared memory state */
-static pgssSharedState *pgss = NULL;
-static HTAB *pgss_hash = NULL;
-
/*---- GUC variables ----*/
typedef enum
@@ -365,7 +380,6 @@ static void pgss_store(const char *query, int64 queryId,
static void pg_stat_statements_internal(FunctionCallInfo fcinfo,
pgssVersion api_version,
bool showtext);
-static Size pgss_memsize(void);
static pgssEntry *entry_alloc(pgssHashKey *key, Size query_offset, int query_len,
int encoding, bool sticky);
static void entry_dealloc(void);
@@ -500,11 +514,39 @@ _PG_init(void)
static void
pgss_shmem_request(void)
{
+ HASHCTL info;
+
if (prev_shmem_request_hook)
prev_shmem_request_hook();
- RequestAddinShmemSpace(pgss_memsize());
RequestNamedLWLockTranche("pg_stat_statements", 1);
+
+ /*
+ * Register our shared memory state, including hash table
+ */
+ ShmemRegisterStruct(&pgssSharedStateShmemDesc);
+
+ info.keysize = sizeof(pgssHashKey);
+ info.entrysize = sizeof(pgssEntry);
+ pgssSharedHashDesc.init_size = pgss_max;
+ pgssSharedHashDesc.max_size = pgss_max;
+ ShmemRegisterHash(&pgssSharedHashDesc,
+ &info,
+ HASH_ELEM | HASH_BLOBS);
+}
+
+static void
+pgss_shmem_init(void *arg)
+{
+ pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
+ pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
+ pgss->mean_query_len = ASSUMED_LENGTH_INIT;
+ SpinLockInit(&pgss->mutex);
+ pgss->extent = 0;
+ pgss->n_writers = 0;
+ pgss->gc_count = 0;
+ pgss->stats.dealloc = 0;
+ pgss->stats.stats_reset = GetCurrentTimestamp();
}
/*
@@ -516,8 +558,6 @@ pgss_shmem_request(void)
static void
pgss_shmem_startup(void)
{
- bool found;
- HASHCTL info;
FILE *file = NULL;
FILE *qfile = NULL;
uint32 header;
@@ -530,42 +570,6 @@ pgss_shmem_startup(void)
if (prev_shmem_startup_hook)
prev_shmem_startup_hook();
- /* reset in case this is a restart within the postmaster */
- pgss = NULL;
- pgss_hash = NULL;
-
- /*
- * Create or attach to the shared memory state, including hash table
- */
- LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
-
- pgss = ShmemInitStruct("pg_stat_statements",
- sizeof(pgssSharedState),
- &found);
-
- if (!found)
- {
- /* First time through ... */
- pgss->lock = &(GetNamedLWLockTranche("pg_stat_statements"))->lock;
- pgss->cur_median_usage = ASSUMED_MEDIAN_INIT;
- pgss->mean_query_len = ASSUMED_LENGTH_INIT;
- SpinLockInit(&pgss->mutex);
- pgss->extent = 0;
- pgss->n_writers = 0;
- pgss->gc_count = 0;
- pgss->stats.dealloc = 0;
- pgss->stats.stats_reset = GetCurrentTimestamp();
- }
-
- info.keysize = sizeof(pgssHashKey);
- info.entrysize = sizeof(pgssEntry);
- pgss_hash = ShmemInitHash("pg_stat_statements hash",
- pgss_max, pgss_max,
- &info,
- HASH_ELEM | HASH_BLOBS);
-
- LWLockRelease(AddinShmemInitLock);
-
/*
* If we're in the postmaster (or a standalone backend...), set up a shmem
* exit hook to dump the statistics to disk.
@@ -573,12 +577,6 @@ pgss_shmem_startup(void)
if (!IsUnderPostmaster)
on_shmem_exit(pgss_shmem_shutdown, (Datum) 0);
- /*
- * Done if some other process already completed our initialization.
- */
- if (found)
- return;
-
/*
* Note: we don't bother with locks here, because there should be no other
* processes running when this code is reached.
@@ -2082,20 +2080,6 @@ pg_stat_statements_info(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}
-/*
- * Estimate shared memory space needed.
- */
-static Size
-pgss_memsize(void)
-{
- Size size;
-
- size = MAXALIGN(sizeof(pgssSharedState));
- size = add_size(size, hash_estimate_size(pgss_max, sizeof(pgssEntry)));
-
- return size;
-}
-
/*
* Allocate a new hashtable entry.
* caller must hold an exclusive lock on pgss->lock
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 3e95d4cfd16..11ad90e7372 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -30,35 +30,27 @@
/* Number of OIDs to prefetch (preallocate) per XLOG write */
#define VAR_OID_PREFETCH 8192
-/* pointer to variables struct in shared memory */
-TransamVariablesData *TransamVariables = NULL;
+static void VarsupShmemInit(void *arg);
+ShmemStructDesc TransamVariablesShmemDesc = {
+ .name = "TransamVariables",
+ .size = sizeof(TransamVariablesData),
+ .init_fn = VarsupShmemInit,
+};
/*
* Initialization of shared memory for TransamVariables.
*/
-Size
-VarsupShmemSize(void)
+void
+VarsupShmemRegister(void)
{
- return sizeof(TransamVariablesData);
+ ShmemRegisterStruct(&TransamVariablesShmemDesc);
}
-void
-VarsupShmemInit(void)
+static void
+VarsupShmemInit(void *arg)
{
- bool found;
-
- /* Initialize our shared state struct */
- TransamVariables = ShmemInitStruct("TransamVariables",
- sizeof(TransamVariablesData),
- &found);
- if (!IsUnderPostmaster)
- {
- Assert(!found);
- memset(TransamVariables, 0, sizeof(TransamVariablesData));
- }
- else
- Assert(found);
+ memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
/*
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 7d32cd0e159..0ded7018e86 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -337,6 +337,8 @@ BootstrapModeMain(int argc, char *argv[], bool check_only)
InitializeFastPathLocks();
+ RegisterShmemStructs();
+
CreateSharedMemoryAndSemaphores();
/*
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index 926fd6f2700..8f638118cdf 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -49,6 +49,7 @@
#include "replication/walreceiver.h"
#include "storage/dsm.h"
#include "storage/io_worker.h"
+#include "storage/ipc.h"
#include "storage/pg_shmem.h"
#include "tcop/backend_startup.h"
#include "utils/memutils.h"
@@ -104,12 +105,10 @@ typedef struct
char **LWLockTrancheNames;
int *LWLockCounter;
LWLockPadded *MainLWLockArray;
- slock_t *ProcStructLock;
PROC_HDR *ProcGlobal;
PGPROC *AuxiliaryProcs;
PGPROC *PreparedXactProcs;
volatile PMSignalData *PMSignalState;
- ProcSignalHeader *ProcSignal;
pid_t PostmasterPid;
TimestampTz PgStartTime;
TimestampTz PgReloadTime;
@@ -678,8 +677,12 @@ SubPostmasterMain(int argc, char *argv[])
/* Restore basic shared memory pointers */
if (UsedShmemSegAddr != NULL)
+ {
InitShmemAllocator(UsedShmemSegAddr);
+ RegisterShmemStructs();
+ }
+
/*
* Run the appropriate Main function
*/
@@ -735,12 +738,10 @@ save_backend_variables(BackendParameters *param,
param->LWLockTrancheNames = LWLockTrancheNames;
param->LWLockCounter = LWLockCounter;
param->MainLWLockArray = MainLWLockArray;
- param->ProcStructLock = ProcStructLock;
param->ProcGlobal = ProcGlobal;
param->AuxiliaryProcs = AuxiliaryProcs;
param->PreparedXactProcs = PreparedXactProcs;
param->PMSignalState = PMSignalState;
- param->ProcSignal = ProcSignal;
param->PostmasterPid = PostmasterPid;
param->PgStartTime = PgStartTime;
@@ -995,12 +996,10 @@ restore_backend_variables(BackendParameters *param)
LWLockTrancheNames = param->LWLockTrancheNames;
LWLockCounter = param->LWLockCounter;
MainLWLockArray = param->MainLWLockArray;
- ProcStructLock = param->ProcStructLock;
ProcGlobal = param->ProcGlobal;
AuxiliaryProcs = param->AuxiliaryProcs;
PreparedXactProcs = param->PreparedXactProcs;
PMSignalState = param->PMSignalState;
- ProcSignal = param->ProcSignal;
PostmasterPid = param->PostmasterPid;
PgStartTime = param->PgStartTime;
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d6133bfebc6..f6d3369f917 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -968,6 +968,8 @@ PostmasterMain(int argc, char *argv[])
* shared memory, determine the value of any runtime-computed GUCs that
* depend on the amount of shared memory required.
*/
+ RegisterShmemStructs();
+
InitializeShmemGUCs();
/*
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 6a5b16392f7..55f46c7687e 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -108,7 +108,15 @@ static inline bool is_main_region_dsm_handle(dsm_handle handle);
static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
-static void *dsm_main_space_begin = NULL;
+static void dsm_main_space_init(void *);
+
+static ShmemStructDesc dsm_main_space_shmem_desc = {
+ .name = "Preallocated DSM",
+ .size = 0, /* dynamic */
+ .init_fn = dsm_main_space_init,
+};
+
+#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
/*
* List of dynamic shared memory segments used by this backend.
@@ -479,27 +487,29 @@ void
dsm_shmem_init(void)
{
size_t size = dsm_estimate_size();
- bool found;
if (size == 0)
return;
- dsm_main_space_begin = ShmemInitStruct("Preallocated DSM", size, &found);
- if (!found)
- {
- FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
- size_t first_page = 0;
- size_t pages;
-
- /* Reserve space for the FreePageManager. */
- while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
- ++first_page;
-
- /* Initialize it and give it all the rest of the space. */
- FreePageManagerInitialize(fpm, dsm_main_space_begin);
- pages = (size / FPM_PAGE_SIZE) - first_page;
- FreePageManagerPut(fpm, first_page, pages);
- }
+ ShmemRegisterStruct(&dsm_main_space_shmem_desc);
+}
+
+static void
+dsm_main_space_init(void *arg)
+{
+ size_t size = dsm_main_space_shmem_desc.size;
+ FreePageManager *fpm = (FreePageManager *) dsm_main_space_begin;
+ size_t first_page = 0;
+ size_t pages;
+
+ /* Reserve space for the FreePageManager. */
+ while (first_page * FPM_PAGE_SIZE < sizeof(FreePageManager))
+ ++first_page;
+
+ /* Initialize it and give it all the rest of the space. */
+ FreePageManagerInitialize(fpm, dsm_main_space_begin);
+ pages = (size / FPM_PAGE_SIZE) - first_page;
+ FreePageManagerPut(fpm, first_page, pages);
}
/*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 068c1577b12..882af83b7b2 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -54,7 +54,15 @@ typedef struct DSMRegistryCtxStruct
dshash_table_handle dshh;
} DSMRegistryCtxStruct;
-static DSMRegistryCtxStruct *DSMRegistryCtx;
+static void DSMRegistryCtxShmemInit(void *arg);
+
+static ShmemStructDesc DSMRegistryCtxShmemDesc = {
+ .name = "DSM Registry Data",
+ .size = sizeof(DSMRegistryCtxStruct),
+ .init_fn = DSMRegistryCtxShmemInit,
+};
+
+#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
typedef struct NamedDSMState
{
@@ -113,27 +121,17 @@ static const dshash_parameters dsh_params = {
static dsa_area *dsm_registry_dsa;
static dshash_table *dsm_registry_table;
-Size
-DSMRegistryShmemSize(void)
+void
+DSMRegistryShmemRegister(void)
{
- return MAXALIGN(sizeof(DSMRegistryCtxStruct));
+ ShmemRegisterStruct(&DSMRegistryCtxShmemDesc);
}
-void
-DSMRegistryShmemInit(void)
+static void
+DSMRegistryCtxShmemInit(void *arg)
{
- bool found;
-
- DSMRegistryCtx = (DSMRegistryCtxStruct *)
- ShmemInitStruct("DSM Registry Data",
- DSMRegistryShmemSize(),
- &found);
-
- if (!found)
- {
- DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
- DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
- }
+ DSMRegistryCtx->dsah = DSA_HANDLE_INVALID;
+ DSMRegistryCtx->dshh = DSHASH_HANDLE_INVALID;
}
/*
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 1f7e933d500..952988645d0 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -101,13 +101,14 @@ CalculateShmemSize(void)
size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
sizeof(ShmemIndexEnt)));
size = add_size(size, dsm_estimate_size());
- size = add_size(size, DSMRegistryShmemSize());
+
+ size = add_size(size, ShmemRegisteredSize());
+
+ /* legacy subsystems */
size = add_size(size, BufferManagerShmemSize());
size = add_size(size, LockManagerShmemSize());
size = add_size(size, PredicateLockShmemSize());
- size = add_size(size, ProcGlobalShmemSize());
size = add_size(size, XLogPrefetchShmemSize());
- size = add_size(size, VarsupShmemSize());
size = add_size(size, XLOGShmemSize());
size = add_size(size, XLogRecoveryShmemSize());
size = add_size(size, CLOGShmemSize());
@@ -117,11 +118,7 @@ CalculateShmemSize(void)
size = add_size(size, BackgroundWorkerShmemSize());
size = add_size(size, MultiXactShmemSize());
size = add_size(size, LWLockShmemSize());
- size = add_size(size, ProcArrayShmemSize());
size = add_size(size, BackendStatusShmemSize());
- size = add_size(size, SharedInvalShmemSize());
- size = add_size(size, PMSignalShmemSize());
- size = add_size(size, ProcSignalShmemSize());
size = add_size(size, CheckpointerShmemSize());
size = add_size(size, AutoVacuumShmemSize());
size = add_size(size, ReplicationSlotsShmemSize());
@@ -217,6 +214,10 @@ CreateSharedMemoryAndSemaphores(void)
*/
InitShmemAllocator(seghdr);
+ /* Reserve space for semaphores. */
+ if (!IsUnderPostmaster)
+ PGReserveSemaphores(ProcGlobalSemas());
+
/* Initialize subsystems */
CreateOrAttachShmemStructs();
@@ -230,6 +231,19 @@ CreateSharedMemoryAndSemaphores(void)
shmem_startup_hook();
}
+void
+RegisterShmemStructs(void)
+{
+ DSMRegistryShmemRegister();
+
+ ProcGlobalShmemRegister();
+ VarsupShmemRegister();
+ ProcArrayShmemRegister();
+ SharedInvalShmemRegister();
+ PMSignalShmemRegister();
+ ProcSignalShmemRegister();
+}
+
/*
* Initialize various subsystems, setting up their data structures in
* shared memory.
@@ -259,14 +273,23 @@ CreateOrAttachShmemStructs(void)
*/
InitShmemIndex();
+#ifdef EXEC_BACKEND
+ if (IsUnderPostmaster)
+ ShmemAttachRegistered();
+ else
+#endif
+ {
+ ShmemInitRegistered();
+ }
+
dsm_shmem_init();
- DSMRegistryShmemInit();
+ //DSMRegistryShmemInit();
/*
* Set up xlog, clog, and buffers
*/
- VarsupShmemInit();
XLOGShmemInit();
+
XLogPrefetchShmemInit();
XLogRecoveryShmemInit();
CLOGShmemInit();
@@ -288,23 +311,13 @@ CreateOrAttachShmemStructs(void)
/*
* Set up process table
*/
- if (!IsUnderPostmaster)
- InitProcGlobal();
- ProcArrayShmemInit();
BackendStatusShmemInit();
TwoPhaseShmemInit();
BackgroundWorkerShmemInit();
- /*
- * Set up shared-inval messaging
- */
- SharedInvalShmemInit();
-
/*
* Set up interprocess signaling mechanisms
*/
- PMSignalShmemInit();
- ProcSignalShmemInit();
CheckpointerShmemInit();
AutoVacuumShmemInit();
ReplicationSlotsShmemInit();
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 4618820b337..23752500d16 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -80,9 +80,24 @@ struct PMSignalData
sig_atomic_t PMChildFlags[FLEXIBLE_ARRAY_MEMBER];
};
-/* PMSignalState pointer is valid in both postmaster and child processes */
+static void PMSignalShmemInit(void *);
+
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* dynamic */
+ .init_fn = PMSignalShmemInit,
+};
+
+/*
+ * PMSignalState pointer is valid in both postmaster and child processes
+ *
+ * This is a stand-alone variable rather than just a #define over
+ * PMSignalShmemDesc.ptr because it is needed early at backend startup and
+ * passed as a backend parameter in EXEC_BACKEND mode
+ */
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
* postmaster. Postmaster keeps a local copy so that it doesn't need to
@@ -123,39 +138,28 @@ postmaster_death_handler(SIGNAL_ARGS)
static void MarkPostmasterChildInactive(int code, Datum arg);
/*
- * PMSignalShmemSize
- * Compute space needed for pmsignal.c's shared memory
+ * PMSignalShmemRegister - Register our shared memory
*/
-Size
-PMSignalShmemSize(void)
+void
+PMSignalShmemRegister(void)
{
Size size;
size = offsetof(PMSignalData, PMChildFlags);
size = add_size(size, mul_size(MaxLivePostmasterChildren(),
sizeof(sig_atomic_t)));
-
- return size;
+ PMSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&PMSignalShmemDesc);
}
-/*
- * PMSignalShmemInit - initialize during shared-memory creation
- */
-void
-PMSignalShmemInit(void)
+static void
+PMSignalShmemInit(void *arg)
{
- bool found;
-
- PMSignalState = (PMSignalData *)
- ShmemInitStruct("PMSignalState", PMSignalShmemSize(), &found);
-
- if (!found)
- {
- /* initialize all flags to zeroes */
- MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemSize());
- num_child_flags = MaxLivePostmasterChildren();
- PMSignalState->num_child_flags = num_child_flags;
- }
+ /* initialize all flags to zeroes */
+ PMSignalState = PMSignalShmemDesc.ptr;
+ MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
+ num_child_flags = MaxLivePostmasterChildren();
+ PMSignalState->num_child_flags = num_child_flags;
}
/*
@@ -291,6 +295,7 @@ RegisterPostmasterChildActive(void)
{
int slot = MyPMChildSlot;
+ Assert(PMSignalState);
Assert(slot > 0 && slot <= PMSignalState->num_child_flags);
slot--;
Assert(PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 301f54fb5a8..08c63bcb2a7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -101,6 +101,18 @@ typedef struct ProcArrayStruct
int pgprocnos[FLEXIBLE_ARRAY_MEMBER];
} ProcArrayStruct;
+static void ProcArrayShmemInit(void *arg);
+static void ProcArrayShmemAttach(void *arg);
+
+static ShmemStructDesc ProcArrayShmemDesc = {
+ .name = "Proc Array",
+ .size = 0, /* dynamic */
+ .init_fn = ProcArrayShmemInit,
+ .attach_fn = ProcArrayShmemAttach,
+};
+
+#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
+
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -267,9 +279,6 @@ typedef enum KAXCompressReason
KAX_STARTUP_PROCESS_IDLE, /* startup process is about to sleep */
} KAXCompressReason;
-
-static ProcArrayStruct *procArray;
-
static PGPROC *allProcs;
/*
@@ -280,8 +289,23 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
/*
* Bookkeeping for tracking emulated transactions in recovery
*/
-static TransactionId *KnownAssignedXids;
-static bool *KnownAssignedXidsValid;
+
+static ShmemStructDesc KnownAssignedXidsShmemDesc = {
+ .name = "KnownAssignedXids",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+};
+
+#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+
+static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
+ .name = "KnownAssignedXidsValid",
+ .size = 0, /* dynamic */
+ .init_fn = NULL,
+};
+
+#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
+
static TransactionId latestObservedXid = InvalidTransactionId;
/*
@@ -372,18 +396,19 @@ static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
static void GlobalVisUpdateApply(ComputeXidHorizonsResult *horizons);
/*
- * Report shared-memory space needed by ProcArrayShmemInit
+ * Register the shared PGPROC array during postmaster startup.
*/
-Size
-ProcArrayShmemSize(void)
+void
+ProcArrayShmemRegister(void)
{
- Size size;
-
- /* Size of the ProcArray structure itself */
#define PROCARRAY_MAXPROCS (MaxBackends + max_prepared_xacts)
- size = offsetof(ProcArrayStruct, pgprocnos);
- size = add_size(size, mul_size(sizeof(int), PROCARRAY_MAXPROCS));
+ /* Create or attach to the ProcArray shared structure */
+ ProcArrayShmemDesc.size =
+ add_size(offsetof(ProcArrayStruct, pgprocnos),
+ mul_size(sizeof(int),
+ PROCARRAY_MAXPROCS));
+ ShmemRegisterStruct(&ProcArrayShmemDesc);
/*
* During Hot Standby processing we have a data structure called
@@ -403,64 +428,38 @@ ProcArrayShmemSize(void)
if (EnableHotStandby)
{
- size = add_size(size,
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS));
- size = add_size(size,
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS));
+ KnownAssignedXidsShmemDesc.size =
+ mul_size(sizeof(TransactionId),
+ TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsShmemDesc);
+
+ KnownAssignedXidsValidShmemDesc.size =
+ mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS);
+ ShmemRegisterStruct(&KnownAssignedXidsValidShmemDesc);
}
-
- return size;
}
-/*
- * Initialize the shared PGPROC array during postmaster startup.
- */
-void
-ProcArrayShmemInit(void)
+static void
+ProcArrayShmemInit(void *arg)
{
- bool found;
-
- /* Create or attach to the ProcArray shared structure */
- procArray = (ProcArrayStruct *)
- ShmemInitStruct("Proc Array",
- add_size(offsetof(ProcArrayStruct, pgprocnos),
- mul_size(sizeof(int),
- PROCARRAY_MAXPROCS)),
- &found);
-
- if (!found)
- {
- /*
- * We're the first - initialize.
- */
- procArray->numProcs = 0;
- procArray->maxProcs = PROCARRAY_MAXPROCS;
- procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
- procArray->numKnownAssignedXids = 0;
- procArray->tailKnownAssignedXids = 0;
- procArray->headKnownAssignedXids = 0;
- procArray->lastOverflowedXid = InvalidTransactionId;
- procArray->replication_slot_xmin = InvalidTransactionId;
- procArray->replication_slot_catalog_xmin = InvalidTransactionId;
- TransamVariables->xactCompletionCount = 1;
- }
+ procArray->numProcs = 0;
+ procArray->maxProcs = PROCARRAY_MAXPROCS;
+ procArray->maxKnownAssignedXids = TOTAL_MAX_CACHED_SUBXIDS;
+ procArray->numKnownAssignedXids = 0;
+ procArray->tailKnownAssignedXids = 0;
+ procArray->headKnownAssignedXids = 0;
+ procArray->lastOverflowedXid = InvalidTransactionId;
+ procArray->replication_slot_xmin = InvalidTransactionId;
+ procArray->replication_slot_catalog_xmin = InvalidTransactionId;
+ TransamVariables->xactCompletionCount = 1;
allProcs = ProcGlobal->allProcs;
+}
- /* Create or attach to the KnownAssignedXids arrays too, if needed */
- if (EnableHotStandby)
- {
- KnownAssignedXids = (TransactionId *)
- ShmemInitStruct("KnownAssignedXids",
- mul_size(sizeof(TransactionId),
- TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- KnownAssignedXidsValid = (bool *)
- ShmemInitStruct("KnownAssignedXidsValid",
- mul_size(sizeof(bool), TOTAL_MAX_CACHED_SUBXIDS),
- &found);
- }
+static void
+ProcArrayShmemAttach(void *arg)
+{
+ allProcs = ProcGlobal->allProcs;
}
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 8e56922dcea..5743f088324 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -102,7 +102,16 @@ struct ProcSignalHeader
#define BARRIER_CLEAR_BIT(flags, type) \
((flags) &= ~(((uint32) 1) << (uint32) (type)))
-NON_EXEC_STATIC ProcSignalHeader *ProcSignal = NULL;
+static void ProcSignalShmemInit(void *arg);
+
+static ShmemStructDesc ProcSignalShmemDesc = {
+ .name = "ProcSignal",
+ .size = 0, /* dynamic */
+ .init_fn = ProcSignalShmemInit,
+};
+
+#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
+
static ProcSignalSlot *MyProcSignalSlot = NULL;
static bool CheckProcSignal(ProcSignalReason reason);
@@ -110,51 +119,37 @@ static void CleanupProcSignalState(int status, Datum arg);
static void ResetProcSignalBarrierBits(uint32 flags);
/*
- * ProcSignalShmemSize
- * Compute space needed for ProcSignal's shared memory
+ * ProcSignalShmemRegister
+ * Register ProcSignal's shared memory needs at postmaster startup
*/
-Size
-ProcSignalShmemSize(void)
+void
+ProcSignalShmemRegister(void)
{
Size size;
size = mul_size(NumProcSignalSlots, sizeof(ProcSignalSlot));
size = add_size(size, offsetof(ProcSignalHeader, psh_slot));
- return size;
+
+ ProcSignalShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcSignalShmemDesc);
}
-/*
- * ProcSignalShmemInit
- * Allocate and initialize ProcSignal's shared memory
- */
-void
-ProcSignalShmemInit(void)
+static void
+ProcSignalShmemInit(void *arg)
{
- Size size = ProcSignalShmemSize();
- bool found;
+ pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
- ProcSignal = (ProcSignalHeader *)
- ShmemInitStruct("ProcSignal", size, &found);
-
- /* If we're first, initialize. */
- if (!found)
+ for (int i = 0; i < NumProcSignalSlots; ++i)
{
- int i;
-
- pg_atomic_init_u64(&ProcSignal->psh_barrierGeneration, 0);
+ ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
- for (i = 0; i < NumProcSignalSlots; ++i)
- {
- ProcSignalSlot *slot = &ProcSignal->psh_slot[i];
-
- SpinLockInit(&slot->pss_mutex);
- pg_atomic_init_u32(&slot->pss_pid, 0);
- slot->pss_cancel_key_len = 0;
- MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
- pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
- pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
- ConditionVariableInit(&slot->pss_barrierCV);
- }
+ SpinLockInit(&slot->pss_mutex);
+ pg_atomic_init_u32(&slot->pss_pid, 0);
+ slot->pss_cancel_key_len = 0;
+ MemSet(slot->pss_signalFlags, 0, sizeof(slot->pss_signalFlags));
+ pg_atomic_init_u64(&slot->pss_barrierGeneration, PG_UINT64_MAX);
+ pg_atomic_init_u32(&slot->pss_barrierCheckMask, 0);
+ ConditionVariableInit(&slot->pss_barrierCV);
}
}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 9f362ce8641..faa0fcbd21e 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -19,6 +19,8 @@
* methods). The routines in this file are used for allocating and
* binding to shared memory data structures.
*
+ * FIXME: NOTES below are outdated
+ *
* NOTES:
* (a) There are three kinds of shared memory data structures
* available to POSTGRES: fixed-size structures, queues and hash
@@ -76,6 +78,16 @@
#include "storage/spin.h"
#include "utils/builtins.h"
+/* size constants for the shmem index table */
+ /* max size of data structure string name */
+#define SHMEM_INDEX_KEYSIZE (48)
+ /* estimated size of the shmem index table (not a hard limit) */
+#define SHMEM_INDEX_SIZE (64)
+
+/* these are in postmaster private memory */
+static ShmemStructDesc *registry[SHMEM_INDEX_SIZE];
+static int num_registrations = 0;
+
/*
* This is the first data structure stored in the shared memory segment, at
* the offset that PGShmemHeader->content_offset points to. Allocations by
@@ -95,6 +107,9 @@ typedef struct ShmemAllocatorData
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void shmem_hash_init(void *arg);
+static void shmem_hash_attach(void *arg);
+
/* shared memory global variables */
static PGShmemHeader *ShmemSegHdr; /* shared mem segment header */
@@ -103,13 +118,137 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
-static HTAB *ShmemIndex = NULL; /* primary index hashtable for shmem */
+
+
+static ShmemHashDesc ShmemIndexHashDesc = {
+ .name = "ShmemIndex",
+ .init_size = SHMEM_INDEX_SIZE,
+ .max_size = SHMEM_INDEX_SIZE,
+};
+
+ /* primary index hashtable for shmem */
+#define ShmemIndex (ShmemIndexHashDesc.ptr)
+
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
Datum pg_numa_available(PG_FUNCTION_ARGS);
+
+void
+ShmemRegisterStruct(ShmemStructDesc *desc)
+{
+ elog(DEBUG2, "REGISTER: %s with size %zu", desc->name, desc->size);
+
+ registry[num_registrations++] = desc;
+}
+
+size_t
+ShmemRegisteredSize(void)
+{
+ size_t size;
+
+ size = 0;
+ for (int i = 0; i < num_registrations; i++)
+ {
+ size = add_size(size, registry[i]->size);
+ size = add_size(size, registry[i]->extra_size);
+ }
+
+ elog(DEBUG2, "SIZE: total %zu", size);
+
+ return size;
+}
+
+void
+ShmemInitRegistered(void)
+{
+ /* Should be called only by the postmaster or a standalone backend. */
+ Assert(!IsUnderPostmaster);
+
+ for (int i = 0; i < num_registrations; i++)
+ {
+ size_t allocated_size;
+ void *structPtr;
+ bool found;
+ ShmemIndexEnt *result;
+
+ elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, registry[i]->name, HASH_ENTER_NULL, &found);
+ if (!result)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not create ShmemIndex entry for data structure \"%s\"",
+ registry[i]->name)));
+ }
+ if (found)
+ elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
+
+ /* allocate and initialize it */
+ structPtr = ShmemAllocRaw(registry[i]->size, &allocated_size);
+ if (structPtr == NULL)
+ {
+ /* out of memory; remove the failed ShmemIndex entry */
+ hash_search(ShmemIndex, registry[i]->name, HASH_REMOVE, NULL);
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("not enough shared memory for data structure"
+ " \"%s\" (%zu bytes requested)",
+ registry[i]->name, registry[i]->size)));
+ }
+ result->size = registry[i]->size;
+ result->allocated_size = allocated_size;
+ result->location = structPtr;
+
+ registry[i]->ptr = structPtr;
+ if (registry[i]->init_fn)
+ registry[i]->init_fn(registry[i]->init_fn_arg);
+ }
+}
+
+#ifdef EXEC_BACKEND
+void
+ShmemAttachRegistered(void)
+{
+ /* Must be initializing a (non-standalone) backend */
+ Assert(IsUnderPostmaster);
+ Assert(ShmemAllocator->index != NULL);
+
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+
+ for (int i = 0; i < num_registrations; i++)
+ {
+ bool found;
+ ShmemIndexEnt *result;
+
+ elog(DEBUG2, "ATTACH [%d/%d]: %s", i, num_registrations, registry[i]->name);
+
+ /* look it up in the shmem index */
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, registry[i]->name, HASH_FIND, &found);
+ if (!found)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("could not find ShmemIndex entry for data structure \"%s\"",
+ registry[i]->name)));
+ }
+
+ registry[i]->ptr = result->location;
+
+ if (registry[i]->attach_fn)
+ registry[i]->attach_fn(registry[i]->attach_fn_arg);
+ }
+
+ LWLockRelease(ShmemIndexLock);
+}
+#endif
+
/*
* InitShmemAllocator() --- set up basic pointers to shared memory.
*
@@ -292,6 +431,98 @@ InitShmemIndex(void)
HASH_ELEM | HASH_STRINGS);
}
+/*
+ * ShmemRegisterHash -- Register a shared memory hash table, to be
+ * created and initialized (or attached to) later.
+ *
+ * We assume caller is doing some kind of synchronization
+ * so that two processes don't try to create/initialize the same
+ * table at once. (In practice, all creations are done in the postmaster
+ * process; child processes should always be attaching to existing tables.)
+ *
+ * max_size is the estimated maximum number of hashtable entries. This is
+ * not a hard limit, but the access efficiency will degrade if it is
+ * exceeded substantially (since it's used to compute directory size and
+ * the hash table buckets will get overfull).
+ *
+ * init_size is the number of hashtable entries to preallocate. For a table
+ * whose maximum size is certain, this should be equal to max_size; that
+ * ensures that no run-time out-of-shared-memory failures can occur.
+ *
+ * *infoP and hash_flags must specify at least the entry sizes and key
+ * comparison semantics (see hash_create()). Flag bits and values specific
+ * to shared-memory hash tables are added here, except that callers may
+ * choose to specify HASH_PARTITION and/or HASH_FIXED_SIZE.
+ *
+ * Note: before Postgres 9.0, this function returned NULL for some failure
+ * cases. Now, it always throws error instead, so callers need not check
+ * for NULL.
+ */
+void
+ShmemRegisterHash(ShmemHashDesc *desc, /* configuration */
+ HASHCTL *infoP, /* info about key and bucket size */
+ int hash_flags) /* info about infoP */
+{
+ /*
+ * Hash tables allocated in shared memory have a fixed directory; it can't
+ * grow or other backends wouldn't be able to find it. So, make sure we
+ * make it big enough to start with.
+ *
+ * The shared memory allocator must be specified too.
+ */
+ infoP->dsize = infoP->max_dsize = hash_select_dirsize(desc->max_size);
+ infoP->alloc = ShmemAllocNoError;
+ hash_flags |= HASH_SHARED_MEM | HASH_ALLOC | HASH_DIRSIZE;
+
+ /* set up the descriptor for the underlying shared memory area */
+ memset(&desc->base_desc, 0, sizeof(desc->base_desc));
+ desc->base_desc.name = desc->name;
+ desc->base_desc.size = hash_get_shared_size(infoP, hash_flags);
+ desc->base_desc.init_fn = shmem_hash_init;
+ desc->base_desc.init_fn_arg = desc;
+ desc->base_desc.attach_fn = shmem_hash_attach;
+ desc->base_desc.attach_fn_arg = desc;
+
+ desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
+
+ desc->hash_flags = hash_flags;
+ desc->infoP = MemoryContextAlloc(TopMemoryContext, sizeof(HASHCTL));
+ memcpy(desc->infoP, infoP, sizeof(HASHCTL));
+
+ ShmemRegisterStruct(&desc->base_desc);
+}
+
+static void
+shmem_hash_init(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
+
+ /* Pass location of hashtable header to hash_create */
+ desc->ptr = desc->base_desc.ptr;
+ desc->infoP->hctl = (HASHHDR *) desc->ptr;
+
+ desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
+static void
+shmem_hash_attach(void *arg)
+{
+ ShmemHashDesc *desc = (ShmemHashDesc *) arg;
+ int hash_flags = desc->hash_flags;
+
+ /*
+ * if it already exists, attach to it rather than allocate and initialize
+ * new space
+ */
+ hash_flags |= HASH_ATTACH;
+
+ /* Pass location of hashtable header to hash_create */
+ desc->infoP->hctl = (HASHHDR *) desc->ptr;
+
+ desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+}
+
/*
* ShmemInitHash -- Create and initialize, or attach to, a
* shared memory hash table.
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index a7a7cc4f0a9..0fe0f256971 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -203,7 +203,16 @@ typedef struct SISeg
*/
#define NumProcStateSlots (MaxBackends + NUM_AUXILIARY_PROCS)
-static SISeg *shmInvalBuffer; /* pointer to the shared inval buffer */
+static void SharedInvalShmemInit(void *arg);
+
+static ShmemStructDesc SharedInvalShmemDesc = {
+ .name = "shmInvalBuffer",
+ .size = 0, /* dynamic */
+ .init_fn = SharedInvalShmemInit,
+};
+
+/* pointer to the shared inval buffer */
+#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
static LocalTransactionId nextLocalTransactionId;
@@ -212,10 +221,11 @@ static void CleanupInvalidationState(int status, Datum arg);
/*
- * SharedInvalShmemSize --- return shared-memory space needed
+ * SharedInvalShmemRegister
+ * Register shared memory needs for the SI message buffer
*/
-Size
-SharedInvalShmemSize(void)
+void
+SharedInvalShmemRegister(void)
{
Size size;
@@ -223,26 +233,17 @@ SharedInvalShmemSize(void)
size = add_size(size, mul_size(sizeof(ProcState), NumProcStateSlots)); /* procState */
size = add_size(size, mul_size(sizeof(int), NumProcStateSlots)); /* pgprocnos */
- return size;
+ /* Register the space needed in shared memory */
+ SharedInvalShmemDesc.size = size;
+ ShmemRegisterStruct(&SharedInvalShmemDesc);
}
-/*
- * SharedInvalShmemInit
- * Create and initialize the SI message buffer
- */
-void
-SharedInvalShmemInit(void)
+static void
+SharedInvalShmemInit(void *arg)
{
int i;
- bool found;
-
- /* Allocate space in shared memory */
- shmInvalBuffer = (SISeg *)
- ShmemInitStruct("shmInvalBuffer", SharedInvalShmemSize(), &found);
- if (found)
- return;
- /* Clear message counters, save size of procState array, init spinlock */
+ /* Clear message counters, save size of procState array FIXME, init spinlock */
shmInvalBuffer->minMsgNum = 0;
shmInvalBuffer->maxMsgNum = 0;
shmInvalBuffer->nextThreshold = CLEANUP_MIN;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index c7a001b3b79..85375b5195e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -73,13 +73,33 @@ PGPROC *MyProc = NULL;
* relatively infrequently (only at backend startup or shutdown) and not for
* very long, so a spinlock is okay.
*/
-NON_EXEC_STATIC slock_t *ProcStructLock = NULL;
+#define ProcStructLock (&ProcGlobal->freeProcsLock)
+
+static void ProcGlobalShmemInit(void *arg);
+
+static ShmemStructDesc ProcGlobalShmemDesc = {
+ .name = "Proc Header",
+ .size = sizeof(PROC_HDR),
+ .init_fn = ProcGlobalShmemInit,
+};
+
+static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
+ .name = "PGPROC structures",
+ .size = 0, /* dynamic */
+};
+
+static ShmemStructDesc FastPathLockArrayShmemDesc = {
+ .name = "Fast-Path Lock Array",
+ .size = 0, /* dynamic */
+};
/* Pointers to shared-memory structures */
PROC_HDR *ProcGlobal = NULL;
NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
PGPROC *PreparedXactProcs = NULL;
+static uint32 TotalProcs;
+
static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
/* Is a deadlock check pending? */
@@ -91,24 +111,6 @@ static void AuxiliaryProcKill(int code, Datum arg);
static void CheckDeadLock(void);
-/*
- * Report shared-memory space needed by PGPROC.
- */
-static Size
-PGProcShmemSize(void)
-{
- Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
-
- size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
- size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
-
- return size;
-}
-
/*
* Report shared-memory space needed by Fast-Path locks.
*/
@@ -116,8 +118,6 @@ static Size
FastPathLockShmemSize(void)
{
Size size = 0;
- Size TotalProcs =
- add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
Size fpLockBitsSize,
fpRelIdSize;
@@ -133,25 +133,6 @@ FastPathLockShmemSize(void)
return size;
}
-/*
- * Report shared-memory space needed by InitProcGlobal.
- */
-Size
-ProcGlobalShmemSize(void)
-{
- Size size = 0;
-
- /* ProcGlobal */
- size = add_size(size, sizeof(PROC_HDR));
- size = add_size(size, sizeof(slock_t));
-
- size = add_size(size, PGSemaphoreShmemSize(ProcGlobalSemas()));
- size = add_size(size, PGProcShmemSize());
- size = add_size(size, FastPathLockShmemSize());
-
- return size;
-}
-
/*
* Report number of semaphores needed by InitProcGlobal.
*/
@@ -186,35 +167,63 @@ ProcGlobalSemas(void)
* implementation typically requires us to create semaphores in the
* postmaster, not in backends.
*
- * Note: this is NOT called by individual backends under a postmaster,
+ * Note: this is NOT called by individual backends under a postmaster, XXX
* not even in the EXEC_BACKEND case. The ProcGlobal and AuxiliaryProcs
* pointers must be propagated specially for EXEC_BACKEND operation.
*/
void
-InitProcGlobal(void)
+ProcGlobalShmemRegister(void)
+{
+ Size size = 0;
+
+ /*
+ * Reserve all the PGPROC structures we'll need. There are
+ * six separate consumers: (1) normal backends, (2) autovacuum workers and
+ * special workers, (3) background workers, (4) walsenders, (5) auxiliary
+ * processes, and (6) prepared transactions. (For largely-historical
+ * reasons, we combine autovacuum and special workers into one category
+ * with a single freelist.) Each PGPROC structure is dedicated to exactly
+ * one of these purposes, and they do not move between groups.
+ */
+ TotalProcs =
+ add_size(MaxBackends, add_size(NUM_AUXILIARY_PROCS, max_prepared_xacts));
+
+ size = add_size(size, mul_size(TotalProcs, sizeof(PGPROC)));
+
+ /* FIXME: the sizeofs look dangerous because ProcGlobal is not initialized yet */
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->xids)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->subxidStates)));
+ size = add_size(size, mul_size(TotalProcs, sizeof(*ProcGlobal->statusFlags)));
+
+ ProcGlobalAllProcsShmemDesc.size = size;
+ ShmemRegisterStruct(&ProcGlobalAllProcsShmemDesc);
+
+ FastPathLockArrayShmemDesc.size = FastPathLockShmemSize();
+ ShmemRegisterStruct(&FastPathLockArrayShmemDesc);
+
+ /*
+ * Register the ProcGlobal shared structure last. Its init callback
+ * initializes the others too.
+ */
+ ShmemRegisterStruct(&ProcGlobalShmemDesc);
+}
+
+static void
+ProcGlobalShmemInit(void *arg)
{
+ char *ptr;
+ size_t requestSize;
PGPROC *procs;
int i,
j;
- bool found;
- uint32 TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + max_prepared_xacts;
-
/* Used for setup of per-backend fast-path slots. */
char *fpPtr,
*fpEndPtr PG_USED_FOR_ASSERTS_ONLY;
Size fpLockBitsSize,
fpRelIdSize;
- Size requestSize;
- char *ptr;
- /* Create the ProcGlobal shared structure */
- ProcGlobal = (PROC_HDR *)
- ShmemInitStruct("Proc Header", sizeof(PROC_HDR), &found);
- Assert(!found);
+ ProcGlobal = ProcGlobalShmemDesc.ptr;
- /*
- * Initialize the data structures.
- */
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
dlist_init(&ProcGlobal->freeProcs);
dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -225,23 +234,11 @@ InitProcGlobal(void)
ProcGlobal->checkpointerProc = INVALID_PROC_NUMBER;
pg_atomic_init_u32(&ProcGlobal->procArrayGroupFirst, INVALID_PROC_NUMBER);
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
+ SpinLockInit(ProcStructLock);
- /*
- * Create and initialize all the PGPROC structures we'll need. There are
- * six separate consumers: (1) normal backends, (2) autovacuum workers and
- * special workers, (3) background workers, (4) walsenders, (5) auxiliary
- * processes, and (6) prepared transactions. (For largely-historical
- * reasons, we combine autovacuum and special workers into one category
- * with a single freelist.) Each PGPROC structure is dedicated to exactly
- * one of these purposes, and they do not move between groups.
- */
- requestSize = PGProcShmemSize();
-
- ptr = ShmemInitStruct("PGPROC structures",
- requestSize,
- &found);
-
- MemSet(ptr, 0, requestSize);
+ ptr = ProcGlobalAllProcsShmemDesc.ptr;
+ requestSize = ProcGlobalAllProcsShmemDesc.size;
+ memset(ptr, 0, requestSize);
procs = (PGPROC *) ptr;
ptr = ptr + TotalProcs * sizeof(PGPROC);
@@ -277,20 +274,13 @@ InitProcGlobal(void)
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- requestSize = FastPathLockShmemSize();
-
- fpPtr = ShmemInitStruct("Fast-Path Lock Array",
- requestSize,
- &found);
-
- MemSet(fpPtr, 0, requestSize);
+ fpPtr = FastPathLockArrayShmemDesc.ptr;
+ requestSize = FastPathLockArrayShmemDesc.size;
+ memset(fpPtr, 0, requestSize);
/* For asserts checking we did not overflow. */
fpEndPtr = fpPtr + requestSize;
- /* Reserve space for semaphores. */
- PGReserveSemaphores(ProcGlobalSemas());
-
for (i = 0; i < TotalProcs; i++)
{
PGPROC *proc = &procs[i];
@@ -380,12 +370,6 @@ InitProcGlobal(void)
*/
AuxiliaryProcs = &procs[MaxBackends];
PreparedXactProcs = &procs[MaxBackends + NUM_AUXILIARY_PROCS];
-
- /* Create ProcStructLock spinlock, too */
- ProcStructLock = (slock_t *) ShmemInitStruct("ProcStructLock spinlock",
- sizeof(slock_t),
- &found);
- SpinLockInit(ProcStructLock);
}
/*
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 02e9aaa6bca..eed188416ee 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4117,6 +4117,8 @@ PostgresSingleUserMain(int argc, char *argv[],
* shared memory, determine the value of any runtime-computed GUCs that
* depend on the amount of shared memory required.
*/
+ RegisterShmemStructs();
+
InitializeShmemGUCs();
/*
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 6fa91bfcdc0..49d476e9d5c 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -15,7 +15,9 @@
#define TRANSAM_H
#include "access/xlogdefs.h"
-
+#ifndef FRONTEND
+#include "storage/shmem.h"
+#endif
/* ----------------
* Special transaction ID values
@@ -330,7 +332,10 @@ TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2)
extern bool TransactionStartedDuringRecovery(void);
/* in transam/varsup.c */
-extern PGDLLIMPORT TransamVariablesData *TransamVariables;
+#ifndef FRONTEND
+extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
+#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+#endif
/*
* prototypes for functions in transam/transam.c
@@ -345,8 +350,7 @@ extern TransactionId TransactionIdLatest(TransactionId mainxid,
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
/* in transam/varsup.c */
-extern Size VarsupShmemSize(void);
-extern void VarsupShmemInit(void);
+extern void VarsupShmemRegister(void);
extern FullTransactionId GetNewTransactionId(bool isSubXact);
extern void AdvanceNextFullTransactionIdPastXid(TransactionId xid);
extern FullTransactionId ReadNextFullTransactionId(void);
diff --git a/src/include/storage/dsm_registry.h b/src/include/storage/dsm_registry.h
index 506fae2c9ca..9a1b4d982af 100644
--- a/src/include/storage/dsm_registry.h
+++ b/src/include/storage/dsm_registry.h
@@ -22,7 +22,6 @@ extern dsa_area *GetNamedDSA(const char *name, bool *found);
extern dshash_table *GetNamedDSHash(const char *name,
const dshash_parameters *params,
bool *found);
-extern Size DSMRegistryShmemSize(void);
-extern void DSMRegistryShmemInit(void);
+extern void DSMRegistryShmemRegister(void);
#endif /* DSM_REGISTRY_H */
diff --git a/src/include/storage/ipc.h b/src/include/storage/ipc.h
index da32787ab51..8a3b71ad5d3 100644
--- a/src/include/storage/ipc.h
+++ b/src/include/storage/ipc.h
@@ -77,6 +77,7 @@ extern void check_on_shmem_exit_lists_are_empty(void);
/* ipci.c */
extern PGDLLIMPORT shmem_startup_hook_type shmem_startup_hook;
+extern void RegisterShmemStructs(void);
extern Size CalculateShmemSize(void);
extern void CreateSharedMemoryAndSemaphores(void);
#ifdef EXEC_BACKEND
diff --git a/src/include/storage/pmsignal.h b/src/include/storage/pmsignal.h
index 206fb78f8a5..7cdc4852334 100644
--- a/src/include/storage/pmsignal.h
+++ b/src/include/storage/pmsignal.h
@@ -66,8 +66,7 @@ extern PGDLLIMPORT volatile PMSignalData *PMSignalState;
/*
* prototypes for functions in pmsignal.c
*/
-extern Size PMSignalShmemSize(void);
-extern void PMSignalShmemInit(void);
+extern void PMSignalShmemRegister(void);
extern void SendPostmasterSignal(PMSignalReason reason);
extern bool CheckPostmasterSignal(PMSignalReason reason);
extern void SetQuitSignalReason(QuitSignalReason reason);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 679f0624f92..37023e1a93f 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -418,6 +418,9 @@ typedef struct PROC_HDR
dlist_head bgworkerFreeProcs;
/* Head of list of walsender free PGPROC structures */
dlist_head walsenderFreeProcs;
+
+ slock_t freeProcsLock;
+
/* First pgproc waiting for group XID clear */
pg_atomic_uint32 procArrayGroupFirst;
/* First pgproc waiting for group transaction status update */
@@ -488,7 +491,7 @@ extern PGDLLIMPORT PGPROC *AuxiliaryProcs;
* Function Prototypes
*/
extern int ProcGlobalSemas(void);
-extern Size ProcGlobalShmemSize(void);
+extern void ProcGlobalShmemRegister(void);
extern void InitProcGlobal(void);
extern void InitProcess(void);
extern void InitProcessPhase2(void);
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 3a8593f87ba..41753c3a630 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -20,8 +20,7 @@
#include "utils/snapshot.h"
-extern Size ProcArrayShmemSize(void);
-extern void ProcArrayShmemInit(void);
+extern void ProcArrayShmemRegister(void);
extern void ProcArrayAdd(PGPROC *proc);
extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index e52b8eb7697..f2df1f30c5f 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -71,8 +71,7 @@ typedef enum
/*
* prototypes for functions in procsignal.c
*/
-extern Size ProcSignalShmemSize(void);
-extern void ProcSignalShmemInit(void);
+extern void ProcSignalShmemRegister(void);
extern void ProcSignalInit(const uint8 *cancel_key, int cancel_key_len);
extern int SendProcSignal(pid_t pid, ProcSignalReason reason,
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 89d45287c17..40e2fc17056 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -24,6 +24,53 @@
#include "storage/spin.h"
#include "utils/hsearch.h"
+typedef void (*ShmemInitCallback) (void *arg);
+typedef void (*ShmemAttachCallback) (void *arg);
+
+/*
+ * Descriptor for a named area or struct in shared memory
+ */
+typedef struct ShmemStructDesc
+{
+ /* Name of the shared memory area. Must be unique across the system */
+ const char *name;
+
+ size_t size;
+
+ size_t alignment;
+ ShmemInitCallback init_fn;
+ ShmemAttachCallback attach_fn;
+ void *init_fn_arg;
+ void *attach_fn_arg;
+
+ /*
+ * Extra space to allocate in the shared memory segment, but it's not
+ * part of the struct itself. This is used for shared memory hash tables
+ * that can grow beyond the initial size when more buckets are allocated.
+ */
+ size_t extra_size;
+
+ /* Pointer to the shared memory area, when it's allocated. */
+ void *ptr;
+} ShmemStructDesc;
+
+/*
+ * Descriptor for shared memory hash table
+ */
+typedef struct ShmemHashDesc
+{
+ const char *name;
+
+ int hash_flags;
+
+ size_t init_size; /* initial number of entries */
+ size_t max_size; /* max number of entries */
+ HASHCTL *infoP;
+
+ HTAB *ptr;
+
+ ShmemStructDesc base_desc;
+} ShmemHashDesc;
/* shmem.c */
extern PGDLLIMPORT slock_t *ShmemLock;
@@ -34,9 +81,19 @@ extern void *ShmemAlloc(Size size);
extern void *ShmemAllocNoError(Size size);
extern bool ShmemAddrIsValid(const void *addr);
extern void InitShmemIndex(void);
+
+extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
+extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+
+/* Legacy functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
HASHCTL *infoP, int hash_flags);
extern void *ShmemInitStruct(const char *name, Size size, bool *foundPtr);
+
+extern size_t ShmemRegisteredSize(void);
+extern void ShmemInitRegistered(void);
+extern void ShmemAttachRegistered(void);
+
extern Size add_size(Size s1, Size s2);
extern Size mul_size(Size s1, Size s2);
diff --git a/src/include/storage/sinvaladt.h b/src/include/storage/sinvaladt.h
index a1694500a85..4edba2936e6 100644
--- a/src/include/storage/sinvaladt.h
+++ b/src/include/storage/sinvaladt.h
@@ -28,8 +28,7 @@
/*
* prototypes for functions in sinvaladt.c
*/
-extern Size SharedInvalShmemSize(void);
-extern void SharedInvalShmemInit(void);
+extern void SharedInvalShmemRegister(void);
extern void SharedInvalBackendInit(bool sendOnly);
extern void SIInsertDataEntries(const SharedInvalidationMessage *data, int n);
base-commit: c67bef3f3252a3a38bf347f9f119944176a796ce
--
2.34.1
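
For readers skimming the patch above, the register / size / init flow can be sketched outside the backend roughly as follows. This is only an illustration under stated assumptions, not the patch's code: every `demo_` / `Demo` name is hypothetical, `malloc` stands in for the shared memory segment, areas are rounded up to an assumed 16 bytes, and there is no shmem index, locking, attach phase, or hash-table `extra_size`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the patch's ShmemStructDesc. */
typedef struct DemoDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	   *init_fn_arg;
	void	   *ptr;			/* set once the area has been carved out */
} DemoDesc;

#define DEMO_MAX_REGISTRATIONS 64
/* Assumed alignment: round every area up to a multiple of 16 bytes. */
#define DEMO_ALIGN(sz) (((sz) + 15) & ~((size_t) 15))

static DemoDesc *demo_registry[DEMO_MAX_REGISTRATIONS];
static int	demo_num_registrations = 0;

/* Phase 1: remember the descriptor; nothing is allocated yet. */
static int
demo_register(DemoDesc *desc)
{
	if (demo_num_registrations >= DEMO_MAX_REGISTRATIONS)
		return -1;				/* registry is full */
	demo_registry[demo_num_registrations++] = desc;
	return 0;
}

/* Phase 2: total space needed by everything registered so far. */
static size_t
demo_registered_size(void)
{
	size_t		total = 0;

	for (int i = 0; i < demo_num_registrations; i++)
		total += DEMO_ALIGN(demo_registry[i]->size);
	return total;
}

/*
 * Phase 3: allocate one block, hand each descriptor its slice, then run the
 * init callbacks.  Returns the block (caller frees it), or NULL on failure.
 */
static void *
demo_init_registered(void)
{
	size_t		total = demo_registered_size();
	char	   *base = malloc(total > 0 ? total : 1);
	char	   *next = base;

	if (base == NULL)
		return NULL;

	for (int i = 0; i < demo_num_registrations; i++)
	{
		DemoDesc   *desc = demo_registry[i];

		desc->ptr = next;
		next += DEMO_ALIGN(desc->size);
		if (desc->init_fn)
			desc->init_fn(desc->init_fn_arg);
	}
	assert((size_t) (next - base) == total);
	return base;
}

/* A hypothetical subsystem using the pattern. */
typedef struct DemoCounter
{
	int			value;
} DemoCounter;

static void demo_counter_init(void *arg);

static DemoDesc demo_counter_desc = {
	.name = "demo counter",
	.size = sizeof(DemoCounter),
	.init_fn = demo_counter_init,
};

static void
demo_counter_init(void *arg)
{
	(void) arg;					/* unused in this sketch */
	((DemoCounter *) demo_counter_desc.ptr)->value = 42;
}
```

A caller would invoke `demo_register(&demo_counter_desc)` first, then `demo_init_registered()`, after which `demo_counter_desc.ptr` points at the initialized counter. Unlike `ShmemRegisterStruct()` as posted, the sketch refuses registrations once its fixed-size registry is full; that check is an addition for illustration, not something the patch does.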
[text/x-patch] 0002-Get-rid-of-global-shared-memory-pointer-mac-20260223.patch (15.0K, 3-0002-Get-rid-of-global-shared-memory-pointer-mac-20260223.patch)
From 395a95e9934286869b1fe8d45dc5a155ea9be030 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 10 Feb 2026 20:26:31 +0530
Subject: [PATCH 2/3] Get rid of global shared memory pointer macro
declarations
---
.../pg_stat_statements/pg_stat_statements.c | 10 ++++---
src/backend/access/transam/varsup.c | 5 ++++
src/backend/storage/ipc/dsm.c | 5 ++--
src/backend/storage/ipc/dsm_registry.c | 4 ++-
src/backend/storage/ipc/pmsignal.c | 18 +++++-------
src/backend/storage/ipc/procarray.c | 13 +++++----
src/backend/storage/ipc/procsignal.c | 4 ++-
src/backend/storage/ipc/shmem.c | 29 ++++++++++++-------
src/backend/storage/ipc/sinvaladt.c | 7 +++--
src/backend/storage/lmgr/proc.c | 25 ++++++++++------
src/include/access/transam.h | 2 +-
src/include/storage/shmem.h | 6 ++--
12 files changed, 77 insertions(+), 51 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 71debc8b47f..73fdf561419 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -258,24 +258,26 @@ typedef struct pgssSharedState
pgssGlobalStats stats; /* global statistics for pgss */
} pgssSharedState;
+/* Links to shared memory state */
+static pgssSharedState *pgss = NULL;
+static HTAB *pgss_hash = NULL;
+
static void pgss_shmem_init(void *arg);
static ShmemStructDesc pgssSharedStateShmemDesc = {
.name = "pg_stat_statements",
.size = sizeof(pgssSharedState),
.init_fn = pgss_shmem_init,
+ .ptr = (void **) &pgss,
};
static ShmemHashDesc pgssSharedHashDesc = {
.name = "pg_stat_statements hash",
.init_size = 0, /* set from 'pgss_max' */
.max_size = 0, /* set from 'pgss_max' */
+ .ptr = &pgss_hash,
};
-/* Links to shared memory state */
-#define pgss ((pgssSharedState *) pgssSharedStateShmemDesc.ptr)
-#define pgss_hash (pgssSharedHashDesc.ptr)
-
/*---- Local variables ----*/
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 11ad90e7372..3dfda875e80 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -32,10 +32,14 @@
static void VarsupShmemInit(void *arg);
+/* pointer to variables struct in shared memory */
+TransamVariablesData *TransamVariables = NULL;
+
ShmemStructDesc TransamVariablesShmemDesc = {
.name = "TransamVariables",
.size = sizeof(TransamVariablesData),
.init_fn = VarsupShmemInit,
+ .ptr = (void **) &TransamVariables,
};
/*
@@ -49,6 +53,7 @@ VarsupShmemRegister(void)
static void
VarsupShmemInit(void *arg)
+
{
memset(TransamVariables, 0, sizeof(TransamVariablesData));
}
diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 55f46c7687e..73644ec3bbb 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -110,14 +110,15 @@ static bool dsm_init_done = false;
/* Preallocated DSM space in the main shared memory region. */
static void dsm_main_space_init(void *);
+static void *dsm_main_space_begin = NULL;
+
static ShmemStructDesc dsm_main_space_shmem_desc = {
.name = "Preallocated DSM",
.size = 0, /* dynamic */
.init_fn = dsm_main_space_init,
+ .ptr = &dsm_main_space_begin,
};
-#define dsm_main_space_begin (dsm_main_space_shmem_desc.ptr)
-
/*
* List of dynamic shared memory segments used by this backend.
*
diff --git a/src/backend/storage/ipc/dsm_registry.c b/src/backend/storage/ipc/dsm_registry.c
index 882af83b7b2..1659e1dd71d 100644
--- a/src/backend/storage/ipc/dsm_registry.c
+++ b/src/backend/storage/ipc/dsm_registry.c
@@ -56,13 +56,15 @@ typedef struct DSMRegistryCtxStruct
static void DSMRegistryCtxShmemInit(void *arg);
+DSMRegistryCtxStruct *DSMRegistryCtx = NULL;
+
static ShmemStructDesc DSMRegistryCtxShmemDesc = {
.name = "DSM Registry Data",
.size = sizeof(DSMRegistryCtxStruct),
.init_fn = DSMRegistryCtxShmemInit,
+ .ptr = (void **) &DSMRegistryCtx,
};
-#define DSMRegistryCtx ((DSMRegistryCtxStruct *) DSMRegistryCtxShmemDesc.ptr)
typedef struct NamedDSMState
{
diff --git a/src/backend/storage/ipc/pmsignal.c b/src/backend/storage/ipc/pmsignal.c
index 23752500d16..3aa0380eadd 100644
--- a/src/backend/storage/ipc/pmsignal.c
+++ b/src/backend/storage/ipc/pmsignal.c
@@ -82,21 +82,17 @@ struct PMSignalData
static void PMSignalShmemInit(void *);
-static ShmemStructDesc PMSignalShmemDesc = {
- .name = "PMSignalState",
- .size = 0, /* dynamic */
- .init_fn = PMSignalShmemInit,
-};
-
/*
* PMSignalState pointer is valid in both postmaster and child processes
- *
- * This is a stand-alone variable rather than just a #define over
- * PMSignalShmemDesc.ptr because it is needed early at backend startup and
- * passed as a backend parameter in EXEC_BACKEND mode
*/
NON_EXEC_STATIC volatile PMSignalData *PMSignalState = NULL;
+static ShmemStructDesc PMSignalShmemDesc = {
+ .name = "PMSignalState",
+ .size = 0, /* dynamic */
+ .init_fn = PMSignalShmemInit,
+ .ptr = (void **) &PMSignalState,
+};
/*
* Local copy of PMSignalState->num_child_flags, only valid in the
@@ -156,7 +152,7 @@ static void
PMSignalShmemInit(void *arg)
{
/* initialize all flags to zeroes */
- PMSignalState = PMSignalShmemDesc.ptr;
+ Assert(PMSignalState);
MemSet(unvolatize(PMSignalData *, PMSignalState), 0, PMSignalShmemDesc.size);
num_child_flags = MaxLivePostmasterChildren();
PMSignalState->num_child_flags = num_child_flags;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 08c63bcb2a7..736504d3a3e 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -104,15 +104,16 @@ typedef struct ProcArrayStruct
static void ProcArrayShmemInit(void *arg);
static void ProcArrayShmemAttach(void *arg);
+ProcArrayStruct *procArray = NULL;
+
static ShmemStructDesc ProcArrayShmemDesc = {
.name = "Proc Array",
.size = 0, /* dynamic */
.init_fn = ProcArrayShmemInit,
.attach_fn = ProcArrayShmemAttach,
+ .ptr = (void **) &procArray,
};
-#define procArray ((ProcArrayStruct *) ProcArrayShmemDesc.ptr)
-
/*
* State for the GlobalVisTest* family of functions. Those functions can
* e.g. be used to decide if a deleted row can be removed without violating
@@ -290,22 +291,24 @@ static TransactionId cachedXidIsNotInProgress = InvalidTransactionId;
* Bookkeeping for tracking emulated transactions in recovery
*/
+TransactionId *KnownAssignedXids = NULL;
+
static ShmemStructDesc KnownAssignedXidsShmemDesc = {
.name = "KnownAssignedXids",
.size = 0, /* dynamic */
.init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXids,
};
-#define KnownAssignedXids ((TransactionId *) KnownAssignedXidsShmemDesc.ptr)
+bool *KnownAssignedXidsValid = NULL;
static ShmemStructDesc KnownAssignedXidsValidShmemDesc = {
.name = "KnownAssignedXidsValid",
.size = 0, /* dynamic */
.init_fn = NULL,
+ .ptr = (void **) &KnownAssignedXidsValid,
};
-#define KnownAssignedXidsValid ((bool *) KnownAssignedXidsValidShmemDesc.ptr)
-
static TransactionId latestObservedXid = InvalidTransactionId;
/*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 5743f088324..eec04eae3f4 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -104,13 +104,15 @@ struct ProcSignalHeader
static void ProcSignalShmemInit(void *arg);
+ProcSignalHeader *ProcSignal = NULL;
+
static ShmemStructDesc ProcSignalShmemDesc = {
.name = "ProcSignal",
.size = 0, /* dynamic */
.init_fn = ProcSignalShmemInit,
+ .ptr = (void **) &ProcSignal,
};
-#define ProcSignal ((ProcSignalHeader *) ProcSignalShmemDesc.ptr)
static ProcSignalSlot *MyProcSignalSlot = NULL;
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index faa0fcbd21e..e73ac489b2b 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -118,17 +118,18 @@ static void *ShmemEnd; /* end+1 address of shared memory */
static ShmemAllocatorData *ShmemAllocator;
slock_t *ShmemLock; /* points to ShmemAllocator->shmem_lock */
+ /* primary index hashtable for shmem */
+HTAB *ShmemIndex = NULL;
+
static ShmemHashDesc ShmemIndexHashDesc = {
.name = "ShmemIndex",
.init_size = SHMEM_INDEX_SIZE,
.max_size = SHMEM_INDEX_SIZE,
+ .ptr = &ShmemIndex
};
- /* primary index hashtable for shmem */
-#define ShmemIndex (ShmemIndexHashDesc.ptr)
-
/* To get reliable results for NUMA inquiry we need to "touch pages" once */
static bool firstNumaTouch = true;
@@ -205,7 +206,7 @@ ShmemInitRegistered(void)
result->allocated_size = allocated_size;
result->location = structPtr;
- registry[i]->ptr = structPtr;
+ *(registry[i]->ptr) = structPtr;
if (registry[i]->init_fn)
registry[i]->init_fn(registry[i]->init_fn_arg);
}
@@ -239,7 +240,7 @@ ShmemAttachRegistered(void)
registry[i]->name)));
}
- registry[i]->ptr = result->location;
+ *registry[i]->ptr = result->location;
if (registry[i]->attach_fn)
registry[i]->attach_fn(registry[i]->attach_fn_arg);
@@ -425,10 +426,11 @@ InitShmemIndex(void)
info.keysize = SHMEM_INDEX_KEYSIZE;
info.entrysize = sizeof(ShmemIndexEnt);
- ShmemIndex = ShmemInitHash("ShmemIndex",
+ *ShmemIndexHashDesc.ptr = ShmemInitHash("ShmemIndex",
SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
&info,
HASH_ELEM | HASH_STRINGS);
+ Assert(ShmemIndex != NULL && *ShmemIndexHashDesc.ptr == ShmemIndex);
}
/*
@@ -482,6 +484,12 @@ ShmemRegisterHash(ShmemHashDesc *desc, /* configuration */
desc->base_desc.init_fn_arg = desc;
desc->base_desc.attach_fn = shmem_hash_attach;
desc->base_desc.attach_fn_arg = desc;
+ /*
+ * We need a stable pointer to hold the pointer to the shared memory. Use
+ * the one passed in the descriptor now. It will be replaced with the hash
+ * table header by init or attach function.
+ */
+ desc->base_desc.ptr = (void **) desc->ptr;
desc->base_desc.extra_size = hash_estimate_size(desc->max_size, infoP->entrysize) - desc->base_desc.size;
@@ -499,10 +507,9 @@ shmem_hash_init(void *arg)
int hash_flags = desc->hash_flags;
/* Pass location of hashtable header to hash_create */
- desc->ptr = desc->base_desc.ptr;
- desc->infoP->hctl = (HASHHDR *) desc->ptr;
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
}
static void
@@ -518,9 +525,9 @@ shmem_hash_attach(void *arg)
hash_flags |= HASH_ATTACH;
/* Pass location of hashtable header to hash_create */
- desc->infoP->hctl = (HASHHDR *) desc->ptr;
+ desc->infoP->hctl = (HASHHDR *) *desc->base_desc.ptr;
- desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
+ *desc->ptr = hash_create(desc->name, desc->init_size, desc->infoP, hash_flags);
}
/*
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 0fe0f256971..8321bd9b52d 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -205,15 +205,16 @@ typedef struct SISeg
static void SharedInvalShmemInit(void *arg);
+/* pointer to the shared inval buffer */
+SISeg *shmInvalBuffer = NULL;
+
static ShmemStructDesc SharedInvalShmemDesc = {
.name = "shmInvalBuffer",
.size = 0, /* dynamic */
.init_fn = SharedInvalShmemInit,
+ .ptr = (void **) &shmInvalBuffer,
};
-/* pointer to the shared inval buffer */
-#define shmInvalBuffer ((SISeg *) SharedInvalShmemDesc.ptr)
-
static LocalTransactionId nextLocalTransactionId;
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 85375b5195e..a3d6557aa9d 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -77,27 +77,33 @@ PGPROC *MyProc = NULL;
static void ProcGlobalShmemInit(void *arg);
+/* Pointers to shared-memory structures */
+PROC_HDR *ProcGlobal = NULL;
+void *tmpAllProcs = NULL;
+void *tmpFastPathLockArray = NULL;
+NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
+PGPROC *PreparedXactProcs = NULL;
+
+
static ShmemStructDesc ProcGlobalShmemDesc = {
.name = "Proc Header",
.size = sizeof(PROC_HDR),
.init_fn = ProcGlobalShmemInit,
+ .ptr = (void **) &ProcGlobal,
};
static ShmemStructDesc ProcGlobalAllProcsShmemDesc = {
.name = "PGPROC structures",
.size = 0, /* dynamic */
+ .ptr = (void **) &tmpAllProcs,
};
static ShmemStructDesc FastPathLockArrayShmemDesc = {
.name = "Fast-Path Lock Array",
.size = 0, /* dynamic */
+ .ptr = (void **) &tmpFastPathLockArray,
};
-/* Pointers to shared-memory structures */
-PROC_HDR *ProcGlobal = NULL;
-NON_EXEC_STATIC PGPROC *AuxiliaryProcs = NULL;
-PGPROC *PreparedXactProcs = NULL;
-
static uint32 TotalProcs;
static DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
@@ -222,8 +228,7 @@ ProcGlobalShmemInit(void *arg)
Size fpLockBitsSize,
fpRelIdSize;
- ProcGlobal = ProcGlobalShmemDesc.ptr;
-
+ Assert(ProcGlobal);
ProcGlobal->spins_per_delay = DEFAULT_SPINS_PER_DELAY;
dlist_init(&ProcGlobal->freeProcs);
dlist_init(&ProcGlobal->autovacFreeProcs);
@@ -236,7 +241,8 @@ ProcGlobalShmemInit(void *arg)
pg_atomic_init_u32(&ProcGlobal->clogGroupFirst, INVALID_PROC_NUMBER);
SpinLockInit(ProcStructLock);
- ptr = ProcGlobalAllProcsShmemDesc.ptr;
+ Assert(tmpAllProcs);
+ ptr = tmpAllProcs;
requestSize = ProcGlobalAllProcsShmemDesc.size;
memset(ptr, 0, requestSize);
@@ -274,7 +280,8 @@ ProcGlobalShmemInit(void *arg)
fpLockBitsSize = MAXALIGN(FastPathLockGroupsPerBackend * sizeof(uint64));
fpRelIdSize = MAXALIGN(FastPathLockSlotsPerBackend() * sizeof(Oid));
- fpPtr = FastPathLockArrayShmemDesc.ptr;
+ Assert(tmpFastPathLockArray);
+ fpPtr = tmpFastPathLockArray;
requestSize = FastPathLockArrayShmemDesc.size;
memset(fpPtr, 0, requestSize);
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 49d476e9d5c..6e5a546f411 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -334,7 +334,7 @@ extern bool TransactionStartedDuringRecovery(void);
/* in transam/varsup.c */
#ifndef FRONTEND
extern PGDLLIMPORT struct ShmemStructDesc TransamVariablesShmemDesc;
-#define TransamVariables ((TransamVariablesData *) TransamVariablesShmemDesc.ptr)
+extern PGDLLIMPORT TransamVariablesData *TransamVariables;
#endif
/*
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index 40e2fc17056..cbd4ef8d03f 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,8 +50,8 @@ typedef struct ShmemStructDesc
*/
size_t extra_size;
- /* Pointer to the shared memory area, when it's allocated. */
- void *ptr;
+ /* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
+ void **ptr;
} ShmemStructDesc;
/*
@@ -67,7 +67,7 @@ typedef struct ShmemHashDesc
size_t max_size; /* max number of entries */
HASHCTL *infoP;
- HTAB *ptr;
+ HTAB **ptr;
ShmemStructDesc base_desc;
} ShmemHashDesc;
--
2.34.1
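
The indirection this patch introduces (the descriptor stores the address of a
typed global, and the registration code writes the allocated area through it)
can be sketched outside the tree roughly as below. This is only an
illustration: DemoStructDesc, DemoCounters, demo_desc and
demo_init_registered() are hypothetical stand-ins, and malloc() takes the
place of the shared memory allocator. The (void **) cast mirrors the one in
the patch; PostgreSQL is built with -fno-strict-aliasing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for ShmemStructDesc. */
typedef struct DemoStructDesc
{
	const char *name;
	size_t		size;
	void		(*init_fn) (void *arg);
	void	  **ptr;			/* address of the variable that receives the
								 * pointer to the allocated area */
} DemoStructDesc;

typedef struct DemoCounters
{
	int			hits;
	int			misses;
} DemoCounters;

/* Typed global, visible to debuggers and usable without a macro. */
static DemoCounters *demo_counters = NULL;

static void
demo_counters_init(void *arg)
{
	(void) arg;

	/* The global is already set by the time init_fn runs. */
	memset(demo_counters, 0, sizeof(DemoCounters));
	demo_counters->hits = 1;
}

static DemoStructDesc demo_desc = {
	.name = "demo counters",
	.size = sizeof(DemoCounters),
	.init_fn = demo_counters_init,
	.ptr = (void **) &demo_counters,
};

/* Stand-in for the init loop: allocate, publish the pointer, initialize. */
static void
demo_init_registered(DemoStructDesc *desc)
{
	void	   *area;

	assert(desc->ptr != NULL);

	area = malloc(desc->size);	/* the real code carves out shared memory */
	if (area == NULL)
		abort();

	*desc->ptr = area;			/* writes into demo_counters */

	if (desc->init_fn)
		desc->init_fn(NULL);
}
```

Calling demo_init_registered(&demo_desc) leaves demo_counters pointing at the
new area before demo_counters_init() runs, which is why the init functions in
the patch can replace their explicit assignments with an Assert().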
[text/x-patch] 0003-WIP-resizable-shared-memory-structures-20260223.patch (38.9K, 4-0003-WIP-resizable-shared-memory-structures-20260223.patch)
From 1f88d0bab7bb6b3ae6d9ecc573c7d6a621f03d2c Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <[email protected]>
Date: Tue, 17 Feb 2026 16:51:20 +0530
Subject: [PATCH 3/3] WIP: resizable shared memory structures
---
doc/src/sgml/system-views.sgml | 20 +-
src/backend/port/sysv_shmem.c | 69 +++++
src/backend/port/win32_shmem.c | 49 +++
src/backend/storage/ipc/shmem.c | 156 ++++++++--
src/include/catalog/pg_proc.dat | 4 +-
src/include/storage/pg_shmem.h | 3 +
src/include/storage/shmem.h | 10 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/resizable_shmem/Makefile | 23 ++
src/test/modules/resizable_shmem/meson.build | 36 +++
.../resizable_shmem/resizable_shmem--1.0.sql | 37 +++
.../modules/resizable_shmem/resizable_shmem.c | 281 ++++++++++++++++++
.../resizable_shmem/resizable_shmem.control | 5 +
.../resizable_shmem/t/001_resizable_shmem.pl | 118 ++++++++
src/test/regress/expected/rules.out | 5 +-
16 files changed, 789 insertions(+), 29 deletions(-)
create mode 100644 src/test/modules/resizable_shmem/Makefile
create mode 100644 src/test/modules/resizable_shmem/meson.build
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.c
create mode 100644 src/test/modules/resizable_shmem/resizable_shmem.control
create mode 100644 src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
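
The allocated_size bookkeeping that this patch adds in ShmemSetAllocatedSize()
can be sketched with plain integers as below. This is a rough illustration
under assumptions, not the patch code: demo_align_up(), demo_align_down() and
demo_allocated_size() are hypothetical names, addresses are passed as
uintptr_t, and the page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round v up to the next multiple of page_size (a power of two). */
static uintptr_t
demo_align_up(uintptr_t v, uintptr_t page_size)
{
	return (v + page_size - 1) & ~(page_size - 1);
}

/* Round v down to the previous multiple of page_size (a power of two). */
static uintptr_t
demo_align_down(uintptr_t v, uintptr_t page_size)
{
	return v & ~(page_size - 1);
}

/*
 * Bytes of the reserved range that are expected to be backed by memory:
 * everything except the whole pages lying between the page-aligned end of
 * the current size and the start of the page containing the maximum end.
 */
static size_t
demo_allocated_size(uintptr_t location, size_t size, size_t maximum_size,
					size_t page_size)
{
	uintptr_t	align_end;
	uintptr_t	floor_max_end;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	assert(size <= maximum_size);

	align_end = demo_align_up(location + size, page_size);
	floor_max_end = demo_align_down(location + maximum_size, page_size);

	/* Fixed-size, or the maximum end falls on the page of the current end. */
	if (align_end >= floor_max_end)
		return maximum_size;

	return maximum_size - (size_t) (floor_max_end - align_end);
}
```

For instance, with a 4096-byte page, a structure at address 41060 with size
1000 and maximum size 20000 has 12288 bytes (three whole pages) that need not
be backed, so the sketch returns 7712.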
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 8b4abef8c68..533eff3d6cb 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -4243,8 +4243,24 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
Size of the allocation in bytes including padding. For anonymous
allocations, no information about padding is available, so the
<literal>size</literal> and <literal>allocated_size</literal> columns
- will always be equal. Padding is not meaningful for free memory, so
- the columns will be equal in that case also.
+ will always be equal. Padding is not meaningful for free memory, so the
+ columns will be equal in that case also. For resizable allocations which
+ may span multiple memory pages, the padding includes the padding due to
+ page alignment.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>maximum_size</structfield> <type>int8</type>
+ </para>
+ <para>
+ Maximum size in bytes that the allocation can grow up to, including padding
+ in case of resizable allocations. For anonymous allocations, no
+ information about maximum size is available, so the
+ <literal>size</literal> and <literal>maximum_size</literal> columns will
+ always be equal. Maximum size is not meaningful for free memory, so the
+ columns will be equal in that case also.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index 2e3886cf9fe..67a39e97007 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -589,6 +589,27 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
return true;
}
+/*
+ * Get the page size being used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ os_page_size = sysconf(_SC_PAGESIZE);
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
+
/*
* Creates an anonymous mmap()ed shared memory segment.
*
@@ -991,3 +1012,51 @@ PGSharedMemoryDetach(void)
AnonymousShmem = NULL;
}
}
+
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("releasing shared memory is supported only in anonymous mappings")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ Assert(size > 0);
+
+ if (madvise(addr, size, MADV_REMOVE) == -1)
+ ereport(ERROR,
+ (errmsg("could not release shared memory: %m")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * The address and size are expected to be page aligned.
+ *
+ * Only supported on platforms that support anonymous shared memory.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ if (!AnonymousShmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("allocating shared memory is supported only in anonymous mappings")));
+
+ Assert(addr == (void *) TYPEALIGN(GetOSPageSize(), addr));
+ Assert(size == TYPEALIGN(GetOSPageSize(), size));
+ Assert(size > 0);
+
+ if (madvise(addr, size, MADV_POPULATE_WRITE) == -1)
+ ereport(ERROR,
+ (errmsg("could not allocate shared memory: %m")));
+}
diff --git a/src/backend/port/win32_shmem.c b/src/backend/port/win32_shmem.c
index 794e4fcb2ad..afbbd0da8da 100644
--- a/src/backend/port/win32_shmem.c
+++ b/src/backend/port/win32_shmem.c
@@ -621,6 +621,32 @@ pgwin32_ReserveSharedMemoryRegion(HANDLE hChild)
return true;
}
+/*
+ * Make sure that the memory of given size from the given address is released.
+ *
+ * Not supported on Windows currently.
+ */
+void
+PGSharedMemoryEnsureFreed(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("releasing part of shared memory is not supported on Windows")));
+}
+
+/*
+ * Make sure that the memory of given size from the given address is allocated.
+ *
+ * Not supported on Windows currently.
+ */
+void
+PGSharedMemoryEnsureAllocated(void *addr, Size size)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("allocating shared memory is not supported on Windows")));
+}
+
/*
* This function is provided for consistency with sysv_shmem.c and does not
* provide any useful information for Windows. To obtain the large page size,
@@ -648,3 +674,26 @@ check_huge_page_size(int *newval, void **extra, GucSource source)
}
return true;
}
+
+/*
+ * Get the page size used by the shared memory.
+ *
+ * The function should be called only after the shared memory has been set up.
+ */
+Size
+GetOSPageSize(void)
+{
+ SYSTEM_INFO sysinfo;
+ Size os_page_size;
+
+ Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
+
+ GetSystemInfo(&sysinfo);
+ os_page_size = sysinfo.dwPageSize;
+
+ /* If huge pages are actually in use, use huge page size */
+ if (huge_pages_status == HUGE_PAGES_ON)
+ GetHugePageSize(&os_page_size, NULL);
+
+ return os_page_size;
+}
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index e73ac489b2b..4ecb354fd06 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -106,6 +106,7 @@ typedef struct ShmemAllocatorData
} ShmemAllocatorData;
static void *ShmemAllocRaw(Size size, Size *allocated_size);
+static void ShmemSetAllocatedSize(ShmemIndexEnt *entry);
static void shmem_hash_init(void *arg);
static void shmem_hash_attach(void *arg);
@@ -143,8 +144,20 @@ ShmemRegisterStruct(ShmemStructDesc *desc)
elog(DEBUG2, "REGISTER: %s with size %zd", desc->name, desc->size);
registry[num_registrations++] = desc;
+
+ if (desc->max_size > 0)
+ elog(DEBUG2, "RESIZABLE structure: %s has max_size %zd", desc->name, desc->max_size);
}
+/*
+ * Calculate the total size of shared memory required for the registered
+ * structures.
+ *
+ * Resizable structures need contiguous memory worth the specified maximum size
+ * when they grow to the fullest. Hence use max_size. It is expected that that
+ * much address space is reserved. Actual memory allocated at the beginning will
+ * be worth the total of initial sizes of all the structures.
+ */
size_t
ShmemRegisteredSize(void)
{
@@ -153,7 +166,7 @@ ShmemRegisteredSize(void)
size = 0;
for (int i = 0; i < num_registrations; i++)
{
- size = add_size(size, registry[i]->size);
+ size = add_size(size, registry[i]->max_size > 0 ? registry[i]->max_size : registry[i]->size);
size = add_size(size, registry[i]->extra_size);
}
@@ -162,6 +175,43 @@ ShmemRegisteredSize(void)
return size;
}
+/*
+ * Set the allocated_size of given structure.
+ */
+static void
+ShmemSetAllocatedSize(ShmemIndexEnt *entry)
+{
+ Size page_size = GetOSPageSize();
+
+ char *align_end = (char *) TYPEALIGN(page_size, (char *) entry->location + entry->size);
+ char *floor_max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) entry->location + entry->maximum_size);
+
+ if (align_end >= floor_max_end)
+ {
+ /*
+ * A fixed sized structure or a resizable structure whose maximal size
+ * ends on the same page as its initial size. In either case, the
+ * structure will be allocated in its entirety at the beginning and there
+ * is no need to allocate additional memory for it when it grows. So, set
+ * allocated_size to maximum_size.
+ */
+ entry->allocated_size = entry->maximum_size;
+ }
+ else
+ {
+ /*
+ * The maximal structure spans multiple pages. Initially only
+ * the pages where this structure ends and where the next structure
+ * starts will be allocated.
+ */
+ entry->allocated_size = entry->maximum_size - (floor_max_end - align_end);
+ }
+}
+
+
+/*
+ * Allocate memory for the registered shared structures and initialize them.
+ */
void
ShmemInitRegistered(void)
{
@@ -170,10 +220,11 @@ ShmemInitRegistered(void)
for (int i = 0; i < num_registrations; i++)
{
- size_t allocated_size;
+ Size max_alloc_size;
void *structPtr;
bool found;
ShmemIndexEnt *result;
+ Size struct_size;
elog(DEBUG2, "INIT [%d/%d]: %s", i, num_registrations, registry[i]->name);
@@ -190,8 +241,14 @@ ShmemInitRegistered(void)
if (found)
elog(ERROR, "shmem struct \"%s\" is already initialized", registry[i]->name);
- /* allocate and initialize it */
- structPtr = ShmemAllocRaw(registry[i]->size, &allocated_size);
+ /*
+ * Allocate space for the structure in the shared memory. The memory
+ * allocation happens as the corresponding pages are written to. For a
+ * resizable structure allocate enough space for it to grow to its
+ * maximum size, not just worth its initial size.
+ */
+ struct_size = registry[i]->max_size > 0 ? registry[i]->max_size : registry[i]->size;
+ structPtr = ShmemAllocRaw(struct_size, &max_alloc_size);
if (structPtr == NULL)
{
/* out of memory; remove the failed ShmemIndex entry */
@@ -202,9 +259,10 @@ ShmemInitRegistered(void)
" \"%s\" (%zu bytes requested)",
registry[i]->name, registry[i]->size)));
}
- result->size = registry[i]->size;
- result->allocated_size = allocated_size;
result->location = structPtr;
+ result->size = registry[i]->size;
+ result->maximum_size = max_alloc_size;
+ ShmemSetAllocatedSize(result);
*(registry[i]->ptr) = structPtr;
if (registry[i]->init_fn)
@@ -212,6 +270,62 @@ ShmemInitRegistered(void)
}
}
+void
+ShmemResizeRegistered(const char *name, Size new_size)
+{
+ ShmemIndexEnt *result;
+ bool found;
+ Size page_size = GetOSPageSize();
+ char *new_end;
+
+ Assert(new_size > 0);
+
+ /* look it up in the shmem index */
+ LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
+ result = (ShmemIndexEnt *)
+ hash_search(ShmemIndex, name, HASH_FIND, &found);
+ if (!found)
+ elog(ERROR, "shmem struct \"%s\" is not initialized", name);
+
+ Assert(result);
+
+ if (result->maximum_size < new_size)
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
+ errmsg("not enough address space is reserved for resizing structure \"%s\"", name)));
+
+
+ /*
+ * When shrinking, the memory from the page-aligned new end to the start of
+ * the page containing the end of the reserved space is no longer required.
+ * When expanding, the memory from the start of the page containing the start
+ * of the structure to the page-aligned new end is required.
+ */
+ new_end = (char *) TYPEALIGN(page_size, (char *) result->location + new_size);
+ if (new_size < result->size)
+ {
+ char *max_end = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location + result->maximum_size);
+ Size free_size = max_end - new_end;
+
+ if (free_size > 0)
+ PGSharedMemoryEnsureFreed(new_end, free_size);
+ }
+ else if (new_size > result->size)
+ {
+ char *struct_start = (char *) TYPEALIGN_DOWN(page_size, (char *) result->location);
+ Size alloc_size = new_end - struct_start;
+
+ if (alloc_size > 0)
+ PGSharedMemoryEnsureAllocated(struct_start, alloc_size);
+ }
+
+ /* Update shmem index entry. */
+ result->size = new_size;
+ ShmemSetAllocatedSize(result);
+
+ LWLockRelease(ShmemIndexLock);
+}
+
#ifdef EXEC_BACKEND
void
ShmemAttachRegistered(void)
@@ -701,6 +815,7 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
result->size = size;
result->allocated_size = allocated_size;
result->location = structPtr;
+ result->maximum_size = allocated_size;
}
LWLockRelease(ShmemIndexLock);
@@ -747,7 +862,7 @@ mul_size(Size s1, Size s2)
Datum
pg_get_shmem_allocations(PG_FUNCTION_ARGS)
{
-#define PG_GET_SHMEM_SIZES_COLS 4
+#define PG_GET_SHMEM_SIZES_COLS 5
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
HASH_SEQ_STATUS hstat;
ShmemIndexEnt *ent;
@@ -769,7 +884,14 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
values[1] = Int64GetDatum((char *) ent->location - (char *) ShmemSegHdr);
values[2] = Int64GetDatum(ent->size);
values[3] = Int64GetDatum(ent->allocated_size);
- named_allocated += ent->allocated_size;
+ values[4] = Int64GetDatum(ent->maximum_size);
+
+ /*
+ * Resizable structures are allocated address space up to their maximum
+ * size, and that reserved address space is what we count here. For
+ * fixed-size structures, allocated_size is the same as maximum_size.
+ */
+ named_allocated += ent->maximum_size;
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
values, nulls);
@@ -780,6 +902,7 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = true;
values[2] = Int64GetDatum(ShmemAllocator->free_offset - named_allocated);
values[3] = values[2];
+ values[4] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
/* output as-of-yet unused shared memory */
@@ -788,6 +911,7 @@ pg_get_shmem_allocations(PG_FUNCTION_ARGS)
nulls[1] = false;
values[2] = Int64GetDatum(ShmemSegHdr->totalsize - ShmemAllocator->free_offset);
values[3] = values[2];
+ values[4] = values[2];
tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
LWLockRelease(ShmemIndexLock);
@@ -975,23 +1099,9 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
Size
pg_get_shmem_pagesize(void)
{
- Size os_page_size;
-#ifdef WIN32
- SYSTEM_INFO sysinfo;
-
- GetSystemInfo(&sysinfo);
- os_page_size = sysinfo.dwPageSize;
-#else
- os_page_size = sysconf(_SC_PAGESIZE);
-#endif
-
Assert(IsUnderPostmaster);
- Assert(huge_pages_status != HUGE_PAGES_UNKNOWN);
-
- if (huge_pages_status == HUGE_PAGES_ON)
- GetHugePageSize(&os_page_size, NULL);
- return os_page_size;
+ return GetOSPageSize();
}
Datum
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 83f6501df38..fbf5749bca7 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8592,8 +8592,8 @@
{ oid => '5052', descr => 'allocations from the main shared memory segment',
proname => 'pg_get_shmem_allocations', prorows => '50', proretset => 't',
provolatile => 'v', prorettype => 'record', proargtypes => '',
- proallargtypes => '{text,int8,int8,int8}', proargmodes => '{o,o,o,o}',
- proargnames => '{name,off,size,allocated_size}',
+ proallargtypes => '{text,int8,int8,int8,int8}', proargmodes => '{o,o,o,o,o}',
+ proargnames => '{name,off,size,allocated_size,maximum_size}',
prosrc => 'pg_get_shmem_allocations' },
{ oid => '4099', descr => 'Is NUMA support available?',
diff --git a/src/include/storage/pg_shmem.h b/src/include/storage/pg_shmem.h
index 10c7b065861..f0efbf2aec1 100644
--- a/src/include/storage/pg_shmem.h
+++ b/src/include/storage/pg_shmem.h
@@ -89,6 +89,9 @@ extern PGShmemHeader *PGSharedMemoryCreate(Size size,
PGShmemHeader **shim);
extern bool PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2);
extern void PGSharedMemoryDetach(void);
+extern void PGSharedMemoryEnsureFreed(void *addr, Size size);
+extern void PGSharedMemoryEnsureAllocated(void *addr, Size size);
extern void GetHugePageSize(Size *hugepagesize, int *mmap_flags);
+extern Size GetOSPageSize(void);
#endif /* PG_SHMEM_H */
diff --git a/src/include/storage/shmem.h b/src/include/storage/shmem.h
index cbd4ef8d03f..f25b60b5f42 100644
--- a/src/include/storage/shmem.h
+++ b/src/include/storage/shmem.h
@@ -50,6 +50,14 @@ typedef struct ShmemStructDesc
*/
size_t extra_size;
+ /*
+ * Maximum size this structure can grow up to in future. The memory is not
+ * allocated right away but the corresponding address space is reserved so
+ * that memory can be mapped to it when the structure grows. Typically
+ * should be used for resizable structures which need contiguous memory.
+ */
+ size_t max_size;
+
/* Pointer to the variable to which pointer to this shared memory area is assigned after allocation. */
void **ptr;
} ShmemStructDesc;
@@ -84,6 +92,7 @@ extern void InitShmemIndex(void);
extern void ShmemRegisterHash(ShmemHashDesc *desc, HASHCTL *infoP, int hash_flags);
extern void ShmemRegisterStruct(ShmemStructDesc *desc);
+extern void ShmemResizeRegistered(const char *name, Size new_size);
/* Legacy functions */
extern HTAB *ShmemInitHash(const char *name, int64 init_size, int64 max_size,
@@ -115,6 +124,7 @@ typedef struct
void *location; /* location in shared mem */
Size size; /* # bytes requested for the structure */
Size allocated_size; /* # bytes actually allocated */
+ Size maximum_size; /* maximum size this structure can grow to */
} ShmemIndexEnt;
#endif /* SHMEM_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 44c7163c1cd..a5df6edae18 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -14,6 +14,7 @@ SUBDIRS = \
libpq_pipeline \
oauth_validator \
plsample \
+ resizable_shmem \
spgist_name_ops \
test_aio \
test_binaryheap \
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 2634a519935..961bb62759d 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -13,6 +13,7 @@ subdir('libpq_pipeline')
subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
+subdir('resizable_shmem')
subdir('spgist_name_ops')
subdir('ssl_passphrase_callback')
subdir('test_aio')
diff --git a/src/test/modules/resizable_shmem/Makefile b/src/test/modules/resizable_shmem/Makefile
new file mode 100644
index 00000000000..f3bd8ac0c7f
--- /dev/null
+++ b/src/test/modules/resizable_shmem/Makefile
@@ -0,0 +1,23 @@
+# src/test/modules/resizable_shmem/Makefile
+
+MODULES = resizable_shmem
+TAP_TESTS = 1
+
+EXTENSION = resizable_shmem
+DATA = resizable_shmem--1.0.sql
+PGFILEDESC = "resizable_shmem - test module for resizable shared memory"
+
+# This test requires library to be loaded at the server start, so disable
+# installcheck
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/resizable_shmem
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/src/makefiles/pgxs.mk
+endif
diff --git a/src/test/modules/resizable_shmem/meson.build b/src/test/modules/resizable_shmem/meson.build
new file mode 100644
index 00000000000..493bbbc95c3
--- /dev/null
+++ b/src/test/modules/resizable_shmem/meson.build
@@ -0,0 +1,36 @@
+# src/test/modules/resizable_shmem/meson.build
+
+resizable_shmem_sources = files(
+ 'resizable_shmem.c',
+)
+
+if host_system == 'windows'
+ resizable_shmem_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'resizable_shmem',
+ '--FILEDESC', 'resizable_shmem - test module for resizable shared memory',])
+endif
+
+resizable_shmem = shared_module('resizable_shmem',
+ resizable_shmem_sources,
+ kwargs: pg_test_mod_args,
+)
+test_install_libs += resizable_shmem
+
+test_install_data += files(
+ 'resizable_shmem.control',
+ 'resizable_shmem--1.0.sql',
+)
+
+tests += {
+ 'name': 'resizable_shmem',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'tap': {
+ 'tests': [
+ 't/001_resizable_shmem.pl',
+ ],
+ # This test requires the library to be loaded at server start, so disable
+ # installcheck
+ 'runningcheck': false,
+ },
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
new file mode 100644
index 00000000000..c1bcb6117b6
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem--1.0.sql
@@ -0,0 +1,37 @@
+/* src/test/modules/resizable_shmem/resizable_shmem--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION resizable_shmem" to load this file. \quit
+
+-- Function to resize the test structure in the shared memory
+CREATE FUNCTION resizable_shmem_resize(new_entries integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to write data to all entries in the test structure in shared memory
+-- Writing all the entries makes sure that the memory is actually allocated and
+-- mapped to the process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_write(entry_value integer)
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to verify that the specified number of initial entries have the expected value.
+-- Reading all the entries makes sure that the memory is actually mapped to the
+-- process, so that we can later measure the memory usage.
+CREATE FUNCTION resizable_shmem_read(entry_count integer, entry_value integer)
+RETURNS boolean
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to report memory usage statistics of the calling backend
+CREATE FUNCTION resizable_shmem_usage(OUT rss_anon bigint, OUT rss_file bigint, OUT rss_shmem bigint, OUT vm_size bigint)
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
+
+-- Function to get the shared memory page size
+CREATE FUNCTION resizable_shmem_pagesize()
+RETURNS integer
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT;
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.c b/src/test/modules/resizable_shmem/resizable_shmem.c
new file mode 100644
index 00000000000..15f02e3f8ff
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.c
@@ -0,0 +1,281 @@
+/* -------------------------------------------------------------------------
+ *
+ * resizable_shmem.c
+ * Test module for PostgreSQL's resizable shared memory functionality
+ *
+ * This module demonstrates and tests the resizable shared memory API
+ * provided by shmem.c/shmem.h.
+ *
+ * -------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "fmgr.h"
+#include "funcapi.h"
+#include "miscadmin.h"
+#include "storage/shmem.h"
+#include "storage/spin.h"
+#include "utils/builtins.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timestamp.h"
+#include "access/htup_details.h"
+
+#include <stdio.h>
+
+PG_MODULE_MAGIC;
+
+/*
+ * With default settings, shared buffers, and hence the total shared memory
+ * allocated, are in the hundreds of MBs. Memory allocated to the test structure
+ * is noticeable only when it is of the same order of magnitude.
+ */
+#define TEST_INITIAL_ENTRIES (25 * 1024 * 1024) /* Initial number of entries (100MB) */
+#define TEST_MAX_ENTRIES (100 * 1024 * 1024) /* Maximum number of entries (400MB, 4x initial) */
+#define TEST_ENTRY_SIZE sizeof(int32) /* Size of each entry */
+
+/*
+ * Resizable test data structure stored in shared memory.
+ *
+ * We do not use any locks. To keep the code and the test simple, the test never
+ * performs resizes, reads or writes concurrently.
+ */
+typedef struct TestResizableShmemStruct
+{
+ /* Metadata */
+ int32 num_entries; /* Number of entries that can fit */
+
+ /* Data area - variable size */
+ int32 data[FLEXIBLE_ARRAY_MEMBER];
+} TestResizableShmemStruct;
+
+/* Global pointer to our shared memory structure */
+static TestResizableShmemStruct *resizable_shmem = NULL;
+
+static void resizable_shmem_shmem_init(void *arg);
+
+static ShmemStructDesc testShmemDesc = {
+ .name = "resizable_shmem",
+ .size = offsetof(TestResizableShmemStruct, data) + (TEST_INITIAL_ENTRIES * TEST_ENTRY_SIZE),
+ .max_size = offsetof(TestResizableShmemStruct, data) + (TEST_MAX_ENTRIES * TEST_ENTRY_SIZE),
+ .alignment = MAXIMUM_ALIGNOF,
+ .init_fn = resizable_shmem_shmem_init,
+ .ptr = (void **) &resizable_shmem,
+};
+
+static shmem_request_hook_type prev_shmem_request_hook = NULL;
+
+static void resizable_shmem_request(void);
+
+/* SQL-callable functions */
+PG_FUNCTION_INFO_V1(resizable_shmem_resize);
+PG_FUNCTION_INFO_V1(resizable_shmem_write);
+PG_FUNCTION_INFO_V1(resizable_shmem_read);
+PG_FUNCTION_INFO_V1(resizable_shmem_usage);
+PG_FUNCTION_INFO_V1(resizable_shmem_pagesize);
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+ /*
+ * The module needs to be loaded via shared_preload_libraries to register the
+ * shared memory structure. If that's not the case, don't throw an error; the
+ * SQL functions check for the existence of the shared memory data structure.
+ */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+#ifdef EXEC_BACKEND
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizable_shmem is not supported in EXEC_BACKEND builds")));
+#endif
+
+ /* Install hook to register shared memory structure. */
+ prev_shmem_request_hook = shmem_request_hook;
+ shmem_request_hook = resizable_shmem_request;
+}
+
+/*
+ * Request shared memory resources
+ */
+static void
+resizable_shmem_request(void)
+{
+ if (prev_shmem_request_hook)
+ prev_shmem_request_hook();
+
+ /* Register our resizable shared memory structure */
+ ShmemRegisterStruct(&testShmemDesc);
+}
+
+/*
+ * Initialize shared memory structure
+ */
+static void
+resizable_shmem_shmem_init(void *arg)
+{
+ /*
+ * Shared memory structure should have been allocated with the requested
+ * size. Initialize the metadata.
+ */
+ Assert(resizable_shmem != NULL);
+ resizable_shmem->num_entries = TEST_INITIAL_ENTRIES;
+ memset(resizable_shmem->data, 0, TEST_INITIAL_ENTRIES * TEST_ENTRY_SIZE);
+}
+
+/*
+ * Resize the shared memory structure to accommodate the specified number of
+ * entries.
+ */
+Datum
+resizable_shmem_resize(PG_FUNCTION_ARGS)
+{
+#ifndef EXEC_BACKEND
+ int32 new_entries = PG_GETARG_INT32(0);
+ Size new_size;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ new_size = offsetof(TestResizableShmemStruct, data) + (new_entries * TEST_ENTRY_SIZE);
+ ShmemResizeRegistered(testShmemDesc.name, new_size);
+ resizable_shmem->num_entries = new_entries;
+
+ PG_RETURN_VOID();
+#else
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("resizing shared memory is not supported in EXEC_BACKEND builds")));
+#endif
+}
+
+/*
+ * Write the given integer value to all entries in the data array.
+ */
+Datum
+resizable_shmem_write(PG_FUNCTION_ARGS)
+{
+ int32 entry_value = PG_GETARG_INT32(0);
+ int32 i;
+
+ if (!resizable_shmem)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Write the value to all current entries */
+ for (i = 0; i < resizable_shmem->num_entries; i++)
+ resizable_shmem->data[i] = entry_value;
+
+ PG_RETURN_VOID();
+}
+
+/*
+ * Check whether the first 'entry_count' entries all have the expected 'entry_value'.
+ * Returns true if all match, false otherwise.
+ */
+Datum
+resizable_shmem_read(PG_FUNCTION_ARGS)
+{
+ int32 entry_count = PG_GETARG_INT32(0);
+ int32 entry_value = PG_GETARG_INT32(1);
+ int32 i;
+
+ if (resizable_shmem == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("resizable_shmem is not initialized")));
+
+ /* Validate entry_count */
+ if (entry_count < 0 || entry_count > resizable_shmem->num_entries)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("entry_count %d is out of range (0..%d)", entry_count, resizable_shmem->num_entries)));
+
+ /* Check if first entry_count entries have the expected value */
+ for (i = 0; i < entry_count; i++)
+ {
+ if (resizable_shmem->data[i] != entry_value)
+ PG_RETURN_BOOL(false);
+ }
+
+ PG_RETURN_BOOL(true);
+}
+
+/*
+ * Report multiple memory usage statistics of the calling backend process
+ * as reported by the kernel.
+ * Returns RssAnon, RssFile, RssShmem, VmSize from /proc/self/status as a record.
+ *
+ * TODO: See TODO note in SQL definition of this function.
+ */
+Datum
+resizable_shmem_usage(PG_FUNCTION_ARGS)
+{
+ FILE *f;
+ char line[256];
+ int64 rss_anon_kb = -1;
+ int64 rss_file_kb = -1;
+ int64 rss_shmem_kb = -1;
+ int64 vm_size_kb = -1;
+ int found = 0;
+ TupleDesc tupdesc;
+ Datum values[4];
+ bool nulls[4];
+ HeapTuple tuple;
+
+ /* Open /proc/self/status to read memory information */
+ f = fopen("/proc/self/status", "r");
+ if (f == NULL)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open /proc/self/status: %m")));
+
+ /* Look for the memory usage lines */
+ while (fgets(line, sizeof(line), f) != NULL && found < 4)
+ {
+ if (rss_anon_kb == -1 && sscanf(line, "RssAnon: %" SCNd64 " kB", &rss_anon_kb) == 1)
+ found++;
+ else if (rss_file_kb == -1 && sscanf(line, "RssFile: %" SCNd64 " kB", &rss_file_kb) == 1)
+ found++;
+ else if (rss_shmem_kb == -1 && sscanf(line, "RssShmem: %" SCNd64 " kB", &rss_shmem_kb) == 1)
+ found++;
+ else if (vm_size_kb == -1 && sscanf(line, "VmSize: %" SCNd64 " kB", &vm_size_kb) == 1)
+ found++;
+ }
+
+ fclose(f);
+
+ /* Build tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept a record")));
+
+ /* Build the result tuple */
+ values[0] = Int64GetDatum(rss_anon_kb >= 0 ? rss_anon_kb * 1024 : 0);
+ values[1] = Int64GetDatum(rss_file_kb >= 0 ? rss_file_kb * 1024 : 0);
+ values[2] = Int64GetDatum(rss_shmem_kb >= 0 ? rss_shmem_kb * 1024 : 0);
+ values[3] = Int64GetDatum(vm_size_kb >= 0 ? vm_size_kb * 1024 : 0);
+
+ nulls[0] = nulls[1] = nulls[2] = nulls[3] = false;
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+}
+
+/*
+ * resizable_shmem_pagesize() - Get the shared memory page size
+ */
+Datum
+resizable_shmem_pagesize(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_INT32(pg_get_shmem_pagesize());
+}
diff --git a/src/test/modules/resizable_shmem/resizable_shmem.control b/src/test/modules/resizable_shmem/resizable_shmem.control
new file mode 100644
index 00000000000..1ce2c5ea21a
--- /dev/null
+++ b/src/test/modules/resizable_shmem/resizable_shmem.control
@@ -0,0 +1,5 @@
+# resizable_shmem extension test module
+comment = 'test module for testing resizable shared memory structure functionality'
+default_version = '1.0'
+module_pathname = '$libdir/resizable_shmem'
+relocatable = true
diff --git a/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
new file mode 100644
index 00000000000..d0a4b504d8e
--- /dev/null
+++ b/src/test/modules/resizable_shmem/t/001_resizable_shmem.pl
@@ -0,0 +1,118 @@
+#!/usr/bin/perl
+# Copyright (c) 2026, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# Test resizable shared memory functionality
+# This converts the isolation test resizable_shmem.spec into a TAP test
+
+my $node = PostgreSQL::Test::Cluster->new('resizable_shmem');
+
+# Need to configure for resizable_shmem
+$node->init;
+$node->append_conf('postgresql.conf', 'shared_preload_libraries = resizable_shmem');
+$node->start;
+
+# Create extension
+$node->safe_psql('postgres', 'CREATE EXTENSION resizable_shmem;');
+
+# Query string variables for reuse
+my $rss_usage_query = 'SELECT rss_shmem FROM resizable_shmem_usage();';
+my $alloc_size_query = "SELECT allocated_size FROM pg_shmem_allocations WHERE name = 'resizable_shmem';";
+# Currently only one structure is resizable
+my $fixed_struct_query = "SELECT count(*) FROM pg_shmem_allocations WHERE name <> 'resizable_shmem' and allocated_size <> maximum_size;";
+
+my $page_size = $node->safe_psql('postgres', "SELECT resizable_shmem_pagesize();");
+
+# Create background sessions for testing
+my $session1 = $node->background_psql('postgres');
+my $session2 = $node->background_psql('postgres');
+
+my $num_entries = 25 * 1024 * 1024; # Initial number of entries in resizable shared memory
+my $max_entries = 100 * 1024 * 1024; # Maximum number of entries allowed
+my $entry_size = 4; # each entry is int32
+my $prev_shmem_usage1 = $session1->query_safe($rss_usage_query, verbose => 0);
+my $prev_shmem_usage2 = $session2->query_safe($rss_usage_query, verbose => 0);
+my $prev_alloc_size;
+
+# We need to make sure that the changes to shared memory allocated are
+# proportionate to the changes in the resizable shared memory structure. But
+# there is no way to know the shared memory allocated at the given address in a
+# given process. We can only know the size of shared memory accessed by the a
+# given process. In case of PostgreSQL, that includes the memory allocated to
+# other shared memory structures as well. Instead, we just note the changes in
+# the function below to help in debugging overallocation issues.
+sub note_shmem_changes
+{
+ my ($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = @_;
+
+ my $shmem_usage1 = $session1->query_safe($rss_usage_query, verbose => 0);
+ my $shmem_usage2 = $session2->query_safe($rss_usage_query, verbose => 0);
+ my $alloc_size = $node->safe_psql('postgres', $alloc_size_query, verbose => 0);
+
+ note "changes in allocated size: " . ($alloc_size - $prev_alloc_size);
+ note "Session 1: changes in rss_shmem usage: " . ($shmem_usage1 - $prev_shmem_usage1);
+ note "Session 1: difference in rss_shmem change and allocated size change: " . (($shmem_usage1 - $prev_shmem_usage1) - ($alloc_size - $prev_alloc_size));
+ note "Session 2: changes in rss_shmem usage: " . ($shmem_usage2 - $prev_shmem_usage2);
+ note "Session 2: difference in rss_shmem change and allocated size change: " . (($shmem_usage2 - $prev_shmem_usage2) - ($alloc_size - $prev_alloc_size));
+
+ return ($shmem_usage1, $shmem_usage2, $alloc_size);
+}
+
+my $value = 100;
+# Write and read the initial set of entries.
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data read after write successful');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, 0);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'initial fixed sized structures');
+
+# Resize to maximum
+my $old_num_entries = $num_entries;
+$num_entries = $max_entries;
+$session1->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old data after resize should still be intact
+is($session1->query_safe("SELECT resizable_shmem_read($old_num_entries, $value);", verbose => 0), 't', 'initial data readable after resize');
+$value = 500;
+$session2->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'enlarged area data read successful');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures after resize to maximum');
+
+# Shrink to a smaller size
+$old_num_entries = $num_entries;
+$num_entries = 75 * 1024 * 1024;
+$session2->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old values should remain intact in the shrunk area
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data readable after shrinking');
+$value = 999;
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'new data readable in shrunken area');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures after shrinking');
+
+# Resize to the same size
+$session2->query_safe("SELECT resizable_shmem_resize($num_entries);", verbose => 0);
+# Old values should remain intact after resizing to the same size
+is($session1->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'data readable after resizing to the same size');
+$value = 1999;
+$session1->query_safe("SELECT resizable_shmem_write($value);", verbose => 0);
+is($session2->query_safe("SELECT resizable_shmem_read($num_entries, $value);", verbose => 0), 't', 'new data readable after resizing to the same size');
+($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size) = note_shmem_changes($prev_shmem_usage1, $prev_shmem_usage2, $prev_alloc_size);
+is($node->safe_psql('postgres', $fixed_struct_query), '0', 'fixed sized structures at the end');
+
+# Test resize failure (attempt to resize beyond max - should fail)
+my ($ret, $stdout, $stderr) = $node->psql('postgres', "SELECT resizable_shmem_resize(" . ($max_entries * 2) . ");");
+ok($ret != 0 || $stderr =~ /ERROR/, 'Resize beyond maximum fails');
+
+# Cleanup sessions
+$session1->quit;
+$session2->quit;
+
+# Cleanup
+$node->stop;
+
+done_testing();
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f4ee2bd7459..0942cc2f771 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1770,8 +1770,9 @@ pg_shadow| SELECT pg_authid.rolname AS usename,
pg_shmem_allocations| SELECT name,
off,
size,
- allocated_size
- FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size);
+ allocated_size,
+ maximum_size
+ FROM pg_get_shmem_allocations() pg_get_shmem_allocations(name, off, size, allocated_size, maximum_size);
pg_shmem_allocations_numa| SELECT name,
numa_node,
size
--
2.34.1